24 Replies. Latest reply on Feb 12, 2015 3:59 AM by NickLightbody

    Database with Massive Containers Data (Best Practices)

    Jason_Farnsworth

      I am getting ready to kick off a database that will have container fields with large files in them. These files are PDFs and average 15-30 MB each. There will be thousands of these types of files stored in the database over time, creating a huge database (~150 GB - ~500 GB).

       

      I have concerns that as the database grows it will slow down over time, becoming lethargic.

       

      We use FM 13 and Server 13, hosted on a dedicated server with nothing else pulling its resources.

       

      We should have, at any point in time, between 25-40 client users, 10-15 FileMaker Go users, and 10-20 web users.

       

      This database will be multi-function, serving many different departments at a time, all pulling from the same core data.


      What are the best practices for this type of database?


      Thanks in advance,


      Jason Farnsworth

        • 1. Re: Database with Massive Containers Data (Best Practices)
          flybynight

          I would definitely go with external (aka "remote") container storage. That way, your actual database file doesn't carry the bulk. This also keeps your backups more efficient, assuming the files don't change much. Another advantage is that you can store the files in a different location, say a Thunderbolt RAID connected to your server, instead of the default location.

          You can choose open or secure storage, depending on your security needs. There will be a small performance hit for the encrypted storage.

          Check out the server documentation on remote container storage and/or the FileMaker Training Series (FTS).

          When you say "web", are you talking about WebDirect or Custom Web Publishing (CWP)? Either way, make sure you have adequate hardware and bandwidth for the number of users you will have.

          Good luck!

          -Shawn

          • 2. Re: Database with Massive Containers Data (Best Practices)
            Stephen Huston

            That will indeed be a biggie. The largest PDF storage file I've worked with on a live server was only around 4GB, but here are a few recommendations based on that:

            • Use external secure storage on the server. (The last thing you want with a big system is the risk of Tampered Data.)
            • Set the file's thumbnail settings to allow the file to permanently store the thumbnails it generates so it doesn't have to read the PDF data more than once per thumbnail.
            • Make the thumbnail fields as small as possible so the stored thumbnails don't bulk things up too fast inside the file (see the sketch below).
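
            One way to get permanently stored, deliberately small thumbnails is an auto-enter calculation on a second container field. A minimal sketch, assuming a container field Documents::PDF holding the print and a sibling container field Documents::PDF_Thumb (all names and sizes are placeholders, and this presumes GetThumbnail gives you a usable preview of your PDFs):

                // Auto-enter calculation on the container field Documents::PDF_Thumb,
                // stored, so the full PDF is only read once per thumbnail.
                // 120 x 120 keeps the stored copy small.
                GetThumbnail ( Documents::PDF ; 120 ; 120 )

            List views can then show the small stored copy instead of touching the multi-MB PDFs.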
            • 3. Re: Database with Massive Containers Data (Best Practices)
              Jason_Farnsworth

              Shawn,

               

              Thank you for the reply; I kind of expected external, but the confirmation is great. I do like the suggestion of using a Thunderbolt RAID; that should improve data transfer times.

               

              The core information will be very modest in size compared to these container fields; remove them, and I project the core data will still be under a gigabyte several years from now.

               

              Security is a bit of an issue in that this database is for a city government that needs to allow public access, and I feel WebDirect will be sufficient for data input. That said, if I keep things simple and clean, I can adjust to the rapid progression of needs without the huge build time of a custom web interface.

               

              The large data files are building prints that need to be handled digitally rather than by the present method of buying another file cabinet. When you want to place 100 years of drawing files into the database for quick access, it is a huge amount of digital data.

               

              With security in mind, how do you feel WebDirect is at securing container fields? In reading through the forums, there appears to be a general concern that it drops SSL for those fields. Do you think it wise to cull all WebDirect container fields into a separate database and link to them via an External Data Source? In this case the only data that is not secure over the web is the external database. Then pull (copy) them into the main database as needed. Am I worrying over nothing?

               

              Is there a rule of thumb for the number of external users per unit of upload bandwidth?

               

              Thanks again

               

              Jason Farnsworth

              • 4. Re: Database with Massive Containers Data (Best Practices)
                Jason_Farnsworth

                Stephen,

                Thank you for the advice. Yes, big indeed; if I remove these containers from the picture, it brings the size scope down to a manageable level.

                A misplaced line of thinking on data this big will present a real problem a couple of years down the road. Not one that I would like to explain.

                Your comments on thumbnails are well placed; I have read about and tested them, and felt that if not properly handled they could become a drag point.

                Thanks again

                Jason

                • 5. Re: Database with Massive Containers Data (Best Practices)
                  mbraendle

                   We did this once with a 25 GB FileMaker database (just holding text) and 1.3 TB of attached PDF files (about 900'000). The text was extracted from the PDF files and stored in FileMaker to make it searchable.

                   Since the database was available to the public via CWP only, there was no need to use container fields. Instead, we mapped the mounted drive with the PDF files via a Unix soft link into the web directory of the solution. In the database, a field was used to construct the URL to the PDF file. Display of the PDFs (and hit highlighting) was done via the web browser, using either a PDF plugin or the browser's embedded display capability.
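
                   A minimal sketch of that URL field, assuming each record stores the PDF's file name in a field Documents::FileName and the soft-linked folder is served as /pdfs (server name, paths and field names here are placeholders):

                       // Unstored calculation field Documents::PDF_URL, result type Text.
                       // Assumes the archive was linked into the web root with something
                       // like:  ln -s /data/pdfarchive <web root>/pdfs
                       "http://www.example.org/pdfs/" & Documents::FileName

                   The CWP page then simply emits this URL as a link (or embeds it), and the browser or its PDF plugin does the rendering.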


                   I don't know what happened to this solution, because I'm now working at a different place.


                  Details can be found at http://www.filemaker-konferenz.com/2010/downloads/27.%20Mai%20Donnerstag/Riesige%20Datenbanken/FMK%202010%20FileMaker%20…

                  • 6. Re: Database with Massive Containers Data (Best Practices)
                    FileMakerProRocks

                     You may want to take a look at SuperContainer at http://www.360works.com and store the files on, and reference them from, a dedicated server.
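
                     For reference, SuperContainer serves files over HTTP from its own URL space, so in FileMaker you would typically point a Web Viewer at a calculated address. A minimal sketch, assuming the standalone server's default port and a folder scheme keyed by record ID (server name, port and folder path are placeholders; check 360Works' documentation for the exact pattern):

                         // Web Viewer address calculation
                         "http://yourserver:8020/SuperContainer/Files/Prints/" & Prints::PrintID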

                    • 7. Re: Database with Massive Containers Data (Best Practices)
                      coherentkris

                       You might also consider moving the container handling functionality and tables to a separate file. This will allow your main file to remain relatively small; only your container handler file will grow. It might also make backups more efficient.

                      • 8. Re: Database with Massive Containers Data (Best Practices)
                        wimdecorte

                        Jason_Farnsworth wrote:

                        Do you think it wise to cull all WebDirect container fields into a separate database and link to them via an External Data Source? In this case the only data that is not secure over the web is the external database. Then pull (copy) them into the main database as needed.

                         

                        Purely from an architectural point of view it makes sense to store the container fields in their own table in their own file and just relate them to their proper record.  That can save you a lot of hassle on version upgrades (no import of container data...).

                         

                        From a security point of view, there is no need to put them in an external table and then copy them over to the main database. That's overkill, provided that your perimeter defense is good and that you store your container data with the "secure" option, not the "open" option.

                        • 9. Re: Database with Massive Containers Data (Best Practices)
                          Benjamin Fehr

                          You might also consider moving the container handling functionality and tables to a separate file.


                          … and maybe use the transistor technique, where you fill in a key field via a trigger to enable the relationship only when needed.

                          • 10. Re: Database with Massive Containers Data (Best Practices)
                            NickLightbody

                            I like the idea of storing the key in the local record but only inserting it in the local key field when one wants to access the container record in the second table - and then clearing the key field again after access - if that is what you mean?

                             

                            I guess that something like a title and summary would be required in the local record, which constitutes a small offence against normalisation, but the benefit of removing all the container records from FM's known universe unless required sounds very sensible.

                             

                            Cheers, Nick

                            • 11. Re: Database with Massive Containers Data (Best Practices)
                              NickLightbody

                              But, from my experience of splitting into several files - over more years than I care to recall - I would suggest that good architecture in a single file is generally much to be preferred since it makes account management and solution development / maintenance so very much easier.

                               

                              From what I have seen, file splitting generally only creates benefits by, say, halving an inefficient Relationship Graph. Since FM's caching of the known universe carries a cost which accelerates as the graph grows, two half-size files will perform better than the original single file. However, the better solution is to use smarter architecture and, indeed, smart techniques like the one you refer to in your second para.

                               

                              That said, I have never actually tried a properly controlled direct comparison between a file containing a table with a big container load and the same file split in two with the containers in a second file - so you could be right!

                               

                              Cheers, Nick

                              • 12. Re: Database with Massive Containers Data (Best Practices)
                                NickLightbody

                                Ah yes, your first para isn't yours, it is from Kris, so that comment is for him.

                                Nick

                                • 13. Re: Database with Massive Containers Data (Best Practices)
                                  Benjamin Fehr

                                  Exactly.

                                  The PDFs will be permanently stored in the file, but, like a switch, the relevant relationship is activated only when needed (not permanently).
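
                                  A minimal sketch of that switch as a script, assuming a global key field Main::gDocKey drives the relationship to the container table (field, table and step placement are placeholders):

                                      # Switch the relationship ON: fill the key so the related
                                      # container record comes into view.
                                      Set Field [ Main::gDocKey ; Main::DocumentID ]

                                      # ... display, print or export the related container here ...

                                      # Switch the relationship OFF: clear the key so the container
                                      # record drops back out of FM's known universe.
                                      Set Field [ Main::gDocKey ; "" ]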

                                  But to decide what's best, you have to know about FM's rendering behavior. I'm sure someone will answer this question:

                                  - In a list view with a permanent relationship to the container file, when accessing the layout, does FM render all related objects, even when they're not on the layout?

                                  • 14. Re: Database with Massive Containers Data (Best Practices)
                                    DavidJondreau

                                    What benefit would an ephemeral relationship have?
