It all depends on your data and what you want to do with it. For the most part, it is a good goal to optimize your data up to the point where it still feels manageable.
We don't know what you are doing, so specific advice is hard to give, but you should try to determine which attributes make sense to keep in the main record and which belong in a different one based on how the data will be used, not on performance and optimization gains (although those are a valid concern in really large systems).
For example, if you have a PEOPLE table with the following fields:
and you might want to add additional phones, you have different approaches. You could just create fields called People::PhoneWork, People::PhoneCell, and so on. Or you could create a Phones table:
I strongly lean toward the second approach (creating a new table) because querying and reporting on the data will be much easier. I don't know if that is what you are asking; perhaps you're asking about attributes that would all arguably belong to a single entity. I can't really think of a scenario where you would hit FileMaker's technical limits (1 million tables and 256 million fields over a file's lifespan), so I would always try to go for an optimized approach when designing data models.
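To make the two approaches concrete: FileMaker schemas aren't defined in SQL, but the same normalization idea can be sketched with SQLite (table and field names here are illustrative, not from the original post).

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Approach 1 (flat): one phone column per kind -- adding a new kind
# of phone later means altering the table definition.
cur.execute("""CREATE TABLE People_flat (
    ID INTEGER PRIMARY KEY, Name TEXT,
    PhoneWork TEXT, PhoneCell TEXT)""")

# Approach 2 (normalized): a separate Phones table related by PeopleID,
# so any number of phones per person can be added as rows.
cur.execute("CREATE TABLE People (ID INTEGER PRIMARY KEY, Name TEXT)")
cur.execute("""CREATE TABLE Phones (
    PeopleID INTEGER REFERENCES People(ID),
    Label TEXT, Number TEXT)""")

cur.execute("INSERT INTO People VALUES (1, 'Ann')")
cur.executemany("INSERT INTO Phones VALUES (?, ?, ?)",
                [(1, "Work", "555-0100"), (1, "Cell", "555-0101")])

# Reporting on "all phone numbers" is a single query, no matter
# how many kinds of phone exist.
rows = cur.execute("""SELECT p.Name, ph.Label, ph.Number
                      FROM People p JOIN Phones ph ON ph.PeopleID = p.ID
                      ORDER BY ph.Label""").fetchall()
print(rows)  # [('Ann', 'Cell', '555-0101'), ('Ann', 'Work', '555-0100')]
```

In FileMaker terms, the second approach corresponds to a related table shown through a portal, which is why reporting gets easier.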
As far as performance goes, there is probably a small hit when the database engine has to retrieve data from other tables, but unless you're working with very large numbers of related records and a lot of tables, I don't think it is a deal breaker.
Thanks for your reply.
Yes in the case of situations like your examples there is no question, I agree with your preference. I am a HUGE fan of flexibility where it doesn't get in the way.
Do I understand correctly that you think the performance hit from separating data into two tables is more significant than the hit from loading more data than necessary, which the fewer-but-larger-tables solution forces?
In my particular case, I have a set of data that could be represented in a relatively fixed number of fields, though the fixed number of extra fields (those I'm planning to store in a separate table related by ID) would be in the hundreds; essentially a single table, split into a second one after the commonly needed data. It is worth pointing out that this less commonly needed data will only be accessed from one unique layout (or two). This is certainly not a question of hitting FM limitations, just of what makes interaction with the DB fastest and most efficient.
Please also note that my particular DB isn't likely to see much difference in performance either way, due to its relatively small size; the question of performance and optimization just occurred to me as I was working. That said, if my implementation is as successful as I hope, that "smaller size" could change quickly.
Nov 18, 2011 9:48 AM (in response to vadev)
I think an FM database is no different from any other DB. In that sense, I would strongly suggest following normalization at least up to 3NF. Beyond that, the approach depends on the application's complexity: a simple application has no need to follow normalization strictly, but for medium and large applications with multiple users, normalizing will definitely help performance.
Based on the scenario you described, I wouldn't expect much of a difference in performance if the file is opened locally or via a fast LAN. However, if FileMaker Pro (or FileMaker Go) has opened the file remotely via a WAN, you may see significantly better performance by using more, smaller tables.
When FileMaker retrieves any part of a record from the host, it retrieves the data for all of the non-container fields for that record. Even if a layout only contains a single field, FileMaker will retrieve every field for the record and cache the contents. Putting the less-commonly needed data in a separate table can prevent that data from being transferred until it is needed.
This type of optimization is likely more trouble than it's worth for a system deployed on a LAN, but it can make a big difference as the network latency increases with slower WAN connections.
If you are developing something for WAN deployment, fields are expensive. For every record that FileMaker accesses, it pulls down all the data from that record. For found sets, it pulls down up to 25 records at a time. If some of that data is in another table, FileMaker only pulls down the data from that related record if it is shown on a layout or accessed some other way (like in a calc).
I suppose there can be arguments both ways. I personally try to find a balance between flexibility and performance. Take People, for example: I usually have a People table (ID, Name, etc.) and an Attributes table (address line with a label, phone with a label, email, etc.). By placing all of these attributes into a single table, reporting is easier and I only need a limited number of fields; instead of 9, 10, or more fields, in this scenario I only need 4 or 5. Sorting and conditional formatting give me the flexibility and power to display it however I want (or I can get really fancy and use a web viewer).
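A minimal sketch of that attributes pattern, again using SQLite as a stand-in for a FileMaker schema (names and sample data are mine, not the poster's): each attribute becomes a labeled row rather than its own field.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# One narrow Attributes table instead of many columns on People:
# each row is (owner, label, value).
cur.execute("CREATE TABLE People (ID INTEGER PRIMARY KEY, Name TEXT)")
cur.execute("""CREATE TABLE Attributes (
    PeopleID INTEGER, Label TEXT, Value TEXT)""")

cur.execute("INSERT INTO People VALUES (1, 'Ann')")
cur.executemany("INSERT INTO Attributes VALUES (?, ?, ?)", [
    (1, "Email",      "ann@example.com"),
    (1, "Phone Home", "555-0100"),
    (1, "Address",    "1 Main St"),
])

# New kinds of attributes are new rows, not new fields, and an
# "everything we know about Ann" report is one sorted query.
report = cur.execute("""SELECT Label, Value FROM Attributes
                        WHERE PeopleID = 1 ORDER BY Label""").fetchall()
print(report)
```

The trade-off is the usual one for this kind of attribute table: display and validation need more work (sorting, conditional formatting, or a web viewer, as mentioned above) because the schema no longer encodes which attributes exist.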
Every situation may be slightly different. You need to consider the environment, deployment, users, data load, and network performance. Test it, test it, and test it again. As I once said to Mark Richman, we need to get from database theory to understanding your database clearly.
Greg makes a good point. Local vs Internet can swing the decision.
In any case, you've got to try to quantify your net savings, then make a judgement call. For example (using these numbers I found lying on the kitchen floor), moving to split tables may save you half a second of load time on the popular layouts but cost you five seconds on the "unique" layouts. Will those "unique" layouts be accessed 10x less often than the others? Will users really notice a half-second savings? Is it worth your developer time? Etc.
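That judgement call can be roughed out with simple arithmetic. The numbers below are the made-up ones from the example above plus a hypothetical 20:1 usage ratio, not measurements:

```python
# Made-up numbers from the example: splitting tables saves 0.5 s on the
# popular layouts but costs 5 s on the seldom-used "unique" layouts.
saving_per_popular_visit = 0.5   # seconds saved per popular-layout load
cost_per_unique_visit = 5.0      # seconds lost per unique-layout load

# Hypothetical usage: popular layouts opened 100 times a day,
# unique layouts 5 times a day.
popular_visits, unique_visits = 100, 5

net_seconds_saved_per_day = (saving_per_popular_visit * popular_visits
                             - cost_per_unique_visit * unique_visits)
print(net_seconds_saved_per_day)  # 25.0 -> the split wins, but barely
```

If the result is near zero, the deciding factor is probably the one raised above: whether users will even perceive the half-second difference.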
If you use a related table with a million records, it can also be slow to get the records into the portal, since FileMaker has to use the relationship, load the index, and actually find the records. In some cases it can be a great speed boost to put several related records' information into a single text field, built by a trigger that collects and lists that information, and then have an update script change that single field whenever the related data changes. But of course it depends on how the information will be used.
If you are showing a portal with related records, FileMaker has to download the data and structure of all those records (25 at a time) anyway, so there is little difference performance-wise in a situation like that. This is where the discussion goes from structure to design. Developing for WAN, although best practices for WAN should usually be employed in all development, requires that we design the interface to limit unnecessary network chatter. That usually leads to increased performance.
Wow, a portal with a million records. That's a lot of data to show in a portal. I know you were probably just using that as an example of what may cause a slow down, but if there are that many related records you would probably look at either not using a portal or finding a way to show a much smaller related set.
Fewer fields to load will ALWAYS be faster, but that may not be an issue on a fast LAN with a few users. However, any WAN connection or heavy load is going to pay dearly for extra fields.
Keep in mind that the 25-record chunks apply in Form View (single record!). In a tight list view it can be several times that much depending on the number of rows visible in the list; it will cache more than is visible, also in chunks.
Using related tables to off-load seldom-seen fields is good, and it's critical with long list views and WAN users. It's a bit more work, but well worth it if your network speed is ever less than instantaneous to users. Tests show that users hate visible slowdowns; what they perceive matters much more than the real speed of displays and refreshes.
Also, use stored auto-enter calculated values in place of unstored calcs in field definitions whenever possible, so the system doesn't have to perform calculations to display data (unless that is unavoidable because related values are involved).
Wow, an excellent set of responses, thank you!
Yes, Stephen, Josh, Greg, you have all hit on where I was headed and confirmed my suspicions: smaller tables = less data to transfer = faster load times. While my implementation will MOSTLY be used internally on a LAN, there will also be significant usage via WAN, from both mobile devices and outside stations (for example, when users travel home or away on business). I am certainly going to be my own worst critic here when it comes to lag time and apparent slowdowns; give me (or in this case let me build) a tool that works and doesn't make me wait. Also, "in as few clicks as possible" is my general rule. A well-coded, well-designed back end can make the user experience SO much better. In my case I am also designing for some less-than-tech-savvy users, so clarity, ease of use, and intuitive design are critical.
Putting in extra work ahead of time with things like this will save many multiples of the invested time later. Thank you all once again.
I believe the explanation you are missing is that regardless of what is actually visible on a particular layout, the entirety of the record (every single field) is loaded, in chunks as described several times above. (Apparently there is an exception for container fields, as mentioned by Greg; I did not know this!) FileMaker does not pick and choose only the fields required for a layout view to load into the cache; it gets them all, thus more data to transfer and thus longer load times. And, because every field is loaded, any calculation field in the record is also loaded and reevaluated each time that record is loaded. Stephen thus suggests that your calculations fields hold stored data whenever possible so the calculation need not be run every time FM pulls the record.
PS... "FileMaker" is not in the spell-check dictionary?! haha.
The first part of your post is correct, but this is not:
"And, because every field is loaded, any calculation field in the record is also loaded and reevaluated each time that record is loaded. Stephen thus suggests that your calculations fields hold stored data whenever possible so the calculation need not be run every time FM pulls the record."
Calculations are either stored or not. Stored calculations are evaluated at record creation or when one of the fields referenced in the calculation changes. Stored calculations are NOT reevaluated each time that record is loaded. Unstored calculations are entirely different. All that is stored is the calculation expression, no data. When the calculation is "called" (by viewing it, or referencing it in a sort, find, other calc) then the calc is evaluated.
This can be bad when unstored calcs are shown in a list, table, or portal view and 25 calcs need to be evaluated when loading a layout. Stephen is suggesting storing those values as text/number fields with an auto-enter calc to prevent that. That solution raises its own problems, though: the calcs are unstored for a reason, so they won't update correctly without outside influence.
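The stored-versus-unstored distinction can be sketched outside FileMaker too. In rough Python terms (purely illustrative, not FileMaker's internals), a stored value is computed at write time and just read back, while an unstored calc is re-evaluated on every view:

```python
class Record:
    """Illustrative only: mimics stored vs unstored calc behavior."""

    def __init__(self, price, qty):
        self.price, self.qty = price, qty
        # "Stored" calc: evaluated now, at record creation/edit,
        # and saved as plain data.
        self.total_stored = price * qty
        self.eval_count = 0  # counts unstored evaluations

    # "Unstored" calc: only the expression exists; it is
    # re-evaluated every time the value is viewed.
    @property
    def total_unstored(self):
        self.eval_count += 1
        return self.price * self.qty

r = Record(price=10, qty=3)
_ = r.total_stored    # reads cached data, no evaluation
_ = r.total_unstored  # evaluates
_ = r.total_unstored  # evaluates again
print(r.eval_count)   # 2
```

It also shows the staleness problem mentioned above: if `price` changed without the "auto-enter" step re-running, `total_stored` would silently keep its old value.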
Also, keep in mind FileMaker caches a lot of data, so the initial load of a set of records may be slow, but after that it's just zippy. Users are more patient at certain points, like on startup. In some solutions, I load the commonly used layouts and a default set of records in the On Open script. Then, when users switch to that layout for the "first" time, the data is already there, cached.
Apologies, I should have been clearer in my response; thank you for adding to it.
And I like your idea of loading commonly needed resources OnOpen... though I imagine data-sensitive mobile users would appreciate a healthy dose of consideration here. But for workstations, and for the added performance, it's a tempting idea.