Fast list view & sorting vs. redundant data - what's your approach?

Discussion created by thirdsun on Jun 15, 2012
Latest reply on Jul 6, 2012 by thirdsun



I'm interested in learning how others approach the problem of working with large sets of related records in a WAN-hosted solution. It should be noted that this particular DB is still on FM11 - we also work with FM12 in other solutions, but this post is not about the infamous scrolling issues some of you may have experienced with v12.


As an example I'd like to use a table in our solution that tracks stock movements / transactions by collecting sales, in- & outgoing shipments and so on. You can imagine that there are a lot of records - those are presented in a list view that mostly shows found sets as specified by the user. Users often use search criteria that are hard to predict or pre-define, as they would need to cover a wide range of search filters and sort orders. Therefore I prefer a dynamic approach and let the user decide how to filter and sort the records.


In the past, I stored data relevant to searching, sorting, grouping and aggregating operations within the Transaction table via lookup values. This included data like product names, manufacturers, order options etc. - of course I know that this isn't the way to model a database, since that kind of data should be left in its respective tables in order to reduce redundancy and maximize data integrity. I'd like to get rid of this approach, but what is the alternative? Displaying related data like manufacturers, product names or store names causes the list to be almost impossible to scroll. At this point I should probably mention that all my data is indexed. Furthermore I rarely use calculations at all (that's covered by updating fields via script triggers), but when I have to, they are stored/indexed calculations. I prefer the idea of integer values and summary fields, so that I'm able to get totals from the same table based on the relationship and TO I'm using to query - this is very flexible.
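To make the trade-off concrete, here is a minimal sketch of the lookup approach in SQL terms (using Python's sqlite3; the table and column names are made up for illustration, not taken from the actual solution). The transaction row carries a redundant copy of the product name, refreshed at write time, which is roughly what a FileMaker lookup field does:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, manufacturer TEXT);
CREATE TABLE txn (
    id INTEGER PRIMARY KEY,
    id_product INTEGER REFERENCES product(id),
    qty INTEGER,
    product_name TEXT   -- redundant copy, kept only for fast local sort/search
);
""")
con.execute("INSERT INTO product VALUES (1, 'Runner X', 'Acme')")

# On insert, the lookup copies the current product name into the row:
con.execute("""
    INSERT INTO txn (id_product, qty, product_name)
    SELECT 1, 5, name FROM product WHERE id = 1
""")

# Sorting and searching by the local copy never touches the related table:
rows = con.execute("SELECT id, product_name FROM txn ORDER BY product_name").fetchall()
```

The price, as noted above, is that the copy goes stale whenever the product record changes, so every edit to `product.name` would have to be propagated back into existing transaction rows.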


The related data is indexed too and, as you might imagine, connected via foreign keys like id_product, id_store, id_lineitem and so on - no fancy stuff here. Yet scrolling is very slow once something as simple as the product name from the Product table is shown in the list. The same applies to sorting by related fields. There's no point in storing that data twice - well, apart from the performance boost. This however would ignore the purpose and idea behind relational databases and normalization.
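The normalized alternative looks like this (again a sqlite3 sketch with hypothetical names): the transaction row holds only the foreign key, and showing or sorting by product name requires resolving a join for every visible row. A server-side SQL engine does that in one pass on the server; a FileMaker client, by contrast, has to pull the related records over the WAN, which is where the scrolling cost comes from:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE txn (id INTEGER PRIMARY KEY, id_product INTEGER, qty INTEGER);
CREATE INDEX idx_txn_product  ON txn(id_product);  -- indexed foreign key
CREATE INDEX idx_product_name ON product(name);    -- index on the sort field
""")
con.executemany("INSERT INTO product VALUES (?, ?)",
                [(1, 'Walker'), (2, 'Runner'), (3, 'Boot')])
con.executemany("INSERT INTO txn (id_product, qty) VALUES (?, ?)",
                [(2, 5), (1, 1), (3, 2)])

# Sorting the list by the related field forces a join per row:
rows = con.execute("""
    SELECT t.id, p.name
    FROM txn t JOIN product p ON p.id = t.id_product
    ORDER BY p.name
""").fetchall()
```

Both foreign key and sort field are indexed here, matching the setup described above; the indexes don't help FileMaker avoid fetching the related rows to the client.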


I should add that every operation that involves only fields from the Transaction table is as fast as expected. We're talking about 200,000 and more records that are almost instantly sorted and aggregated. In fact, even aggregating those movements across the various sizes (almost 20 per product; the products are mostly shoes) from the Product table works as expected. A product's stock for every store and every size is returned in about 1-2 seconds over WAN (about 10 MBit/s) by aggregating every in- & outgoing movement since the last inventory count (once a year).
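That stock aggregation can be sketched as a single grouped sum over the movement table (sqlite3 again, with illustrative names; signed quantities stand in for in- & outgoing movements, and the date filter plays the role of "since the last inventory count"):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE movement (
    id INTEGER PRIMARY KEY,
    id_product INTEGER,
    id_store   INTEGER,
    size       TEXT,
    qty        INTEGER,   -- positive = incoming, negative = outgoing
    moved_on   TEXT
);
""")
con.executemany(
    "INSERT INTO movement (id_product, id_store, size, qty, moved_on) VALUES (?,?,?,?,?)",
    [
        (1, 1, '42', 10, '2012-02-01'),
        (1, 1, '42', -3, '2012-03-01'),
        (1, 1, '43',  4, '2012-03-05'),
        (1, 1, '42',  2, '2011-12-01'),  # before the last count, excluded
    ])

last_count = '2012-01-01'

# Current stock per store and size = sum of all signed movements
# since the last inventory count:
stock = con.execute("""
    SELECT id_store, size, SUM(qty)
    FROM movement
    WHERE id_product = 1 AND moved_on >= ?
    GROUP BY id_store, size
    ORDER BY id_store, size
""", (last_count,)).fetchall()
```

This works fast precisely because everything needed lives in one table with indexed, simple fields, which matches the observation above that single-table operations perform fine.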


Well, back to the performance problems - I'd really like to hear how you handle those situations where large lists with related data are sorted, searched and aggregated. I can't imagine that I have to choose between a clean DB structure and satisfying performance. What is your solution to get both?


Thanks for your time and input.