1 2 Previous Next 23 Replies Latest reply on Jan 24, 2014 11:10 AM by oldal41

    Do Number fields work faster that Text fields for searching and in relationships

    AlanStirling

      Hi all 38,000 Technetters!

       

      Question: Do you expect fields set as number, when used in searches and/or relationships to perform faster than fields set as text?

       

      Does anyone have any accurate practical test results to support this supposition?

       

      From my point of view, based on theory, I believe that there is no difference in the size of the full index of a text field or a number field with the same number of characters entered into each. FileMaker stores numbers as characters, in exactly the same form that they were entered by the user (hence: 'Display as entered'), so if one indexed text field holds 'ABC' and another indexed number field holds '123', why should any searches or other operations be faster when using the number field? If you view the index of each of these fields, you will see a list of the characters entered, whether letters or numbers. Why should there be a speed difference in searching text or number fields?

       

      Perhaps someone can clarify the principles involved and others may be able to show practical results, demonstrating the speed benefit of using number fields (that has already been championed again earlier today in another thread).

       

      Best wishes - Alan Stirling, London UK.

        • 1. Re: Do Number fields work faster that Text fields for searching and in relationships
          jbante

          Yes. I don't claim to understand why, but in my testing, number fields have better find performance. When FileMaker Go was first released, I started comparing different UUID functions. I posted an update on those results after FileMaker 12 introduced the Get ( UUID ) function. Number fields were still faster when performing finds. (This is why I'm sticking with my numeric UUID functions for now.) The file size was significantly smaller, too, implying more compact field storage, more compact indexing, or both. (This included numeric data containing equivalent information to hexadecimal strings, where the numeric data use more characters to display with base 10.) I don't think I've compared file sizes for each with indexing turned off vs. on, which would be informative. Since my testing was focused on finding good primary key values, I only tested indexed fields. The equivalent information stored in a text field at base 65535 approached the storage performance of numeric data, but not the find performance.

          • 2. Re: Do Number fields work faster that Text fields for searching and in relationships
            PeterWindle

            It would make sense to me that a field which can store only numeric characters (0 through to 9) would be much quicker than a text field which would be able to store a very large amount of character types, the whole unicode set, I would imagine!

            • 3. Re: Do Number fields work faster that Text fields for searching and in relationships
              AlanStirling

              Hi Peter

               

              What I am saying is that FileMaker stores numbers using the same space on disk as any other (low ASCII) character, so "123" takes up exactly the same storage area on disk as "ABC". In Hex, '123' is '313233' and 'ABC' is '414243' - each is three bytes.

               

              So knowing this, why should number operations be faster than any character operations?

               

              Does anyone have any practical test results that can prove this one way or the other?

               

              Best wishes - Alan Stirling, London UK.

              • 4. Re: Do Number fields work faster that Text fields for searching and in relationships
                PeterWindle

                I get what you're saying, but surely filemaker would have dsigned the field type to accommodate only a certain range of Hex character though, right? Therefore, this would change the index behaviour?

                 

                 

                Looking at filemaker's tech specs:

                 

                • Text: Up to 1 billion characters per field per repetition (limited by available memory) with optional text style runs and paragraph runs. Index is based on the first 100 characters of each word or value.
                • Number: Support values from 10^-400 up to 10^400 and the negative values of the same range. Index based on the first 400 significant digits. Up to 1 billion characters per field. The first 400 digits are indexed.

                 

                There are some obvious differences in the design of these field types, even though the character limits are the same...I suppose this is what makes all the difference??

                • 5. Re: Do Number fields work faster that Text fields for searching and in relationships
                  comment

                  PeterWindle wrote:

                   

                  surely filemaker would have dsigned the field type to accommodate only a certain range of Hex character though, right?

                   

                  You may have noticed that Filemaker will happily accomodate any type of text in a number field - at least as far as it concerns storing. When it comes to indexing, the story is completely different.

                  • 6. Re: Do Number fields work faster that Text fields for searching and in relationships
                    jbante

                    AlanStirling wrote:

                     

                    What I am saying is that FileMaker stores numbers using the same space on disk as any other (low ASCII) character, so "123" takes up exactly the same storage area on disk as "ABC".

                    Perhaps without indexing turned on. With indexing turned on, digits (more even, up to a point) stored in an indexed number field will definitely result in a smaller file size than storing the same digits in an indexed text field.

                    • 7. Re: Do Number fields work faster that Text fields for searching and in relationships
                      Stephen Huston

                      I believe FM uses different indexing for text and numbers. Numbers are indexed as a single string, being one number. I believe text fields are indexed both at the field and word level, though it's not always clear exactly how these are different.

                       

                      When setting field indexing in the field definition, you have different options if the field is a number than if it is text. Dates are also different than text. With dates and numbers indexing is either On of Off. With text you can set it to partial or full indexing. I suspect that is why text indexes may be larger.

                      • 8. Re: Do Number fields work faster that Text fields for searching and in relationships
                        alistairhay

                        General curious question about this;

                        Why is there a difference between text and number for the number of characters/digits indexed?

                        That is, why first 100 characters for text vs first 400 significant digits for number ?


                        If, as Alan ascertains, '123' = 'ABC' there shouldn't be the need for the discrepancy in index lengths.

                        • 9. Re: Do Number fields work faster that Text fields for searching and in relationships
                          ch0c0halic

                          They aren't the same.

                           

                          Text is indexed as using weighted Unicode so language specific characters are correctly identified. "ABC" in English is not the same in Japanese, or Thai, etc.

                           

                          Numbers are just number.

                          • 10. Re: Do Number fields work faster that Text fields for searching and in relationships
                            oldal41

                            FileMaker Pro 11 Advanced

                             

                            iMac Mid 2011,  3.4 GHz Intel Core i7 16 GB 1333 Mhz DDR3

                             

                            2,500,000 unsorted Records SSN = 9 Numbers Alpha = SSN with 0 = a; ... 9 = j

                             

                            Sort test results:

                             

                             

                            Script Used: 

                             

                             

                            Sort and Time

                            Unsort Records

                            Set Variable [ $$StartSSN; Value:Get ( CurrentHostTimeStamp ) ]

                            Sort Records [ Specified Sort Order: Number v Letter Sorts::SSN; descending ] [ Restore; No dialog ]

                            Loop
                            Pause/Resume Script
                            [ Duration (seconds): 1 ] ExitLoopIf[Get(SortState) = 1]

                            End Loop

                            Set Variable [ $$EndSSN; Value:Get ( CurrentHostTimeStamp ) ]

                            Set Variable [ $$TimeSSN; Value:$$EndSSN - $$StartSSN ]

                            #

                            Unsort Records

                            #

                            Set Variable [ $$StartAlpha; Value:Get ( CurrentHostTimeStamp ) ]

                            Sort Records [ Specified Sort Order: Number v Letter Sorts::Alpha; ascending ] [ Restore; No dialog ]

                            Loop
                            Pause/Resume Script
                            [ Duration (seconds): 1 ] ExitLoopIf[Get(SortState) = 1]

                            End Loop
                            Set Variable
                            [ $$EndAlpha; Value:Get ( CurrentHostTimeStamp ) ] Set Variable [ $$TimeAlpha; Value:$$EndAlpha - $$StartAlpha ]

                             

                            Looks to me that thadata type does not matter that much.

                             

                            • 11. Re: Do Number fields work faster that Text fields for searching and in relationships
                              AllegroDataSolutions

                              Yes. In my experience FMP (as well as every other database I have ever worked with) can perform quicker with numbers if your choice is using a number field or a text field that will hold numbers. Numbers sort first in ASCII and other code systems, so theoretically they can be found faster. Indexes that are limited to a smaller character set like 0,1,2,3,4,5,6,7,8,9 should also improve the speed. There may be an advantage to using numbers as key field values to avoid situations where two records could have the same value, too. Since relationships in FMP are based on indexes, I suppose a case could be made for the relationship being "faster" as well.

                               

                              That said, lots of other issues affect the actual and perceived performance of a solution. Hardware, software, how you use relationships, etc. In small databases these will have much more of an impact than your choice of field types.

                               

                              When I first started out as a developer, I primarily worked on updating or rebuilding solutions that had been created by others. Based on my observations of what the best practices seemed to be, I established a few simple rules for myself, which have served me well over the years. From time to time I have even come across books and software manuals that advised this approach, or some variant of it, as well:

                               

                              - Use serial numbers for primary keys in relationships, unless there is a compelling need not to.

                               

                              - With regard to other fields which will only contain numbers, use the number field type for keys, or if you will need to do math on their values, or if you will need to search on them a lot. Otherwise, use text fields.

                               

                              Of course, there will always be exceptions, but the above rules of thumb are a good place to start. When I find that I don't have a good reason to break them, it's usually an indication that I should stick to those rules.

                               

                              I hope this helps.

                              • 12. Re: Do Number fields work faster that Text fields for searching and in relationships
                                breezer

                                Based on my experience and also from Jon Thather DevCon presentation on "Under The Hood: Pro and Server Performance" I would say number fields are better than text fields for searching and for relationships. Somewhere in Jon's presentation, he summarises it suscintly: "Each digit in a number field  = 1/2 byte in index entry whereas each character in text field  = 2 bytes in index entry. Larger index = Larger index to read from disk"

                                • 13. Re: Do Number fields work faster that Text fields for searching and in relationships
                                  Stephen Huston

                                  Thanks for the info from Thayer. That means a text index on the same number of characters will by 4X as large as a number index. -- Worth knowing.

                                  • 14. Re: Do Number fields work faster that Text fields for searching and in relationships
                                    DavidJondreau

                                    Just an Unsort[] may not be a good test. You should probably close and reopen the file between tests. FileMaker may be caching all the record data when doing the first sort, making the second run faster than it would without the cache.

                                    1 2 Previous Next