11 Replies Latest reply on Nov 29, 2011 1:32 AM by davehob

    Key formatting

    davehob

      I sometimes get problems with Finds etc. which would be avoided if they keys in the solution were formatted differently. For example, in a "people" table, the primary key was initially designed as an auto-enter serial no., starting with PPL1. We're now up to PPL9137, and in some situations I've been tripped up by, say, PPL913 appearing in a list, and a PatternCount returning 2. Obviously this can be avoided, by using "==" in Finds, including a carriage return when looking for keys in lists, etc., but it strikes me that this would all have been easier if the key had been defined in the first place as starting from "PPL000001", so that all values would always be unique.

       

      But then I wondered if there's an argument AGAINST doing it this way. Obviously there's the overhead of storing those leading zeroes, and it's also imposing an upper limit on the number of records. But, assuming that I'm comfortable with these considerations, is there anything else I should be worried about, before I set about converting the existing data?

       

      Dave.

        • 1. Re: Key formatting
          datastride

          Dave,

           

          I put two fields in each parent and child table:

           

          1. A numeric auto-assigned serial number field called “#”. (Oh, and I’ve never had a problem with this field name, even in CWP.)

           

          2. A calculated field “__kp_ID” defined as SerialIncrement( “Cont000000000” , # ), where the first 4 characters identify the table (in this case, “Contacts”).

           

          I use nine zeros, giving me a maximum of one record shy of a billion. (Should be enough!) I find this gives me all the flexibility I need for even very sophisticated applications. And the extra overhead of a few zeros is a drop in the ocean for the typical production database.

           

          Peace, love & brown rice,

          Morgan Jones

           

          FileMaker + Web:  Design, Develop & Deploy

          Certifications: FileMaker 9, 10 & 11

          One Part Harmony 

          Austin, Texas • USA

          512-422-0611

          • 2. Re: Key formatting
            Mike_Mitchell

            Dave -

             

            At the risk of starting the old "right way to do the key field" flame war again ...        

             

            There are lots of ways of doing a key field, from a simple serial number all the way up to a unique ID validated against a timestamp in base 36. It all depends on what you want.

             

            My personal "normal" method is to concatenate the record creation timestamp (as a number) together with the record ID of the record being created. (Based on a suggestion from Winfried Huselik {sp?}).

             

            I use a custom function for this purpose, but it's really very simple:

             

            GetAsNumber ( Get ( CurrentTimestamp )) & Right ( "0000" & Get ( RecordID ) ; 5 )

             

            calculated as a number. You'll wind up with something that looks like this: 

             

            6343993477500019.

             

            Again, just another way to do it, but it has some advantages over serial numbers:

             

            1. It's an auto-enter, so you don't have to worry about re-serializing the database if you import new records or you import production records into a development copy (which I end up doing a lot because of our software control procedures here).
            2. It's numeric, so it indexes faster.
            3. You can extract the creation timestamp if you want to (grab everything but the rightmost 5 characters, run it through GetAsTimestamp, and there you are).
            4. It's "practically unique", meaning that, for most applications, you can pretty much guarantee its uniqueness
            5. You don't have to worry about exact matches when you search (because it's numeric)
            6. There is no upper limit, because if the numbers roll around, then the timestamp will be different, so it still doesn't duplicate

             

            There are, however, disadvantages:

             

            1. A lot of nifty tricks you can do with text-based keys don't work with numeric keys (like return-delimited lists)
            2. It's long, and somewhat cryptic
            3. It doesn't embed the source record type, so you can't tell from looking at the key what kind of record it is

             

            To each his own; whatever works for your application!

             

            HTH

             

            Mike

            • 3. Re: Key formatting
              CarstenLevin

              Well, well, I know that this will cause some of you to find me very very conservative

               

              But, what is wrong with a simple number starting with 1, incrementing with 1 = 1, 2, 3 etc. Formatted as a number to make sure that 112 is not a part of 1112.

              And I would strongly suggest that you do not put any kind of meaning into the key.

              And I would keep any other metadata about the record in separate fields. Created TimeStamp, Created by etc. etc.

               

              Some rules to keep when deciding on how to format your primary keys:

              • The primary key must uniquely identify each record.
              • A record’s primary-key value can’t be null.
              • The primary key-value must exist when the record is created.
              • The primary key must remain stable—you can’t change the primary-key field(s).
              • The primary key must be compact and contain the fewest possible attributes.
              • The primary-key value can’t be changed.

               

              As in this example:

               

              Primary Key

              The field coud be named __kpln_person

              Name_FirstName_last
              1JohnDoe
              2Jeanned'Arc
              3JohnSmith
              4NielsHansen
              5KarlSchmiedt

               

               

              And as suggested in these quotations:

              • The AutoNumber datatype automatically increments the field each time you create a new record. While the number itself is meaningless, it provides a great way to reference an individual record in queries.

              or

              • As such, the primary-key value has no meaning to the user and no association to the data to which it’s assigned.

               

              Also have a look at this Techrepublic article or this one from DBMS

              • 4. Re: Key formatting
                datastride

                Carsten,

                 

                One reason to use a text field as a primary key is that you can do some powerful relationship stuff that isn't available with numeric keys. For example, you can have a single foreign-key field in a child record that can contain multiple keys, each from a different table (or different records in the same table).

                 

                I have a file that indexes a SuperContainer storehouse of documents, files, images, etc. Sometimes I need to link an image to (1) a Contact (photo of storefront), (2) an incoming email (where the image was an attachment), (3) a Subscription record (the store in which a service is installed), etc. And the list of tables to which an image can be linked can grow as new modules are added, but I never have to add a new key field to the child table.

                 

                Or say I have a workflow system where the basic element is a Task. A Task might come from (1) an Invoice, (2) a Sales Order, (3) a Contact, (4) a Repair Order, (5) a Tech Support Issue, etc. Again, a single field in each Task record can contain multiple foreign keys, formatted as a value list (e.g. “Invc000001234¶Cont000098765¶Sale000053876”). Thus the key for a Contact will not match any record in the Sales Order table, etc. This approach can really simplify things a lot …

                 

                But if you don’t need this flexibility, then a number field will work just fine. That’s why I always create two keys for each record: one as a number and one as text. I probably use the number version 90% of the time, but the other comes in very handy when needed.

                 

                Peace, love & brown rice,

                Morgan Jones

                 

                FileMaker + Web:  Design, Develop & Deploy

                Certifications: FileMaker 9, 10 & 11

                One Part Harmony 

                Austin, Texas • USA

                512-422-0611

                • 5. Re: Key formatting
                  Mike_Mitchell

                  Dave -

                   

                  See what I mean about there being lots of ways to do this?    

                   

                  Mike

                  • 6. Re: Key formatting

                    Sorry Mike, there is an error in your calculation.

                     

                    Please refer to <http://fmdiff.com/fm/serialnumber.html> for the correct version.

                    Get ( CurrentTimeStamp " returns a TEXT result, that's why you have to bracket it with GetAsNumber ().

                     

                     

                     

                    Morgan,

                     

                    There is no need to use a text key to match a text field with multiple entries (that was many years ago).

                     

                     

                    Winfried

                     

                    --

                    Check out the new FMDiff 2.0 to compare two FileMaker files and test for file corruption at <http://www.fmdiff.com

                    • 7. Re: Key formatting
                      Mike_Mitchell

                      Winfried -

                       

                      Uh ... I did bracket it with GetAsNumber.   

                       

                       

                      Here's what I wrote:

                       

                       

                           GetAsNumber ( Get ( CurrentTimestamp )) & Right ( "0000" & Get ( RecordID ) ; 5 )

                       

                       

                      with the calc returned as a number.

                       

                       

                      Or, alternatively, you can bracket the whole thing with another GetAsNumber, if you want to be more explicit:

                       

                           GetAsNumber ( GetAsNumber ( Get ( CurrentTimestamp )) & Right ( "0000" & Get ( RecordID ) ; 5 ))

                       

                      Mike

                      • 8. Re: Key formatting
                        CarstenLevin

                        Hi again,

                        We are also haveing secondary unique keys for specail purposes. But we never compromise on the primary key: No meaning, number, increasing one by one:-)

                        • 9. Re: Key formatting
                          CarstenLevin

                          And if you want combined keys, keys as text, with timestamps or even live calculations: My advice is to do so, but always keep the real primary key naked and clean: One unique number per record.

                           

                          The reasons are so many, and there is no reason for the FileMaker community to try to reinvent the wheel, here just a few of them.

                           

                          • No meaning - absolutely no meaning to the primary key.
                          • Unique and always unique, also if you should choose to sort* or search**
                          • Untampered and never never ever changed.
                          • etc.

                           

                          *Sorting on a primary key being number stored as text will give some very surprising results.

                          Example:

                          1, 2, 3 ... 111, 112, 113 ... 121, 122, 123 ... 2200, 2201 ... 1000101, 1000102, 1000103

                          will sort like this

                          1, 1000101, 1000102, 1000103, 111, 112, 113 and "2" will come in pretty late.

                           

                          **Searching for primary keys being a number stored as text will give some very surprising results.

                          When searching for a serial number, lets say "11" in a foreign key field will return those containing 11 if the numbers are stored as ... numbers.

                          Searching for the same serial number "11" in a foreign key field with numbes stored as text coud return numbers like: 11, 110 etc. etc.

                          You can or course choose to search for exact values, but now you have added not needed complexity to a very simple task.

                           

                          But this is not the same as saying that you can not have other unique keys per record stored differently, containing whatever you want, including birthdates, timestamps or whatever with a serial number. If you can find ways to utilize them, have them.

                          And if you need to use combined (primary) keys for relations, then set them in text-fields or whatever. If it is in joint tables though, you would still want to keep each (now foreign) key in ets own field also.

                           

                          • you still want to be able to use your primary key as text, then have it set on creation in an extra unique key field as a copy of the primary key but stored as text (do not use a calculated field).
                          • If for some systems (other than FileMaker) that you work with you need a specific lenght of the key field with, for example, 12 characters and "0"'s in front. like 000000000001, 000000000002 ... 000000000012, 000000000013 etc. then also do this as a calculation setting the text field while preserving your primary key.
                          • 10. Re: Key formatting

                            Mike,

                             

                            Yes, you did (my fault), but not enough for a numeric result of the whole shebang.

                            And I've abandoned my earlier version using Right(), since it does not save a lot.

                             

                                 GetAsNumber ( GetAsNumber ( Get ( CurrentTimeStamp ) ) & Get ( RecordID ) )

                             

                            Remember: key fields containing text can be unique for about 100 characters, while numeric keys can hold 400 digits. That's why numeric keys are more efficient.

                             

                             

                            Carsten,

                             

                            As a general rule: NEVER do a Find or a Sort on a key field whatsoever. If you find yourself doing so, there is something plain wrong with your logic. Keys and Find utilize distinct indices (value and word index). While a value index uses the first 100 bytes of a field, regardless whether printable or not, the word index uses words separated according to the index language rules.

                             

                            Please see <http://fmdiff.com/fm/recordindex.html> for an in depth explanation.

                             

                             

                            Winfried

                            • 11. Re: Key formatting
                              davehob

                              Well, that gave rose to more discussion than I expected!  Thanks for your guidance and an interseting exchange.  (It's raised questions as well as answers, particularly Winfried's advice never to do a Find on a key field on a key field.  That happens all the time in the solution I'm working on, which suggests I have issues of which I was unaware, and I'm already a bit worried about how to avoid it.  Presumably GTRR is the correct way most of the time, but I can think of situations where that won't work.  Ah well, upwards and onwards!)

                               

                              Thanks again for the input.

                               

                              Dave.