4 Replies Latest reply on Oct 19, 2010 10:12 AM by keycoachjohn

    Middleword and word separators





      I've read a few posts on the board about different characters acting as word separators.

      Using FM10: my task is to import a long string of words and separate them into individual fields (8-10). For 75% of records this translates perfectly. When a hyphen has a numeral before and after it, FM treats the whole thing as one word and all is well with my calc fields.

      Within the import file, there are cases where a hyphen or space is preceded and/or followed by a letter, and FM returns two words: MD-11, CRJ-700, etc.

      Any simple suggestion for my FM calculation to link the "letter hyphen number" into one word? It's nestled at the third middle word. This condition affects a minority of line items, but hand editing the 10,000+ lines in Excel can get error-prone. Thanks
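
      One way to find the affected lines before parsing, as a rough sketch only: compare FileMaker's own word count with a plain split on spaces. This assumes the imported text sits in a field called string (a placeholder name) and that items are separated by single spaces. The variable names fmWords and spaceItems are made up for illustration.

      Let ( [
        // words as FileMaker counts them (a letter-hyphen-number item counts as two, per the description above)
        fmWords = WordCount ( string ) ;
        // items separated by spaces only, after trimming leading and trailing spaces
        spaceItems = ValueCount ( Substitute ( Trim ( string ) ; " " ; ¶ ) )
      ] ;
        // returns 1 (true) when the two counts disagree
        fmWords ≠ spaceItems
      )

      Other punctuation can also make the two counts differ, so a 1 marks a record worth checking, not a confirmed case.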

        • 1. Re: Middleword and word separators

          You could substitute § for your hyphens and then substitute again to put the hyphens back. This will cause all hyphens to be word separators so you'd also have to test for § while parsing the words so that the script can parse the next two words in those cases instead of one, then replace it with a hyphen.

          Anyone know of a different character that can be used this way but which doesn't break the words? (That'd be much simpler.)

          • 2. Re: Middleword and word separators

            How about looking at it from the other direction?  If you know which word separators you do want to use, set up a Substitute function to split the lines up into multiple values, then translate the values into fields.

            • 3. Re: Middleword and word separators

              If you can solve it with your eyes then you can solve it with a calculation.  You have a space between each 'word' in the string, right?  Then, as etripoli suggests, turn each space into a carriage return and set the fields with the values.  So it might look something like this:

              Import your string into one text field. Then use Set Field [] to set each field with this calculation:

              Let (
              values = Substitute ( string ; " " ; ¶ ) ;
              GetValue ( values ; 1 )
              )

              The 1 in GetValue ( values ; 1 ) picks the word for the field you are setting.  Change it to 2 for the second field and so on.  To verify that this gives you what you want, first create a calculation field (result is text) with the same formula and view the results.
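
              A slightly more defensive variant of the same idea, offered as a sketch and not a tested solution: it assumes the imported text is in the field string used above, trims leading and trailing spaces, and collapses a doubled space so it does not produce an empty value. The variable name cleaned is illustrative.

              Let ( [
                // strip leading/trailing spaces, then turn a doubled space into a single one
                // (one pass only; runs of three or more spaces would need the Substitute repeated)
                cleaned = Substitute ( Trim ( string ) ; "  " ; " " ) ;
                // one space-separated item per line
                values = Substitute ( cleaned ; " " ; ¶ )
              ] ;
                // 3 = third item; only spaces are split on, so MD-11 stays in one piece
                GetValue ( values ; 3 )
              )

              Because the split is on spaces alone, hyphens never act as separators here, which is why this sidesteps the MiddleWords behavior described in the question.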

              • 4. Re: Middleword and word separators

                Appreciate the feedback, interesting options to handle this.

                I used the latter solution and it worked nicely.  It handled all but a few sneaky lines from the source docs where a space should have been a hyphen, down to limited consistency on the data entry side.  Since this occurred about 10 times in the whole document, I fixed it manually and re-imported.  Thanks again.