1 of 1 people found this helpful
Having seen just a snippet, I can't give you precise instructions, but it appears you have a text parsing exercise. I would suggest a process along these lines:
1) Determine if you have a string that can serve as a delimiter between messages. Perhaps you can use something like:
If all the messages begin with that, then you can use it to mark the beginning of a message.
2) Using the Position and Middle functions, loop over the text in your Ginormous Monolithic Text Field and separate out the messages into individual records in another table. Just go from one occurrence of your delimiter to the next.
3) You can use a similar process using the string Sent: to extract the date information (i.e., anything between Sent: and the next carriage return).
The ugly part would be trying to extract a date that's embedded in the body text. This is especially true if you have a variety of formats involved. For example, if you have one person who says, "21 March 2012", another who writes "3/21/2012" and another who writes "March 21st", you'll have a mess on your hands. (I haven't seen the data, so I can't say for certain that's what you have.) But one thing you can try is use a return-delimited list of all month names, like this:
and so forth. Loop over that list and see if you can detect any instances of those words in the body text. If you find one, check to see if there are any 2- or 4-digit numbers in the range you need (13, 14, 2013, 2014). Flag those records for later examination.
Someone else may have a better idea, but trying to extract variably-formatted text from a random jumble is usually a matter of trial and error based on what's in the particular data stream.