Some ideas that pop to mind:
Try exporting from Word or the web file as tab-delimited or comma-delimited text (from Word/Excel/Numbers); you can then bring it into FM via Import.
... this is of course assuming it is a one-time import (i.e., import and dump the Word file).
If your OS is Apple, then you can use AppleScript to parse your text and import it into FileMaker. If on Windows, you will need to use its native scripting app.
It would be more helpful to let us know what your operating system is and whether this is a bidirectional (read/write) or one-directional (read and import only) task.
If you're on a Mac, there are ways to use AppleScript, as MSpsi mentioned, to extract the text from (many) PDFs or Word documents. It is not that hard, but somewhat geeky.
Another option, which is often my first choice, is to use a text editor. I use BBEdit, but there is also its free sibling, TextWrangler. Both are powerful, stable, and fairly easy to use. You would use these if the text you have is "space-separated", that is, there are multiple spaces between "columns" rather than tabs or quote-commas; or some similar problem. This is common with "tables" copy/pasted from web sites.
The advantage of text editors is that most support "grep", regular expression pattern-matching. It is not hard for simple things (I'm going to put the pattern in quotes, but in real life you wouldn't use the quotes for these):
Find: "  +" (2 spaces followed by a plus, meaning "2 or more consecutive spaces")
Replace: \t (tab)
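For anyone who would rather script this than use a text editor, here is a minimal Python sketch of the same find/replace. The function name and the sample line are made up for illustration; the pattern is the one given above (two spaces followed by a plus).

```python
import re

def spaces_to_tabs(text: str) -> str:
    """Replace every run of two or more spaces with a single tab."""
    # "  +" is two literal spaces followed by "+": 2 or more consecutive spaces.
    return re.sub(r"  +", "\t", text)

# Hypothetical line copied from a web "table"; single spaces are left alone.
sample = "Matt Aberle    Los Angeles, CA      310-717-3195"
print(spaces_to_tabs(sample))
```

Single spaces inside a value (such as "Los Angeles") are untouched, which is why the pattern requires at least two.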
There are also some free text editors for Windows. They're not quite the same, but all text editors are similar. Grep is cross-platform and fairly similar, though there are some variations.
Hey all, I am on a Mac.
I used an online converter, but all that did was create an XLS file that put all the info into one column.
It looks like this:
I also have it in a form I can use as text; I was able to copy just the text. I remember that 8 years ago there was a way you could do this, and I can be geeky (LOL), but I can't remember it, and I do not want to enter each record one at a time; that will take quite a while.
Any hand holding appreciated!!! Help, explaining, et cetera.
35Sound P O Box 217 Pacific Palisades, CA 90272-0217 PHONE 310-454-1280 FAX 310-230-0132 WEBSITE 35sound.com G. Marq Roswell MUSIC SUPERVISOR PHONE 310-454-1280 FAX 310-230-0132 EMAIL firstname.lastname@example.org Adam Swart MUSIC SUPERVISOR PHONE 310-454-1280 FAX 310-230-0132 EMAIL email@example.com A.J. Gundell Productions 80 Sugarloaf Dr Wilton, CT 06897-2119 PHONE 203-984-1409 STYLE MusicSupervisor Andy J. Gundell MUSIC SUPERVISOR PHONE 203-984-1409 EMAIL firstname.lastname@example.org Abbey Entertainment P O Box 1626 Studio City, CA 91614-0626 PHONE 818-755-3942 STYLE NoUnsolicitedPhoneCalls Please Stephen Smith MUSIC SUPERVISOR PHONE 818-755-3942 EMAIL email@example.com Matt Aberle 2418 Wild Oak Dr Los Angeles, CA 90068-2561 PHONE 310-717-3195 Matt Aberle MUSIC SUPERVISOR PHONE 310-717-3195
Here is a snapshot of what the real PDF looks like.
I would really need the actual PDF to tell you whether it could be converted to text successfully.
This is how I do it. I have downloaded and installed a small command line tool called "PDFtoText", from this page (you need to read what it says about it, such as "If you don’t know what “Terminal” is, please do not download this package").
But really, it is not that hard to use, once you put it in the correct place (which for my linked AppleScript must be: /usr/local/bin/pdftotext ). Then it can be used from AppleScript. The script must be saved as an Application (with no options checked). You can then drop PDF file(s) on it. It will extract the text, creating a text file right next to the original PDF file. This is my AppleScript droplet using it. You MUST download and install the above Unix tool first, in the location I specified above.
There is also an Apple Automator action for PDF to Text. But it does not work nearly as well as this old command line tool (go figure). There are several additional options for the tool, such as the text encoding used.
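As a rough illustration of what such a droplet does, here is a minimal Python sketch that calls the same command line tool. This is not the AppleScript itself. It assumes pdftotext is installed at /usr/local/bin/pdftotext as described above, and it uses the tool's -enc and -layout options; the function names are made up for this example.

```python
import subprocess
from pathlib import Path

# Location named in the post; change it if your copy lives elsewhere.
PDFTOTEXT = "/usr/local/bin/pdftotext"

def build_command(pdf_path, keep_layout=True):
    """Build the pdftotext command that writes a .txt file next to the PDF."""
    pdf = Path(pdf_path)
    out = pdf.with_suffix(".txt")        # same folder, same name, .txt extension
    cmd = [PDFTOTEXT, "-enc", "UTF-8"]   # -enc sets the output text encoding
    if keep_layout:
        cmd.append("-layout")            # try to preserve the column layout
    cmd += [str(pdf), str(out)]
    return cmd

def extract_text(pdf_path):
    """Run pdftotext on one PDF and return the path of the new text file."""
    cmd = build_command(pdf_path)
    subprocess.run(cmd, check=True)      # raises an error if the tool fails
    return Path(cmd[-1])
```

Calling extract_text on a PDF's path should leave a text file with the same name beside it, provided the tool is installed where PDFTOTEXT points.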
P.S. You could email the PDF to me, my first name @ my web site domain (fentonjones.com). Trying to avoid the evil spiders.
OK. I tried it. Extracting the text from the PDF was easy enough, using my AppleScript. But then, as you discovered, it's a jumble of data and spaces. I was able to use a text processor (BBEdit or TextWrangler) to gradually "munge" the text into something resembling an HTML form, i.e., LABEL: Value, each on its own line, with different "records" separated by a blank line.
This took many different grep operations, especially the darn ROLE:s. I used much the same grep expression, just pasting in whatever new "role" I saw. The addresses required a few manual additions of ADDRESS:, especially those weird English ones. So I can't really tell you all the grep used. All of it fairly simple however.
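To give a flavor of that kind of munging, here is a small Python sketch of two such passes. It is only an approximation, not the actual grep used: the keyword list (PHONE, FAX, EMAIL, WEBSITE, STYLE) and the single role are taken from the sample text posted earlier, and the function name is made up. Names and addresses would still need their own passes and some manual fixes.

```python
import re

# Assumed from the posted sample; add more keywords and roles as they turn up.
KEYWORDS = ["PHONE", "FAX", "EMAIL", "WEBSITE", "STYLE"]
ROLES = ["MUSIC SUPERVISOR"]

def label_fields(text):
    """Put each role and keyword on its own line as 'LABEL: value'."""
    # Pass 1: every known role becomes a "ROLE: ..." line.
    role_pattern = r"\s*\b(" + "|".join(re.escape(r) for r in ROLES) + r")\b"
    text = re.sub(role_pattern, r"\nROLE: \1", text)
    # Pass 2: every keyword starts a new line and gains a colon.
    keyword_pattern = r"\s*\b(" + "|".join(KEYWORDS) + r")\s+"
    return re.sub(keyword_pattern, r"\n\1: ", text)

line = "G. Marq Roswell MUSIC SUPERVISOR PHONE 310-454-1280 FAX 310-230-0132"
print(label_fields(line))
```

The patterns are case-sensitive on purpose, so a lowercase "phone" inside a value is not mistaken for a label.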
That is really the best that can be done with this kind of data (well, XML tagged would be better). It would be a real PITA to make it tabbed or comma-separated, because the data is variable. Entries have more or less data; arranging it into fixed columns would be difficult. Grep is not good at that.
FileMaker on the other hand, is pretty good at looping thru a bunch of lines, and deciding what to do with each. So I would import this entire text into a single field, could be a global field. Then a script would go thru it, line by line, using GetValue(incrementing number). If it runs into a blank line, then it's time to create a new record. It would then extract and look at the LABEL: to see what field to set the Value into.
I have such a script here somewhere. But I'd have to look for it, then adapt it to this.
I looked at my earlier (FileMaker 5) Loop parsing files, created before there was even a GetValue() function. What a bunch of work. So I created a new FileMaker 10/11 version. By using the SetFieldByName step it makes it fairly easy. Obviously the "label:"s must match the field names.
Since I have these fields, I will have to rename them to something else and then align them, correct?
What script did you use? I would like to take a look at it.
Well, you could rename them, but then you'd have to change the label:s in the text also. So unless you know how to safely do that, no. The field names MUST be exactly the same as the "label:" text in the data you're parsing. Otherwise the script will not work and will not set anything. You could just leave this file as is, for a general parsing file, and just Import the fields into your real file.
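One possible reading of "safely" is: change a label only where it actually is the label, at the start of a line and followed by a colon, and never where the same word appears inside a value. A minimal Python sketch of that idea follows; the function name and the replacement field name "Work Phone" are hypothetical.

```python
import re

def rename_labels(text, mapping):
    """Rename 'LABEL:' only at the start of a line; values are untouched."""
    def swap(match):
        label = match.group(1)
        # Labels not listed in the mapping are written back unchanged.
        return mapping.get(label, label) + ":"
    # ^ with MULTILINE anchors the match to the beginning of each line.
    return re.sub(r"^([^:\n]+):", swap, text, flags=re.MULTILINE)

data = "PHONE: 818-755-3942\nSTYLE: NO PHONE CALLS"
print(rename_labels(data, {"PHONE": "Work Phone"}))
```

A plain find and replace of PHONE would also have changed the word inside the STYLE value; the anchored version leaves it alone.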
The script is the only script in that file. It is not really a beginners' script, but it is not that complex; pretty standard for Looping thru a large block of text. It goes thru the large global text field, incrementing a counter, to get 1 line at a time. If the line is empty, then it creates a new record. If not, then it attempts to parse the line.
It first pulls out the Label: (the first text with a colon after it), and puts it into a script Variable. Then it does the same for the Value (the text after the label). It then attempts to set a field named the same as the Label:
So it does not really matter what order the labelled lines are in, nor how many of them there are per record.
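The same loop can be sketched outside FileMaker. The Python below is only an illustration of the logic described above, not the FileMaker script: a dictionary stands in for a record, and setting a dictionary key stands in for Set Field By Name. The function name and sample data are made up.

```python
def parse_labelled_text(text):
    """Turn 'Label: Value' lines into records; a blank line ends a record."""
    records = []
    current = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # Blank line: close the current record, if it holds anything.
            if current:
                records.append(current)
                current = {}
            continue
        # Label is the text before the first colon, Value is everything after.
        label, colon, value = line.partition(":")
        if not colon:
            continue  # no "Label:" on this line, so there is nothing to set
        current[label.strip()] = value.strip()
    if current:
        records.append(current)  # last record may not end with a blank line
    return records

sample = "NAME: Matt Aberle\nPHONE: 310-717-3195\n\nPHONE: 818-755-3942\nNAME: Stephen Smith\n"
for record in parse_labelled_text(sample):
    print(record)
```

Because each value is stored under its own label, the order and number of labelled lines per record can vary freely, as noted above.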
* Set Field By Name is a new script step, added in FileMaker 10. It is ideal for situations like this, parsing labelled text, then setting a field named the same. It makes for a much shorter script. You should see (not really) the FileMaker 5 version of this.
It also requires the table name for a complete path to the field (scripts always need the complete path). The script got that earlier, and put it into a $table variable, using the Get (LayoutTableName) function (you could also just hard-code it as text).
Fenton, okay, I looked at the two files - great work and on the FP7 file, I hit PARSE file and it created the 629 records.
So how would I import this into the data base I created?
If I created fields and changed the name, could I not open the txt file and do a find and replace for the names, or could I import the txt file as is? I guess I am trying to figure out how you created that PARSE file in the FP7 file.
Also, since we now have tabs, is there a way to have, say, 17-20 records total that are for songwriters, so that when I click on Music Supervisors or Contacts (or whatever I decide to call it) it shows 17 records, but when I select the CONTACT tab it says 629 (or something close to that), yet have them all open?
When you say SAFELY change the labels in the text, what is it you mean? I am assuming that you cannot just FIND and REPLACE, since you make it seem like I would not end up with the correct field names (Manage Database).
I suppose I could include my latest file and you could have a look at it.
Thanks. Let me go look for that file, and then when you have the time please check it out and tell me what you would do. I need to have a DB with A) Writers and B) Contacts, and I don't want my boss getting confused: as we build up all the songs, I want him to know exactly how many songs he has vs. what are just contact files. So I guess I am confused about how to go about doing this.
Thanks so much!
How does one add a file to a message on this board the way Fenton does?
Upload the file to a file sharing site and then post the download link here.
Hubby sending Fenton a file - for some reason the connection here is slow.
Thanks all. When I get home or in the next few days I will post a link to the file!
Hi everyone - -it's been a very busy week and I want to thank all of you for your help. The bottom line is, I have a new job and am still in that period where it's not permanent yet, so, I have been having people I know, plus a few from here trying to help me.
So, I am including the following:
Contact Management Template (a little edited, photo removed)
A music Supervisor Text File
And a PARSE Music Supervisor FileMaker file that, as soon as you hit PARSE, puts them in the file. The problem is, I need to use the Contact Management template and put the music supervisor information in there, while at the same time learning how to PARSE, as I am getting a few more production templates in PDF form (such as TV/FILM Guide, Production House Guide and so on). I really need to get how to convert PDF to TEXT and then either tab-delimit it or make an XLS and input the contact info into the CONTACT MANAGEMENT template.
I appreciate everyone's help, so thank you very much.
Here are all the files:
http://www.filedropper.com/textandfilemakerdocuments
2nd way to retrieve: