What version of FileMaker are you using? There's a new function in FileMaker 12 specifically for extracting xml data from a web site.
With older versions, IF the content is accessible, you can use GetLayoutObjectAttribute with the "content" option to extract the entire contents of the web viewer into a text field, then parse your data from that field. This requires giving your web viewer an object name, and you have to wait for the web viewer to finish updating before using this function to extract the content.
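As a rough sketch of what those steps look like in a script, assuming a web viewer given the object name "MyViewer" and a text field Table::RawContent (both names hypothetical):

```
# Give the web viewer the object name "MyViewer" in the Inspector first.
Set Web Viewer [ Object Name: "MyViewer" ; URL: Table::SourceURL ]
Pause/Resume Script [ Duration (seconds): 3 ]
# Crude wait above; the page must finish loading before the content is readable.
Set Field [ Table::RawContent ; GetLayoutObjectAttribute ( "MyViewer" ; "content" ) ]
```

The fixed pause is the weak point; a loop that checks for expected content is more reliable, as discussed later in this thread.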
I am currently using FileMaker Pro 10 Advanced.
What is the new function in 12 that addresses this issue in the Windows environment?
With my current version, because I am using Windows with Internet Explorer (required by FileMaker), the data within the Web Viewer does not appear to be accessible. I tried to use GetLayoutObjectAttribute with the "content" option to copy the contents to a text field, but it does not appear to be working: no data is ever passed to the text field. I will double-check that the web viewer has updated before the GetLayoutObjectAttribute function is called; I will insert a Pause step within the script to test this theory.
Thank you for your response Phil!
I had to look that one up. It's not quite what I thought.
It's a script step called: Insert From URL [Select; No dialog;; Resource URL]
And it isn't specifically intended for XML data; the examples illustrate inserting other kinds of data, such as a PDF from a web site, into a container field.
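For reference, a minimal use of that FileMaker 12 script step might look like the following (the field name and URL are placeholders, not from the original post):

```
# FileMaker 12 or later only.
# Pulls the raw response from the URL directly into a field -- no web viewer needed.
Insert from URL [ Select ; No dialog ; Table::Response ; "http://example.com/data.xml" ]
```

Because the response lands in a field as-is, there is no browser rendering step and no need to wait for a web viewer to update.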
As far as I know, this all works the same in Mac or Windows. The web viewer does use the "Explorer web kit" on windows and the "safari web kit" on macs, but I would not expect that to make a difference here.
I've used GetLayoutObjectAttribute on Windows machines to "scrape" data from a website myself, so I know it works, but I don't know if there are ways a web site could present the data so that it is not accessible this way.
Can you copy data from the web viewer and paste it into a text field? I'm not suggesting this as a workaround, but as a test to see whether the data shown is accessible; if it's a bitmapped image of the data, you may be out of luck.
Thank you very much for your help! With your suggestions I have finally written a script that copies the data from the web viewer and pastes it into a text field. Now my next task is to parse the exact data and paste it into the target field.
Following is the section from the Web Viewer object that contains the ZPID that I am seeking to parse:
<zpid>22267614</zpid>
However, when I view the source code that is pasted into the text field, the code looks like this:
<SPAN class="m"><</SPAN><SPAN class="t">zpid</SPAN><SPAN class="m">></SPAN><SPAN class="tx">22267614</SPAN><SPAN class="m"></</SPAN><SPAN class="t">zpid</SPAN><SPAN class="m">></SPAN>
Therefore, I'm guessing that I will have to add some lines to the script to isolate the ZPID. Can you direct me to an easy solution for this?
Isolating data from such a morass of HTML tags is quite a challenge, and you are at the mercy of the web page designers/managers: if they change their page's output format, your calculation could stop working.
In this case, the only numerical data is the zpid value: 22267614
If that's truly the only run of digits in the text you are scraping from the web page, then GetAsNumber ( TextField ) will extract the number and ignore all other data.
Filter ( TextField ; "9876543210" )
would also work.
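If the scraped text ever contains other digits, a positional parse between known markers is safer than pulling every digit. A sketch, assuming the field name Table::RawContent and that the text "zpid" appears exactly twice (once in the opening tag, once in the closing tag):

```
Let ( [
  raw   = Table::RawContent ;                  // full text scraped from the viewer
  start = Position ( raw ; "zpid" ; 1 ; 1 ) ;  // first occurrence: opening tag
  end   = Position ( raw ; "zpid" ; 1 ; 2 )    // second occurrence: closing tag
] ;
  // Grab everything between the two markers, then strip the tag debris.
  GetAsNumber ( Middle ( raw ; start ; end - start ) )
)
```

This works here because the SPAN class names ("m", "t", "tx") contain no digits, so GetAsNumber over that slice returns only the ZPID.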
Again, thank you very much for your help! I have successfully completed the script to retrieve the data from Zillow, parse the property ID from the data, and post it to the proper field!!
As you stated, the script step that worked for me is as follows:
GetAsNumber ( ParseData ( GetLayoutObjectAttribute ( "ZillowWebViewer" ; "content" ) ; "ZPID" ; "ZPID" ; 1 ))
What amazed me is that I don't need to copy the Web Viewer content to a text field to parse the data. Now I'm moving on to retrieving more data from Zillow!
I'd still use GetLayoutObjectAttribute to pull the data into a text field and then parse the data from it, for two reasons:
1) It makes it easy to examine the "raw content" for issues any time one of the text parsing operations fails, to see why it failed.
2) When extracting multiple items from the viewer, it should execute a touch faster, since the GetLayoutObjectAttribute function is called only once.
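Putting those two reasons together, the script might grab the content once into a variable, keep a copy, and then run each parse against the variable (the field names and the second parsed value are hypothetical; ParseData is the custom function already mentioned in this thread):

```
Set Variable [ $content ; Value: GetLayoutObjectAttribute ( "ZillowWebViewer" ; "content" ) ]
# Keep the raw content for troubleshooting when a parse fails.
Set Field [ Table::RawContent ; $content ]
Set Field [ Table::ZPID ; GetAsNumber ( ParseData ( $content ; "ZPID" ; "ZPID" ; 1 ) ) ]
Set Field [ Table::Zestimate ; ParseData ( $content ; "zestimate" ; "zestimate" ; 1 ) ]
```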
That is awesome! I have been wondering if there was a way to convert the html back into xml. That will make the process of parsing the data considerably easier - especially for text based data, which I am about to tackle when I query comparable properties.
Thank you very much!! I know that your custom function will be heavily used!
Using the text field to parse the data has already proven to be the right move! As I moved on to parsing the Zestimate, I was having some difficulty until I took a closer look at the source code. Thank you for that tip!
I didn't realize that calling the GetLayoutObjectAttribute function more than once would slow down the application. I had a major issue with the speed of my application in the past due to calculations, and I don't want to have a repeat of those problems.
I have already moved on to the next set of data to parse for this application. Thanks to your help, it has gone considerably faster and easier! I realize that I also need to code a loop to check for data in the Web Viewer instead of using a fixed wait with the Pause function. I have found that the first API call responds fairly quickly and consistently; however, the second call can take anywhere from 2 to 5 seconds. Additionally, not knowing the speed of the internet connection for any future users tells me that I would be better off scripting a loop to check for data in the Web Viewer.
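Such a loop might be sketched as follows (the object name, marker text, and retry limit are assumptions, not from the original posts):

```
Set Variable [ $tries ; Value: 0 ]
Loop
  # Exit as soon as the expected marker appears in the viewer content.
  Exit Loop If [ PatternCount ( GetLayoutObjectAttribute ( "ZillowWebViewer" ; "content" ) ; "zpid" ) > 0 ]
  # Give up after roughly 10 seconds so a dead connection can't hang the script.
  Exit Loop If [ $tries > 20 ]
  Set Variable [ $tries ; Value: $tries + 1 ]
  Pause/Resume Script [ Duration (seconds): .5 ]
End Loop
```

Checking for a specific marker rather than any content avoids exiting early on a partially loaded page.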
Again, thank you very much for your help - it has been invaluable!
To be honest, I haven't scraped data from the web viewer enough times to know how much of a performance hit might happen due to repeated calls to the function; the keep-a-copy-of-the-source-for-examination aspect is far more important. I just try to reduce the number of times an identical calculation with identical results is performed, as a basic "best practices" habit.