Oddly enough, I've dealt with Sonic's API before while doing some fooling around. They have a melody analyzer API as part of it:
Now, I'm not saying it's easy, but it SHOULD be possible. It would probably be much easier if you wrote a sync that pushed the recordings to a desktop version of a "processing" database. The rough steps would be:
1) Record samples in the field, push recording data to the desktop version's "recordings" table.
2) Write a process for uploading the file via the API. ScriptMaster, SmartPill, FTPeek or MonkeyBread are plugins that could possibly do this.
3) Apply the analysis from Sonic to convert the recording to a pitch contour.
4) Download the processed result code, and parse it into a data block in FileMaker.
5) Create a comparison score calculation in FileMaker that sort of checksums the sample against a pre-existing database of data-blocked samples.
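As a rough illustration of step 5, here is a minimal sketch of a comparison score, written in Python rather than as a FileMaker calculation. It assumes each pitch contour has already been parsed into a plain list of frequencies in Hz; the function names, the reference contours and the scoring method (mean semitone difference after resampling) are all my own assumptions, not anything Sonic's API defines.

```python
import math

def to_semitones(contour_hz):
    """Convert Hz to semitones relative to 440 Hz (log scale, like hearing)."""
    return [12 * math.log2(f / 440.0) for f in contour_hz]

def resample(values, n):
    """Linearly interpolate a list to exactly n points so lengths match."""
    if len(values) == 1:
        return [values[0]] * n
    out = []
    for i in range(n):
        pos = i * (len(values) - 1) / (n - 1)
        lo = int(pos)
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out

def contour_distance(a_hz, b_hz, n=32):
    """Mean absolute difference in semitones; 0.0 means identical contours."""
    a = resample(to_semitones(a_hz), n)
    b = resample(to_semitones(b_hz), n)
    return sum(abs(x - y) for x, y in zip(a, b)) / n

def best_match(sample_hz, reference):
    """Return the name of the reference contour closest to the sample."""
    return min(reference, key=lambda name: contour_distance(sample_hz, reference[name]))

# Hypothetical reference "database": two made-up contours, not real bird data.
reference = {
    "rising": [1000, 1200, 1500, 2000],
    "falling": [2000, 1500, 1200, 1000],
}
match = best_match([1050, 1250, 1450, 1900], reference)
```

The same idea could be ported to a FileMaker calculation once the contours are stored as value lists. A real system would almost certainly need something more tolerant of timing differences than straight resampling.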
This is ambitious; just the sample database to compare against is a monumental task. But it *should* be possible.
If you do this, I would most likely be your first customer. I know a few birders.
I'll fast forward to the challenging aspect of creating something like what uv described.
If I understand correctly, he wants to input sound recordings (clips) captured in the wild and wants to build some logic which would analyze and ID the specific animal he is targeting, presumably in hopes of linking it to various attributes about the environs / event which would also be captured / logged.
If he just wanted to be able to flag audio files manually and use that flag to key the related data (e.g. IsTargetSound=1), you would be in easy territory. But you want the waveform to be translated into some data which could then be calculated against to automatically determine whether a match exists.
If that is the case, I would say you're looking at:
Field audio captured on the go.
The resulting audio clips become the payload of a script which, on the server side, pushes the clips to a waveform editor / analyzer that can return the data it has produced for each clip.
This data must be representative of an attribute which fingerprints the specific target you are after. There is a long list of candidates in the world of forensic audio, but keeping it simple, let's say that your target bird has 3 common frequency spikes which are generally found in its emitted sound. Your return data would then have a measure of these three frequencies, and you would test those numbers against the numbers expected for the target bird. If they match, you would set the audio clip's flag to 1.
In reality, you are probably looking at multiple conditions needing to be met to trigger a positive ID.
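To make the three-spike test concrete, here is a minimal Python sketch. It assumes the analyzer returns a list of peak frequencies in Hz for each clip; the function name, the expected peaks and the tolerance are hypothetical values picked for illustration, not measurements from any real bird.

```python
def matches_target(measured_peaks_hz, expected_peaks_hz, tolerance_hz=100.0):
    """Return 1 if every expected peak has a measured peak within tolerance, else 0."""
    for expected in expected_peaks_hz:
        if not any(abs(m - expected) <= tolerance_hz for m in measured_peaks_hz):
            return 0
    return 1

# Hypothetical fingerprint for the target bird: three frequency spikes (Hz).
TARGET_PEAKS_HZ = [2500.0, 4000.0, 5200.0]

# A clip with all three spikes (plus an unrelated one) is flagged 1.
flag_hit = matches_target([800.0, 2480.0, 3950.0, 5210.0], TARGET_PEAKS_HZ)

# A clip missing the 4000 Hz spike is flagged 0.
flag_miss = matches_target([2480.0, 5210.0], TARGET_PEAKS_HZ)
```

The returned value is what you would write into the clip's IsTargetSound flag. Additional conditions (duration, repetition rate and so on) could be added as further checks before returning 1.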
There's a start down the road I would look into. Best of luck!
One key to this is the Fast Fourier Transform (FFT), which transforms your time-dependent waveform into a frequency spectrum. Applied to successive short windows of the recording, it gives you a time-dependent frequency spectrum.
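To show what the transform gives you, here is a small Python sketch that computes a magnitude spectrum for one window of samples and picks the strongest frequency. It uses a naive discrete Fourier transform, which produces the same numbers as an FFT but far more slowly; it is a stand-in for illustration only, and the synthetic 1000 Hz tone and the function names are my own assumptions.

```python
import cmath
import math

def magnitude_spectrum(samples):
    """Naive O(n^2) DFT; returns magnitudes for bins 0..n/2."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags

def dominant_frequency(samples, sample_rate):
    """Frequency (Hz) of the strongest bin, ignoring the DC bin 0."""
    mags = magnitude_spectrum(samples)
    k = max(range(1, len(mags)), key=lambda i: mags[i])
    return k * sample_rate / len(samples)

# Synthetic test signal: a pure 1000 Hz sine sampled at 8000 Hz, 256 samples.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]
peak_hz = dominant_frequency(tone, sample_rate)
```

Each bin k corresponds to the frequency k * sample_rate / n, so a longer window gives finer frequency resolution but coarser time resolution. For real recordings you would use a proper FFT from a library.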
If you are adept at programming in OO languages, you should have a look at Processing (www.processing.org).
This language, originally made for computer artists, comes with a development environment and many examples. Bundled with it is the Minim audio library, which also has an FFT algorithm.
Version 1.5.1 can export a Java applet that can be run in a WebViewer.
Version 2.1.1 can only export a Java runtime.
In the File > Examples... menu of the Processing development environment, you will find under Libraries/minim two examples that demonstrate the use of the FFT algorithm: AnalyzeSound and SoundSpectrum.
I have attached two screenshots that show how tiny the code to calculate a spectrum is, and the result which the runtime produces from an MP3 sound file.
With some modification, you can filter the data down to the required frequency bands and export it as a file, which then can be processed in FileMaker.
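As a sketch of that filtering and export step, the Python below sums each spectrum's magnitudes into a few frequency bands and writes one CSV row per spectrum, which FileMaker can then import. The band limits, column names and function names are assumptions for illustration; the spectra are assumed to be lists of magnitudes for bins 0..n/2.

```python
import csv
import io

def band_energies(mags, sample_rate, fft_size, bands):
    """Sum the magnitudes of all bins whose frequency falls in each (lo, hi) band."""
    bin_hz = sample_rate / fft_size
    return [
        sum(m for k, m in enumerate(mags) if lo <= k * bin_hz < hi)
        for lo, hi in bands
    ]

def export_bands(spectra, sample_rate, fft_size, bands, out):
    """Write a header plus one row per spectrum to a file-like object."""
    writer = csv.writer(out)
    writer.writerow(["frame"] + [f"{lo}-{hi}Hz" for lo, hi in bands])
    for i, mags in enumerate(spectra):
        writer.writerow([i] + band_energies(mags, sample_rate, fft_size, bands))

# Toy data: one spectrum with 5 bins at 1 Hz spacing (sample_rate == fft_size).
spectra = [[0, 1, 2, 3, 4]]
bands = [(1, 3), (3, 5)]
buffer = io.StringIO()
export_bands(spectra, sample_rate=8, fft_size=8, bands=bands, out=buffer)
csv_lines = buffer.getvalue().splitlines()
```

In practice you would pass an open file instead of the in-memory buffer, for example `open("bands.csv", "w", newline="")`.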
Keep in mind that for a single recording you may get tens of thousands of spectra, depending on the sampling rate and the length of the recording.
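A quick back-of-the-envelope calculation shows where those numbers come from. The sketch below assumes a 44.1 kHz recording and a 1024-sample FFT window; both values and the function name are illustrative assumptions, and the hop size controls how much successive windows overlap.

```python
def spectrum_count(duration_s, sample_rate, fft_size, hop=None):
    """Number of full FFT windows that fit in a recording."""
    hop = hop or fft_size  # default: windows do not overlap
    total_samples = int(duration_s * sample_rate)
    if total_samples < fft_size:
        return 0
    return 1 + (total_samples - fft_size) // hop

one_minute = spectrum_count(60, 44100, 1024)             # no overlap
one_minute_overlap = spectrum_count(60, 44100, 1024, 512)  # 50% overlap
ten_minutes = spectrum_count(600, 44100, 1024)
```

With these assumptions one minute gives 2583 spectra without overlap and 5166 with 50% overlap, and ten minutes gives 25839. Each spectrum holds 513 values (fft_size / 2 + 1), which is why reducing to a few bands before import matters.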
I don't think Sonic's API will work for the recordings mentioned, since it specifies that the sound file must be monophonic (one single instrument only).
Extracting notes from polyphonic pieces, or from recordings that contain various voices and noises, is still very difficult.
Advanced software such as Celemony's Melodyne can do this to some extent for polyphonic instruments, e.g. a guitar or a piano, if the instrument is recorded per single channel.
Have a look at the very interesting movie on the inventor of Melodyne at http://www.celemony.com/en/melodyne/what-is-melodyne (section "What does a stone sound like?").
Thank you all for your thoughts and contributions. I figured it would be complex. It might be better to piggyback on software from someone who has already done the waveform analysis part and see what export it can spit out. I'll keep all this in mind, however.