- How can I detect a UTF-16 Little-Endian encoded CSV file and convert it to UTF-8 so FileMaker can import it?
Quotation marks are only needed in CSV if the field contains a line-break, comma, or quotation mark (quotation marks should be escaped with an additional quotation mark). Is that the situation with this file? Otherwise, something else is going on...
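The quoting rule can be illustrated with Python's csv module (a neutral sketch, not FileMaker-specific): with QUOTE_MINIMAL, a field is quoted only when it contains the delimiter, the quote character, or a line break, and an embedded quote is doubled.

```python
import csv
import io

# QUOTE_MINIMAL quotes a field only when it contains the delimiter,
# the quote character, or a line break; embedded quotes are doubled.
buf = io.StringIO()
writer = csv.writer(buf, quoting=csv.QUOTE_MINIMAL)
writer.writerow(["plain", "has,comma", 'has"quote'])
print(buf.getvalue())  # plain,"has,comma","has""quote"
```
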
The fields contain none of the above. I cannot see in Sublime what character the line break or newline between records is. These files, like others I have seen that FileMaker has trouble importing, can be opened by Mac Numbers and saved as CSV or XLSX, but have the same problem regardless of format. They can be opened by FileMaker as a new database, but cannot be imported as either a CSV or XLSX file; the file imports as blank.
Can you attach an example of such file (preferably zipped) here?
If you use MBS FileMaker Plugin, you can use the CSV text functions there.
e.g. the FM.InsertRecordCSV function can import CSV text. You pass in the file name, the table name, the list of field names, and the text. The delimiter is normally auto-detected, but you can of course also provide it.
Disclaimer for community user agreement:
I may receive some form of compensation, financial or otherwise, from my recommendation or link.
Agreed, a sample would help.
I just created a sample CSV file with no quoted fields and that FMP had no problems at all importing.
I imported this file two ways:
1. Dragging the CSV icon onto the FMP icon.
2. Doing a File... Import Records... File within FMP.
The main problem I am having is that FileMaker will actually import the file, but will not show you the data after you specify the data source and try to set the import order, even if you try to flip through records to see any data. PS: How can I attach a CSV or ZIP file here? The forum does not allow me to attach these file types.
Here is what I see when I flip through records, zippo.
This does not directly address drag and drop, but I use the techniques in the attached file to import any type of file into FileMaker, CSV and others.
Perhaps you can find it useful. Works in OSX, Windows, and Filemaker Go.
Note: there is also a useful custom function for processing CSV values.
chuckburr wrote: PS: How can I attach a csv or zip file here. The forum does not allow me to attach these file types.
You cannot view or attach files in the "Inbox". Go to the thread here: Drag and Drop csv files into FMP - fix missing csv quotes , then add a new comment and click "Use advanced editor" at the top right, then "Attach" at the bottom right.
To attach files: When you are typing a response, at the top right, click on - Use advanced Editor -. Then you will get the attach a file functionality.
Here is a sample csv file. It does import, but the data is not viewable as you try to specify the import order.
PS: Yes, the darn thing has blank rows every other row, haha. Any suggestion on the easiest way to detect and remove them as well, besides sorting the import and deleting the blank rows?
This file may help you with drag and drop to a container. It uses Set Script Trigger to OnObjectModify to run a script.
This file, together with the one I posted above (Import Any File_13.fmp12), should allow you to do what you want.
There are many ways to eliminate blank lines from a CSV. Once you have the text of the CSV in a text field, you can run a script to replace the double carriage return [ Char(13) & Char(13) ] with a single carriage return [ Char(13) ].
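The same substitution can be sketched in Python (an illustration, assuming the classic Mac carriage-return line endings named above):

```python
# Collapse doubled carriage returns (which show up as "blank" rows)
# into single ones; the loop also handles runs of three or more.
def drop_blank_rows(text: str) -> str:
    while "\r\r" in text:
        text = text.replace("\r\r", "\r")
    return text

print(repr(drop_blank_rows("a,b\r\rc,d\r\r\re,f")))
```
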
Do you see anything funky with the file I attached above? It imports, but does not show field content when you try to set the field import order.
chuckburr, I would not use set field import order. Once the CSV text is in a text field, I would parse the row and column values to their proper locations. Remember, the file above (Import Any File_13.fmp12) uses a custom function to parse the values from a CSV file once the file text has been inserted into a text field.
I have a use case where the user chooses the data source via a value list, then imports their file. I set the script's import order for each data source. The same set of 8 or so values comes in a different order from the different sources. Think financial market trades: amount, price, timestamp, commission. The same fields, but in a different order depending on the source.
Again, what is unusual about the content of this file that I cannot see the data to set the field import order?
You can only use field import order when you are transferring from one table to another table. A csv file is not a table.
I am afraid you couldn't be more wrong about this. You can map source columns to target fields at will, with any type of file being imported. You can even let FMP do it automatically, if the first row contains field names.
You cannot see the data in Import Field Mapping, because FMP does not know how to parse your file. You say it is a CSV file - apparently FMP does not think so. We won't know why until we see an example of it. One thing we do know for sure: it is not because it is "missing quotes around the content" as suggested in your OP.
Rather than worry about how FileMaker does things or debate it, I just solved the problem.
To wit, I just wrote a quick method in my existing micro-service that lets you update a text file with:
1. Delimiters between fields.
2. Delimiters around fields (like a quote)
You can change either of the ones above to something else.
Note: this code is EXTREMELY fast. On a 500,000 line text file, it completes in less than 0.5 seconds!
Here's a sample text file, called Employees.csv:
Here's how to call the new micro-service logic in FileMaker (same logic could be called by other applications as well!) to do the heavy lifting (including locking the file we're writing to, a shared resource, for proper synchronization):
Updated Employees.csv file (named "Employees_updated.out.csv"):
And, of course, the utility will go the other direction, too:
Gives you your original file back.
So, I guess my point is, rather than debate the issue, why not just work around it?
In about an hour, I created a delimiter-updating program that gives you total flexibility. This is the kind of utility most database front-ends, like Navicat, give you out of the box. But, hey, no big deal. With a little work, it's now working and can be called from any HTTP-enabled program! Also, since it didn't add value to the sample solution, I didn't automate the trivial task of adding a space for any empty field in this example.
“Don’t document the problem, fix it.”
– Atli Björgvin Oddsson
When I open your CSV in BBEdit, this is what I see:
This isn't a well-behaved text file.
The problem with your file is its unusual encoding. It is UTF-16 Little Endian, and FMP has some difficulty with it. Still, it can be imported just fine. The problem with the "empty" rows (which aren't really empty) can be solved using field validation - see the attached demo.
Looks like it's UTF-16 (?) perhaps. What does your text editor say?
A small HEX-DUMP gives me this:
0000: 4F 00 72 00 64 00 65 00 72 00 55 00 75 00 69 00 O.r.d.e.r.U.u.i.
0010: 64 00 2C 00 45 00 78 00 63 00 68 00 61 00 6E 00 d.,.E.x.c.h.a.n.
0020: 67 00 65 00 2C 00 54 00 79 00 70 00 65 00 2C 00 g.e.,.T.y.p.e.,.
0030: 51 00 75 00 61 00 6E 00 74 00 69 00 74 00 79 00 Q.u.a.n.t.i.t.y.
0040: 2C 00 4C 00 69 00 6D 00 69 00 74 00 2C 00 43 00 ,.L.i.m.i.t.,.C.
0050: 6F 00 6D 00 6D 00 69 00 73 00 73 00 69 00 6F 00 o.m.m.i.s.s.i.o.
0060: 6E 00 50 00 61 00 69 00 64 00 2C 00 50 00 72 00 n.P.a.i.d.,.P.r.
0070: 69 00 63 00 65 00 2C 00 4F 00 70 00 65 00 6E 00 i.c.e.,.O.p.e.n.
0080: 65 00 64 00 2C 00 43 00 6C 00 6F 00 73 00 65 00 e.d.,.C.l.o.s.e.
0090: 64 00 0A 00 0A 00 38 00 64 00 32 00 37 00 67 00 d.....8.d.2.7.g.
00A0: 64 00 61 00 32 00 2D 00 63 00 62 00 62 00 32 00 d.a.2.-.c.b.b.2.
00B0: 2D 00 34 00 39 00 31 00 37 00 2D 00 61 00 36 00 -.4.9.1.7.-.a.6.
00C0: 31 00 33 00 2D 00 63 00 31 00 36 00 39 00 35 00 1.3.-.c.1.6.9.5.
00D0: 30 00 66 00 33 00 61 00 36 00 66 00 32 00 2C 00 0.f.3.a.6.f.2.,.
00E0: 42 00 54 00 43 00 2D 00 45 00 54 00 48 00 2C 00 B.T.C.-.E.T.H.,.
00F0: 4C 00 49 00 4D 00 49 00 54 00 5F 00 42 00 55 00 L.I.M.I.T._.B.U.
The '00' is the byte that you see displayed as the '¿' (option+shift+/ on the macOS keyboard).
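A quick way to reproduce what the dump shows: plain ASCII text encoded as UTF-16LE interleaves every character byte with a 0x00 byte.

```python
# Each ASCII character in UTF-16LE becomes <char byte> <0x00>.
data = "OrderUuid".encode("utf-16-le")
print(data.hex(" "))  # 4f 00 72 00 64 00 65 00 72 00 55 00 75 00 69 00 64 00
```
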
This Text file (.CSV) would be a problem with most anything trying to read it.
Even my Excel throws this error:
Excel does not recognize this file format
This file may have been created in an application that is
not compatible with Excel. To save the file in a different
format, click Cancel, open the file in the application that
it was created in, and then save it in a format that is
compatible with Excel, such as Comma Separated Values (.csv).
If you still want to open the file in its original format,
click OK, and then follow the instructions on the screen
to import the file.
NOTE: I can "open" by clicking the OK in Excel, but each "line" fills into a single column in Excel
It took a little playing, but I saved the file in Excel as '.txt'. That then seemed to import into FileMaker just fine.
I re-opened the file in BBEdit using Little Endian, and saved it off as another CSV.
That file then imports with no problems.
(Converting UTF-16 to UTF-8 is also easy to do programmatically.)
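For example, in Python the whole conversion is a few lines (a sketch; the no-BOM fallback to little-endian is an assumption matching the file in this thread):

```python
# Convert a UTF-16 file to UTF-8. A BOM, if present, tells us the
# byte order; otherwise assume little-endian, as with the file here.
def utf16_to_utf8(src_path, dst_path):
    with open(src_path, "rb") as f:
        raw = f.read()
    if raw.startswith(b"\xff\xfe") or raw.startswith(b"\xfe\xff"):
        text = raw.decode("utf-16")       # BOM-aware decode
    else:
        text = raw.decode("utf-16-le")    # assumption: little-endian
    with open(dst_path, "w", encoding="utf-8", newline="") as f:
        f.write(text)
```
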
Confirmed. In BBEdit, Reopen Using Encoding: Unicode (UTF-16 Little-Endian), then saving back as UTF-8, allows Excel to open the file correctly! As well as FileMaker Pro 17.
Here is the business problem: I have lay-user customers who will be importing their own files, so UTF-16 and UTF-8 are over their heads. How can we detect UTF-16 in a script and convert it to UTF-8 on the fly, so they just think it works magically?
What version of FileMaker Pro, chuck?
Are you importing into a container or a single field?
By programmatically converting UTF-16 to UTF-8 do you mean micro service vs. FM scripting?
Can we detect UTF-16 encoding in a FM script and at least reject it, asking the user to save their file as UTF-8?
I am rolling with FM Advanced 188.8.131.52
Michael Horak, Your Comment:
"I am afraid you couldn't be more wrong about this. You can map source columns to target fields at will, with any type of file being imported. You can even let FMP do it automatically, if the first row contains field names."
I stand corrected, your comment is very correct !!!
Do you 'know' that the first 'word' of the file is OrderUuid each time??
BE_SetTextEncoding ( "UTF-8" )
BE_SetTextEncoding ( "UTF-16BE" )
BE_SetTextEncoding ( "UTF-16LE" )
before BE_FileReadText ( "path/to/file" )
Test: If ( LeftWords ( line ; 1 ) = "OrderUuid" )
in this case the third will give you OrderUuid,Exchange,Type,Quantity,Limit,CommissionPaid,Price,Opened,Closed as the first line...
so Substitute ( theData ; ¶ & ¶ ; ¶ )
Then export this data as a known path/file and carry on with your import
substitute == [ Char(13) & Char(10) ; ¶ ]
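The same try-each-encoding-and-check-the-header idea can be sketched in Python (the candidate list and the expected first field name are assumptions based on this thread's file):

```python
# Try each candidate encoding and keep the first one whose decoded
# header starts with the expected first field name.
def sniff_and_decode(raw: bytes, expected_first_field="OrderUuid"):
    for enc in ("utf-8-sig", "utf-16", "utf-16-le", "utf-16-be"):
        try:
            text = raw.decode(enc)
        except UnicodeDecodeError:
            continue
        if text.lstrip().startswith(expected_first_field):
            return text, enc
    raise ValueError("no candidate encoding produced the expected header")
```
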
chuckburr, … I understand you may have solved this issue. Great to hear; I am sure it was an awesome learning adventure. But here is another solution: an FM file that imports your UTF-16LE file and removes the characters of the UTF-16LE encoding from the text. This is performed entirely within FM, without a micro-service. It also uses a custom function that can be utilized for parsing the CSV data; the custom function is taken from the Brian Dunning custom function website.
Would this website aid you in converting? (Assuming you know the file is UTF-16.)
Cool, thank you. The files are from the general public, so we do not know the encoding in advance. We request UTF-8, but most lay users have no idea what their file's encoding is.
chuckburr, just for more info on this: the FM files I have provided above rely on exporting a file from a container field to the temp folder, using a file name ending in .txt, and then using Insert from URL to place the content of the temp-folder file into a text field.
To keep it short: using FM Go, UTF-16 does not seem to be an issue. Once the file is inserted into a text field from the temp folder using Insert from URL, and then exported using Export Field Contents, it appears to be UTF-8. So this may be another way to do the conversion from UTF-16.
The enclosed file works in OSX, Windows and iOS. The attached file also uses parsing of the CSV data to create the table, using the custom function csv_GetValue.
Correct! FMGo has the Export Field Contents (from a Text Field) as UTF-8 by default. The issue occurs only from the Text Field in FileMaker Pro (UTF-16). If using a Container (where the text 'file' has been inserted) the Export Field Contents will preserve whatever encoding and line ending have been applied to the text. This is why we use the Base64 method(s) or the TextEncode() method to PUSH THE TEXT FIELD CONTENTS TO A CONTAINER FIELD. Then the Export Field Contents is ok (as you desire).
Caveat: this is not supported on FileMaker Server, of course!
PLEASE VOTE FOR consistency and support on all versions and platforms for Export Field Contents !!
Just one more: I was searching through the Brian Dunning FileMaker Custom Functions site for something else, and discovered: https://www.briandunning.com/cf/1652
This function strips out some lower and some upper ASCII characters from a text. Yep, it cleans up, as in removes, the UTF-16(LE) characters from a text. Then one can process and/or export a clean (converted) UTF-8 file, all with native FM.
No, the two are unconnected. File encoding is file encoding, no matter which characters it contains.
That is as it is, but if the objective is to work with the file that the OP provided for his data (UTF-16LE), it does the job of removing the undesirable characters once the file text is captured, and without going external to FM.
Although I didn't see where Chuck's interesting posting stated the conversion needed to happen without 'going external to FM', I would like to know, having been watching this thread, what Chuck ended up doing to solve his problem. Several solutions have been presented. Of course I'm partial to micro-services as this entire class of problems was already solved in a method I created to convert any kind of binary document to text (using a free widely-used Apache library). UTF-16 was just one more file type that was convertible.
Then, whether to populate a simple text field in FMP from that UTF-16 file (or any other binary content), or just create a UTF-8 file from it, or whatever, is totally doable and programmable, up to the developer (not a third party). The quick example I came up with both took the UTF-16 and inserted it into a regular text field, and re-wrote the same file in the same directory where it was read. Best yet, for those of us who work in heterogeneous environments where multiple software clients need to leverage the code we write (that is, not just FMP), that same micro-service code works with any HTTP-enabled application. (But I digress...)
So Chuck, please tell us how you ended up solving this issue.
One day soon I will have to get into learning about micro-services. In the meantime, with regard to importing CSV files encoded as UTF-16LE: there seems to be an unprintable character (usually displayed as a ?) between the printable characters of the comma-separated values. Though a text field does not show it, a WebViewer in OSX will show that the unprintable character is there. A WebViewer in FM Go will not show it, but the character is still there.
It is relatively simple to extract that unprintable character and remove it, then place the edited text in a text field. The edited text field can then be exported to the temp folder as CSV, and inserted from the temp folder into a container as CSV. This edited file in the container is now well behaved for processing.
Attached is a file showing the technique for newbies.
Quick note, this is still on our dev list, will keep you posted.
Side note, it would be nice if FileMaker split its development efforts: 1) Roadmap 2) Features. It's expensive and frustrating to waste so much time on many things that should just be easy, automatic or built in. Thanks everyone for their input.
chuckburr wrote: It's expensive and frustrating to waste so much time on many things that should just be easy, automatic or built in.
I totally agree.