8 Replies Latest reply on Aug 29, 2014 5:33 AM by jimlongo

    importing UTF-8 csv

    jimlongo

      Summary

      importing UTF-8 csv

      Product

      FileMaker Pro

      Version

      12

      Operating system version

      Mac OS X 10.7

      Description of the issue

      When importing a CSV file that is encoded as UTF-8, accented characters are imported incorrectly.

      Case in point: Véronique is imported as V√©ronique.

      The file must be converted to UTF-8 with BOM (byte order mark) to be imported correctly.

      Workaround

      File must be UTF-8 with BOM (byte order mark)
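
A minimal sketch of the workaround above, in Python rather than the poster's server-side code (which is not shown in this thread). The helper name `write_csv_with_bom` and the sample rows are hypothetical; the only library behaviour relied on is that Python's `utf-8-sig` codec prepends the three BOM bytes when encoding.

```python
import csv
import io


def write_csv_with_bom(rows):
    """Return CSV data as bytes encoded as UTF-8 with a leading BOM.

    Hypothetical helper for illustration only.
    """
    text = io.StringIO(newline="")
    csv.writer(text).writerows(rows)
    # "utf-8-sig" writes the byte order mark EF BB BF before the data.
    return text.getvalue().encode("utf-8-sig")


data = write_csv_with_bom([["first_name"], ["Véronique"]])
print(data[:3].hex())  # efbbbf
```

Whether the BOM is actually needed depends on how the importing application picks its character set, so this is a stopgap rather than a fix for the import setting itself.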

        • 1. Re: importing UTF-8 csv
          TSGal

               jimlongo:

               Thank you for your post.

               I am unable to replicate the issue.

               I have copied the text V√©ronique and pasted it into TextEdit.  I saved the file as Names.csv.  I then imported Names.csv as Unicode (UTF-8), and it is imported as Véronique.

               Can you let me know where this file came from?  Was it created on a Mac or Windows computer?  What application created the CSV file?

               TSGal
               FileMaker, Inc.

          • 2. Re: importing UTF-8 csv
            jimlongo

                 Thanks, it's the other way around.

                 The file is created by a PHP file on the server and downloaded as filename.csv.

                 If I inspect it with BBEdit, it says it is UTF-8 without BOM.  In the file it will have Véronique.  After importing into FileMaker Pro 12 it will have V√©ronique.

                 If I change the file in BBEdit (or if I add the headers during creation in the PHP file) so that it is UTF-8 with BOM, then Véronique gets imported into FileMaker as Véronique.

                  

                 Thanks.

            • 3. Re: importing UTF-8 csv
              TSGal

                   jimlongo:

                   Thank you for the additional explanation.

                   I have forwarded your post to our Development and Testing departments for review.  When I receive any feedback, I will let you know.

                   TSGal
                   FileMaker, Inc.

              • 4. Re: importing UTF-8 csv
                TSGal

                     jimlongo:

                     One of our testers asked what the Character Set pop-up in the Import Field Mapping dialog is set to.  CSV?  If FileMaker Pro is guessing the character set (wrongly), either set it explicitly or use a byte order mark.  Otherwise, you are giving FileMaker a text file with no cues as to the encoding.

                     TSGal
                     FileMaker, Inc.
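
To illustrate what a byte order mark offers as an encoding cue, here is a small Python sketch that checks whether raw file bytes start with the UTF-8 BOM. It says nothing about how FileMaker Pro itself detects encodings; `has_utf8_bom` is a hypothetical name used only for this example.

```python
import codecs


def has_utf8_bom(raw: bytes) -> bool:
    """Return True if the data begins with the UTF-8 byte order mark."""
    # codecs.BOM_UTF8 is the three-byte sequence EF BB BF.
    return raw.startswith(codecs.BOM_UTF8)


plain = "Véronique".encode("utf-8")   # no cue about the encoding
marked = codecs.BOM_UTF8 + plain      # same text, with the cue
print(has_utf8_bom(plain), has_utf8_bom(marked))  # False True
```

Without the marker, a reader of the file has only the bytes themselves to go on, so the character set has to be chosen explicitly or guessed.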

                • 5. Re: importing UTF-8 csv
                  jimlongo

                       Sorry for the delay.  

                       The Character Set pop-up is always set to Unicode (UTF-8); there is no option for CSV.

                        

                  • 6. Re: importing UTF-8 csv
                    TSGal

                         jimlongo:

                         Apologies for my last post.  "CSV" is not a character set.  I wanted to make sure the Character Set was specifically set to Unicode (UTF-8), which it is.

                         I have sent the screen shot back to Development and Testing.

                         TSGal
                         FileMaker, Inc.

                    • 7. Re: importing UTF-8 csv
                      TSGal

                           jimlongo:

                           Our testers would like a copy of the CSV file.  Please check your inbox at the top of this page for instructions on where to send the file.

                           TSGal
                           FileMaker, Inc.

                      • 8. Re: importing UTF-8 csv
                        jimlongo

                             Hi, I discovered the operator error causing this issue.

                             The files were apparently being imported with the wrong Character set (Macintosh). This will cause odd character mappings.

                             Sending FileMaker UTF-8 files with BOM forces the import to select UTF-8 as the character set. This fixed the problem and made it seem as if the character set of the files was the issue.

                             However, files with no BOM now also import correctly, because the Character set retains the previous setting (UTF-8) on import.

                             The solution is to make sure you are using the correct Character set on import.
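
The odd character mapping described in this thread can be reproduced with a short Python sketch. It assumes that the "Macintosh" character set corresponds to the Mac OS Roman encoding (Python's `mac_roman` codec); that mapping is an assumption and is not confirmed anywhere in the thread.

```python
original = "Véronique"

# The file on disk holds UTF-8 bytes: "é" is stored as C3 A9.
raw = original.encode("utf-8")

# Reading those bytes as Mac OS Roman (assumed equivalent of the
# "Macintosh" import setting) turns C3 into "√" and A9 into "©".
misread = raw.decode("mac_roman")
print(misread)  # V√©ronique

# Reversing the wrong decode recovers the text, which shows the file
# itself was fine and only the import setting was wrong.
repaired = misread.encode("mac_roman").decode("utf-8")
print(repaired == original)  # True
```

This matches the symptom reported at the top of the thread and is consistent with the conclusion that the character set chosen at import, not the file, was at fault.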