By Peter Bell

CRUD for International Strings

I'm obviously not the expert at this, but I needed to allow editing of double byte characters and the like in CF. Here was what worked for me . . .


Comments
these are all part of what i've called good i18n practices. the BOM is optional by definition, so you can't rely on it always being there if your code, etc. is handled by something that plays fast & loose w/the BOM (cfstudio comes to mind). you're better off including setEncoding, cfcontent, etc. in your code. you're right that the metaheader has no influence on the cf server but including it is another good practice for forensic purposes ("this is what i intended the encoding to be") & there are plenty of brain dead apps that can't decipher a web page's encoding except by having their nose rubbed in it via metaheader encoding hinting (some screen readers for example).

no, the "N" unicode hint won't work w/mysql. it's a sql server thing.

if you haven't already, read the g11n chapter in ben's advanced cf book. pay attention to those boring tables, core java's locale support will bite your ankles off eventually. hang around the java i18n forums (they have been at this game a lot longer than any of us cf folks).
# Posted By PaulH | 5/14/07 1:39 PM
I am by no means an expert, but I believe if you specify in your db columns nvarchar, ntext, etc. then you don't need the DSN special config nor do you need to alter your SQL with the n prefix.

My understanding is that the DSN checkbox and prefixing is if you need to support multi byte and your db column types are already set. Or you don't want the extra space taken that is associated with nvarchar and the like.
# Posted By Joshua Cyr | 5/14/07 2:51 PM
Hi Joshua,

Actually, not true. You do need to use nvarchar, ntext and the like, but you ALSO have to add the "N" to your inserts/updates. I'm not SURE if you have to check the box as I didn't try to get rid of that, but I know that without the N (using MSSQL) it didn't work.
# Posted By Peter Bell | 5/14/07 4:12 PM
Interesting. I never use the N prefix and my double byte content works quite well. We also don't use the checkbox in the datasource. This is true also for any shared host thus far where we don't have control over the DB or datasource.
# Posted By Joshua Cyr | 5/14/07 4:17 PM
Hi Joshua, MS SQL? The N thing is MSSQL only, but without it, this wasn't working and with it, it all worked fine. According to Paul Hastings, the check box only relates to cfqueryparams.

http://www.sustainablegis.com/blog/cfg11n/index.cf...

Curiouser and curiouser!
# Posted By Peter Bell | 5/14/07 4:29 PM
Some clarification. I think cfqueryparam negates the need for the N prefix. Just reviewed my datasource and ran some tests. I use cfqueryparam with no prefix and no checkbox in my datasource, and I can copy, paste, save, etc. happily. Here is a sample page with Sanskrit, Chinese and Arabic for testing.

http://dev.besavvy.com/hi.cfm
# Posted By Joshua Cyr | 5/14/07 4:30 PM
Interesting! And that is on MSSQL?
# Posted By Peter Bell | 5/14/07 4:36 PM
Yup. just tested with CFMX 6.something and also with CFMX 7. Using both MS SQL 7 and MS SQL 2000. Happy to chat directly if you want at some point.

MySQL is its own pain. Some drivers need the connection string while others for some reason do not.
# Posted By Joshua Cyr | 5/14/07 4:39 PM
Interesting - cool to know. Sure I'll catch you about this offline some time!
# Posted By Peter Bell | 5/14/07 4:46 PM
joshua, you're forcing the sql server to do implicit conversion on your text data so there's a performance penalty to pay for not using "N" unicode hint in "normal" SQL. you're also going to eventually get into trouble if the data isn't unicode. you won't notice this for ASCII range (well maybe latin-1 as well) but you'll get burned if you need to use data that's outside that range. you might also have collation issues (unicode hinted data is a "unicode constant" otherwise sql server thinks it's "character constant" & applies code page rules to it).

no, you don't need to use (and i'm not sure you even could) the "N" unicode hint for cfqueryparam data BUT you do have to turn on the unicode option for that DSN for it to work consistently. folks using ray's blog i think will testify to the good practice of turning on this option.

one of the macromedia/adobe cf folks posted a good blog entry on this but i can't for the life of me wrinkle it out of google or mxna.
# Posted By PaulH | 5/14/07 10:57 PM
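For anyone who wants to see the "you won't notice this for ASCII range" effect without a database, here is a minimal Java sketch (CF sits on Java anyway). It is not SQL Server's actual implicit conversion, just the same kind of narrowing to a single-byte code page. The class name, the `roundTrip` helper and the sample strings are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CodePageDemo {
    // Encode the text into the given charset and decode it back, roughly
    // what happens when text passes through a non-Unicode (code page) value.
    // Characters the charset cannot represent are replaced with '?'.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        Charset latin1 = StandardCharsets.ISO_8859_1;
        String western = "caf\u00e9";     // fits inside Latin-1
        String chinese = "\u4f60\u597d";  // well outside Latin-1

        // Latin-1 range survives the code page, so nothing looks wrong yet.
        System.out.println(roundTrip(western, latin1).equals(western)); // true

        // Outside that range the characters are silently replaced.
        System.out.println(roundTrip(chinese, latin1).equals(chinese)); // false

        // A Unicode encoding keeps them intact.
        System.out.println(roundTrip(chinese, StandardCharsets.UTF_8).equals(chinese)); // true
    }
}
```

The point of the sketch is only that the damage is silent: Western text round-trips fine, so the problem tends to show up later when the first non-Latin data arrives.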
Gotcha. While I haven't run into performance or other issues I am sure it is just a matter of time. I will put into our install notes to be sure to check the box. Thanks Paul.
# Posted By Joshua Cyr | 5/15/07 9:56 AM