Saturday, March 13, 2010

Blogger to stop FTP publishing...

Whatever was once free is no longer free. After all, Google has to make money from something other than ads...
I have used Blogger for a while, probably since around the time it was acquired by Google.
I never liked hosting my blog on their servers though, and I still won't. Hosting on my own account gives me the freedom of owning anything I post, whether it is a rant, an image or just random IT knowledge.
Blogger has cut down features little by little: first the labels on the blog, then the number of available HTML/CSS themes, and now FTP publishing.
Bottom line: you either host with them or you'll eventually be unable to use them.

I guess the conversion to WordPress is mandatory now. No more procrastination for me.


PS. On the bright side, I'll be able to close my Gmail account; I never use it and it is full of spam. I won't force the kind people who leave comments to have a Gmail account, and I'll have full control over my ramblings...and the infrastructure that supports them.


Friday, March 12, 2010

To nvarchar or to varchar in SQL Server to accept French text

I found a single character in the French language whose binary representation is not the same in Unicode and Windows-1252: the oe ligature. Let me rant about it...

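-- Assumes a database whose default collation uses code page 1252
create table test_table
(name1 varchar (1) ,
name2 nvarchar (1) )

-- The N prefix marks the second literal as a Unicode string
insert test_table
values ('œ', N'œ')

select * from test_table

-- ASCII() gives the code page value (156), UNICODE() gives the code point (339)
select ASCII(name1), ASCII(name2), Unicode(name1), unicode(name2) from test_table

-- COL_LENGTH() reports bytes: 1 for varchar(1), 2 for nvarchar(1)
select COL_LENGTH('test_table','name1') as Length1, COL_LENGTH('test_table','name2') as Length2 from test_table

-- char() only accepts 0 to 255, so char(339) returns NULL; nchar(339) returns the ligature
select char(156), char(339), nchar(339)

drop table test_table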

This character is used in the word "eggs" in French
'des œufs' means some eggs...

The first two blocks of Latin characters, Basic Latin and Latin-1 Supplement, have identical character codes in Unicode and Windows-1252 (apart from the 128 to 159 range, which Windows-1252 fills with its own characters).

Their codes are the same because they all fit in a single byte, or octet.

However, characters from the Latin Extended blocks do not have the same code in Windows-1252 and Unicode: œ is 156 in Windows-1252 but 339 in Unicode.

From that group, French only uses the oe ligature though.
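
The same comparison can be sketched outside SQL Server. The Python snippet below is only an illustration: it assumes Python's built-in "cp1252" codec matches the server's code page 1252, and the helper name codes is made up for this example.

```python
# Compare the Windows-1252 byte value of a character with its Unicode code point.
# Assumption: Python's "cp1252" codec stands in for SQL Server's code page 1252.

def codes(ch):
    """Return (Windows-1252 byte value, Unicode code point) for one character."""
    return ch.encode("cp1252")[0], ord(ch)

for ch in "eéàçœŒ":
    cp1252_value, code_point = codes(ch)
    verdict = "same" if cp1252_value == code_point else "different"
    print(ch, cp1252_value, code_point, verdict)
```

Of the characters tried here, only the two ligatures come out different; the accented letters keep the same value on both sides.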

French accents and ligatures and how to type them with the number pad:
a with grave accent
à ALT + 133 À ALT + 0192

a with circumflex
â ALT + 131 Â ALT + 0194

a with tréma
ä ALT + 132 Ä ALT + 142

a e ligature
æ ALT + 145 Æ ALT + 146

c with cedilla
ç ALT + 135 Ç ALT + 128

e with acute accent
é ALT + 130 É ALT + 144

e with grave accent
è ALT + 138 È ALT + 0200

e with circumflex
ê ALT + 136 Ê ALT + 0202

e with tréma
ë ALT + 137 Ë ALT + 0203

i with circumflex
î ALT + 140 Î ALT + 0206

i with tréma
ï ALT + 139 Ï ALT + 0207

o with circumflex
ô ALT + 147 Ô ALT + 0212

o e ligature
œ ALT + 0156 Œ ALT + 0140

u with grave accent
ù ALT + 151 Ù ALT + 0217

u with circumflex
û ALT + 150 Û ALT + 0219

u with tréma
ü ALT + 129 Ü ALT + 154

French quotation marks
« ALT + 174 » ALT + 175

Euro symbol
€ ALT + 0128
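
The list mixes two kinds of codes. As far as I know, ALT plus a number without a leading zero is looked up in the OEM (DOS) code page, while ALT plus a number with a leading zero is looked up in the Windows code page. The Python sketch below checks a few entries under that assumption; it takes code page 437 as the OEM code page, which depends on the system, and alt_code is a hypothetical helper name.

```python
# Assumption: ALT + nnn uses the OEM code page (cp437 here, system dependent),
# while ALT + 0nnn uses the Windows code page (cp1252).

def alt_code(keys):
    """Return the character an ALT code would type, e.g. "133" or "0192"."""
    codec = "cp1252" if keys.startswith("0") else "cp437"
    return bytes([int(keys)]).decode(codec)

for keys in ["133", "0192", "130", "0156", "0140", "174", "0128"]:
    print("ALT +", keys, "->", alt_code(keys))
```

That is why œ needs the leading zero: 0156 is its Windows-1252 value, the same 156 that ASCII() returns in the query above.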

The Windows-1252 encoding can be seen here:

For more, see Joel on Software rant :-p

As with most questions in technology, the answer is: it depends.

Performance-wise, varchar is more efficient: it takes less memory (one byte per character instead of two) and gives 20% to 30% smaller indexes.
Most database drivers will interpret the incoming stream and convert it to Windows-1252 encoding if the server code page is Windows-1252.
If you use many characters from the Latin Extended group that Windows-1252 does not include, you have no choice but to use nvarchar. The same goes if you deal with languages whose characters fall outside Windows-1252.
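
To get a feel for that decision, here is a rough Python sketch that checks whether a piece of text survives in a code page 1252 varchar column and how many bytes each type would need. It assumes the column uses a code page 1252 collation and that nvarchar stores two bytes per character for these samples; both function names are invented for the example.

```python
# Assumption: the varchar column uses a code page 1252 collation, and nvarchar
# stores UTF-16 (two bytes for each character used in these samples).

def fits_varchar_1252(text):
    """True if every character of text exists in Windows-1252."""
    try:
        text.encode("cp1252")
        return True
    except UnicodeEncodeError:
        return False

def storage_bytes(text):
    """Return (varchar bytes, or None if it does not fit, nvarchar bytes)."""
    varchar = len(text.encode("cp1252")) if fits_varchar_1252(text) else None
    return varchar, len(text.encode("utf-16-le"))

for sample in ["des œufs", "Łódź", "țară"]:
    print(sample, fits_varchar_1252(sample), storage_bytes(sample))
```

The French sample fits in varchar at half the size, while the Polish and Romanian samples do not fit at all, which is the "no choice but nvarchar" case.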



Thursday, March 04, 2010

I couldn't stop laughing...