Re: Cannot write utf8 data into a utf8 column - SOLVED [message #170756 is a reply to message #170752] Fri, 19 November 2010 22:48
Thomas 'PointedEars'
Tony Marston wrote:

> "Peter H. Coffin" <hellsop(at)ninehells(dot)com> wrote in message
> news:slrniebqgp(dot)1g0(dot)hellsop(at)abyss(dot)ninehells(dot)com...

It is called attribution _line_, not attribution novel.

>> ["Followup-To:" header set to comp.databases.mysql.]
>> On Thu, 18 Nov 2010 16:54:47 -0000, Tony Marston wrote:
>>> [garbled posting due to missing character encoding declaration]
>>>
>>> it fails with the following error:
>>>
>>> "Incorrect string value: '\xA0\xA0 \xE6\x88\x91...' for column
>>> 'message_text' at row 1"
>>>
>>> When I try the SAME update through SQL-Front or phpMyAdmin it works! Why
>>> is this?

Those frontends are apparently set up to use UTF-8 as the input character
encoding, so whenever you type/paste a character there, it is regarded as the
corresponding Unicode character upon form submission.

PHP, on the other hand, treats string literals as the raw bytes of the source
file: if you create the source file with, e.g., notepad.exe on Windows, it
will most likely be saved in a Windows-125x character encoding by default.

>> If it were the SAME, it would work the same. MySQL gives this error when
>> there's invalid UTF-8 byte sequences, like a continuation byte without
>> starting byte, or a starting byte that is not followed by a continuation
>> byte... Find out what the hex for what you're trying to stick in
>> message_text and I bet it won't be what it should be.
>
> The error message was reporting a problem with the hex value \xA0 (decimal
> 160) which represents '&nbsp;' or the non-breaking space.

Only in ISO-8859-1/Windows-1252. In UTF-8, A0 is one of the aforementioned
continuation bytes. See below.
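
A minimal sketch of that point, not taken from the original thread: the
helper names utf8_byte_role() and is_valid_utf8() are hypothetical, and $bad
holds only the bytes visible in the quoted error message (the part elided by
"..." is not reproduced). It classifies a byte by its role in UTF-8 and
checks a whole string for well-formedness with PCRE's /u modifier.

```php
<?php
// Classify a single byte value (0..255) by its role in UTF-8.
function utf8_byte_role(int $byte): string
{
    if ($byte <= 0x7F) return 'ascii';          // 0xxxxxxx
    if ($byte <= 0xBF) return 'continuation';   // 10xxxxxx
    if ($byte <= 0xC1) return 'invalid';        // would start an overlong form
    if ($byte <= 0xDF) return 'lead-2';         // 110xxxxx
    if ($byte <= 0xEF) return 'lead-3';         // 1110xxxx
    if ($byte <= 0xF4) return 'lead-4';         // 11110xxx
    return 'invalid';                           // 0xF5..0xFF never occur
}

// True if $s is a well-formed UTF-8 byte sequence.
// preg_match() with the /u modifier returns false on malformed input.
function is_valid_utf8(string $s): bool
{
    return preg_match('//u', $s) === 1;
}

$bad  = "\xA0\xA0 \xE6\x88\x91";          // visible bytes from the error message
$good = "\xC2\xA0\xC2\xA0 \xE6\x88\x91";  // same text with each A0 preceded by C2

echo utf8_byte_role(0xA0), "\n";   // role of the byte MySQL complained about
echo utf8_byte_role(0xC2), "\n";   // role of the byte that was missing
var_dump(is_valid_utf8($bad));
var_dump(is_valid_utf8($good));
```

Running a check like is_valid_utf8() on the data before it goes into the
query is one way to find out "what the hex is" without waiting for MySQL to
reject it.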

> I discovered that instead of replacing '&nbsp;' with chr(160) that I
> needed to replace it with chr(194).chr(160). I don't know why the chr(194)
> is necessary, but it solves my problem.

(Ignorance must be bliss.) If you want to grow beyond the script-kiddie
trial-and-error approach (which is seldom successful in programming), you
should want to learn *how* and *why* things work.

chr(194) . chr(160)

"works" *because* chr(194) returns the _ISO-8859-1/Windows-1252_ character at
code point 0xC2¹ ─ Â ─, and chr(160) returns the ISO-8859-1/Windows-1252
character at code point 0xA0 ─ <NBSP> ─. Concatenated, they form `Â<NBSP>',
which is the ISO-8859-1/Windows-1252 representation of the byte sequence

C2 A0

In UTF-8, this is a sequence consisting of two UTF-8 code units (hence
UTF-*8*: 8 bits, or 1 byte, per code unit), encoding the character at
Unicode code point U+00A0 (NO-BREAK SPACE). (C2 is one of the start bytes
of a 2-byte sequence.)
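
To make the arithmetic behind C2 A0 concrete, here is a small sketch (the
function name utf8_encode_2byte() is hypothetical, not a PHP built-in). It
applies the two-byte UTF-8 pattern 110xxxxx 10xxxxxx, which covers code
points U+0080 through U+07FF, to U+00A0.

```php
<?php
// Encode a code point in U+0080..U+07FF as two UTF-8 code units:
//   lead byte         110xxxxx  carries the upper 5 bits
//   continuation byte 10xxxxxx  carries the lower 6 bits
function utf8_encode_2byte(int $cp): string
{
    if ($cp < 0x80 || $cp > 0x7FF) {
        throw new InvalidArgumentException('code point outside the 2-byte range');
    }
    $lead = 0xC0 | ($cp >> 6);     // 0xA0 >> 6   = 0x02 -> 0xC2
    $cont = 0x80 | ($cp & 0x3F);   // 0xA0 & 0x3F = 0x20 -> 0xA0
    return chr($lead) . chr($cont);
}

$nbsp = utf8_encode_2byte(0x00A0);
echo bin2hex($nbsp), "\n";                  // c2a0
var_dump($nbsp === chr(194) . chr(160));    // bool(true)
```

So the "mysterious" chr(194) is simply the lead byte that carries the upper
bits of U+00A0; the 0xA0 that follows it is a continuation byte that only
happens to have the same value as the ISO-8859-1 code for NBSP.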

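So you should have RTFM and, provided the query string really is entirely
ISO-8859-1 (which is what utf8_encode() assumes as input), called the
equivalent of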

mysql_query(utf8_encode($query));

instead. See also:

<http://unicode.org/faq/>
<http://en.wikipedia.org/wiki/UTF-8>
<http://people.w3.org/rishida/tools/conversion/>
<http://php.net/chr>
<http://php.net/utf8_encode>
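
What such a conversion does to a whole string can be sketched in plain PHP.
The function latin1_to_utf8() below is a hypothetical stand-in that mirrors
the documented behaviour of utf8_encode() (every input byte is taken as one
ISO-8859-1 character); it is shown only for illustration, and the sample
strings are made up. utf8_encode() itself is deprecated as of PHP 8.2.

```php
<?php
// Convert a string from ISO-8859-1 to UTF-8, byte by byte.
// Bytes below 0x80 are ASCII and pass through unchanged; every other byte
// is a code point in U+0080..U+00FF and becomes a two-byte sequence.
function latin1_to_utf8(string $s): string
{
    $out = '';
    $len = strlen($s);
    for ($i = 0; $i < $len; $i++) {
        $b = ord($s[$i]);
        $out .= $b < 0x80
            ? $s[$i]
            : chr(0xC0 | ($b >> 6)) . chr(0x80 | ($b & 0x3F));
    }
    return $out;
}

// ISO-8859-1 input: "a", NBSP, "b"
$latin1 = "a\xA0b";
echo bin2hex(latin1_to_utf8($latin1)), "\n";        // 61c2a062

// Input that is ALREADY UTF-8 (U+6211) gets encoded a second time.
$alreadyUtf8 = "\xE6\x88\x91";
echo bin2hex(latin1_to_utf8($alreadyUtf8)), "\n";   // c3a6c288c291
```

The second call shows why the conversion must only be applied to data that
is known to be ISO-8859-1: bytes that are already part of a UTF-8 sequence
are re-encoded one by one and come out as mojibake.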


HTH

PointedEars
___________
¹ Neither character is part of ASCII, despite the PHP manual stating that
--
Danny Goodman's books are out of date and teach practices that are
positively harmful for cross-browser scripting.
-- Richard Cornford, cljs, <cife6q$253$1$8300dec7(at)news(dot)demon(dot)co(dot)uk> (2004)