Re: Unicode support [message #180993 is a reply to message #180992] Sun, 31 March 2013 12:41
The Natural Philosopher
On 31/03/13 13:10, M. Strobel wrote:
> On 30.03.2013 19:04, The Natural Philosopher wrote:
>> On 30/03/13 14:55, Christoph Becker wrote:
>>> The Natural Philosopher wrote:
>>>
>>>> On 30/03/13 13:25, Thomas 'PointedEars' Lahn wrote:
>>>>
>>>> > The same issue exists with characters outside the BMP in ECMAScript
>>>> > implementations which use 16-bit characters (usually one UTF-16 code
>>>> > unit per character). But you can work around that rather efficiently.
>>>> >
>>>>
>>>> The problem becomes 'what do you mean by strlen()' - the space the
>>>> characters will occupy in a constant-width font, or the storage
>>>> allocated to the string?
>>>>
>>>> Mostly we are concerned with the latter.
>>>
>>> I am more concerned about the number of characters the string holds.
>>> Say, I want to get the last character:
>>>
>>> $str = '€';
>>> echo $str[2];
>>>
>>>> Because the lack of precision in font reproduction, or even in guaranteeing
>>>> which font may be selected, renders the former an 'open' question.
>>>>
>>>> strlen('€')===3 is in fact the correct answer.
>>>
>>> I suppose most *higher level languages* define the length of a string as
>>> the number of characters the string holds. Cf. ECMAScript's length
>>> property and TCL's [string length]. Even PHP's mb_strlen() returns the
>>> number of characters.
>>>
>>
>> so what happens in a typographic ligature like 'ᴁ'?
>>
>> I think you are making a rod for your back here.
>>
>> The storage requirements are exact, specific and useful.
>>
>> The concept of a 'character in a text string' is really not..and if you go deep into
>> typography with kerning, leading, ligatures and the like you will
>> understand why.
>>
>
> Of course you need both, the storage requirements, and direct access to characters.
> Maybe programming languages should use internally full 32 bit per char, or compress
> the unicode string using a good library for access.
>
> C does not even know strings, just byte arrays
>
Thereby avoiding the problems completely by not even pretending to
solve them.

And you can always write a unicode_strlen() to any specification you
want..the problem is..

....given that one 'character' may stand for two letters (a ligature) or
take up to 3-4 bytes (UTF-8 encoded Unicode).... what specification?

and the concept of a 'character' is practically valueless anyway..,
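
For what it's worth, here is a rough sketch of one such unicode_strlen(), assuming the input is valid UTF-8 and that 'length' is taken to mean code points. The names unicode_strlen() and unicode_last_char() are made up for illustration and are not PHP built-ins; where the mbstring extension is loaded, mb_strlen() is the usual way to get the code-point count.

```php
<?php
// Hypothetical helper: counts UTF-8 code points by skipping
// continuation bytes (bit pattern 10xxxxxx).
// Assumption: $s is valid UTF-8; no validation is done.
function unicode_strlen(string $s): int
{
    $count = 0;
    $bytes = strlen($s);              // storage size in bytes
    for ($i = 0; $i < $bytes; $i++) {
        if ((ord($s[$i]) & 0xC0) !== 0x80) {
            $count++;                 // lead byte or plain ASCII byte
        }
    }
    return $count;
}

// Hypothetical helper: returns the last code point of $s as a string,
// walking back over continuation bytes instead of indexing one byte.
function unicode_last_char(string $s): string
{
    $i = strlen($s) - 1;
    while ($i > 0 && (ord($s[$i]) & 0xC0) === 0x80) {
        $i--;
    }
    return $i < 0 ? '' : substr($s, $i);
}

$euro     = "\u{20AC}";    // the euro sign, stored as E2 82 AC
$combined = "e\u{0301}";   // 'e' plus a combining acute accent

echo strlen($euro), "\n";              // 3 (bytes)
echo unicode_strlen($euro), "\n";      // 1 (code point)
echo strlen($combined), "\n";          // 3 (bytes)
echo unicode_strlen($combined), "\n";  // 2 (code points, one visible glyph)
echo unicode_last_char("a" . $euro), "\n";
```

Neither count says how many glyphs end up on screen: the combining-accent string is two code points but one visible character, so the specification still has to be chosen before the function means anything.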

> /Str.
>
>


--
Ineptocracy

(in-ep-toc’-ra-cy) – a system of government where the least capable to
lead are elected by the least capable of producing, and where the
members of society least likely to sustain themselves or succeed, are
rewarded with goods and services paid for by the confiscated wealth of a
diminishing number of producers.