Re: reduce all spaces to one [message #176909 is a reply to message #176908] Sat, 04 February 2012 23:05
Thomas 'PointedEars'
John wrote:

> Am 04.02.2012 15:41, schrieb Thomas 'PointedEars' Lahn:
                       ^^^^^^^
>> It should be noted that if your question was understood literally, the
>> solutions presented so far would be wrong. \s would match too many
>> different characters, as it stands for *white-space* in PCRE, _not_ only
>> the space character. In order to reduce only all consecutive *space*
>> characters to one space character, you need to write
>>
>> echo preg_replace('/ +/', ' ', "here are some \n spaces ");
>>
>> Note that the newline, which is white-space too, is preserved here.
>
> In fact, just by 'accident' the 'regex' solution fits perfectly with what
> I have to do next, i.e. feed the text through an 'explode' statement,
> which takes ' ' as separator.

I do not think you have fully understood what I have said (maybe you want to
try posting to *de*.comp.lang.php instead?). *All* presented solutions so
far, including yours and mine, are "'regex' solutions". But the set of
characters that are replaced differs between them.
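To illustrate how the replaced character sets differ, here is a minimal sketch that applies both patterns to the same input. The sample string is an assumption made up for this example; only the two patterns come from the thread.

```php
<?php
// Hypothetical sample input: runs of spaces around a newline.
$text = "here  are   some \n  spaces";

// Collapse runs of the space character only; the newline is preserved.
$spacesOnly = preg_replace('/ +/', ' ', $text);

// Collapse runs of any white-space; the run " \n  " becomes one space.
$anyWhitespace = preg_replace('/\\s+/', ' ', $text);

var_dump($spacesOnly);
var_dump($anyWhitespace);
```

With `/ +/' the newline survives between two single spaces; with `/\\s+/' it disappears into the single replacement space.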

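Anyhow, you do not need to replace spaces before word splitting. Instead of
using explode() (not str_split() [1], which cuts a string into chunks of fixed
length rather than splitting at a separator)

$words = explode(' ', preg_replace('/\\s+/', ' ', $text));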

since PHP 4 you can just split at consecutive white-space [2]:

$words = preg_split('/\\s+/', $text);
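As a quick check that the one-step split gives the same result as replacing first and then splitting at a single space, consider the following sketch. The input strings are assumptions for illustration; PREG_SPLIT_NO_EMPTY is an optional preg_split() flag that was not part of the suggestion above and is shown only because leading or trailing white-space otherwise produces empty elements.

```php
<?php
// Hypothetical sample input mixing spaces, a tab, and a newline.
$text = "split  this\ttext \n please";

// Two-step approach: normalize white-space, then split at single spaces.
$viaExplode = explode(' ', preg_replace('/\\s+/', ' ', $text));

// One-step approach: split directly at runs of white-space.
$viaSplit = preg_split('/\\s+/', $text);

var_dump($viaExplode === $viaSplit);

// Leading/trailing white-space would yield empty strings;
// PREG_SPLIT_NO_EMPTY drops them.
$trimmed = preg_split('/\\s+/', "  padded text ", -1, PREG_SPLIT_NO_EMPTY);
var_dump($trimmed);
```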

Also note that for finding the *words* in a text, splitting at white-space
is _not_ sufficient. For example, splitting the previous sentence at
white-space would have resulted in "words" such as "text,", "_not_", and
"sufficient.". Once you use preg_split(), though, there is a simple remedy
(which can be found at [2]). Just include all characters that you do not
want to be part of a word in the character class:

$words = preg_split('/[\\s,]+/', $text);

(I prefer to write `\\' instead of `\' even within single-quoted strings, in
order to make the `\' explicit. YMMV.)
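A small sketch of the effect of adding the comma to the character class, using a made-up sentence as input (an assumption for illustration). It also shows that, within single-quoted PHP strings, `\\s' and `\s' denote the same two characters, so the doubled backslash is purely a matter of style.

```php
<?php
// Hypothetical sample sentence.
$text = 'in a text, this is _not_ sufficient.';

// Splitting at white-space only leaves the comma attached to "text,".
$byWhitespace = preg_split('/\\s+/', $text);

// Adding the comma to the character class removes it from the words.
$byWhitespaceOrComma = preg_split('/[\\s,]+/', $text);

var_dump($byWhitespace);
var_dump($byWhitespaceOrComma);

// In single-quoted strings both spellings yield the same pattern.
var_dump('/\\s+/' === '/\s+/');
```

Note that the trailing period is still part of "sufficient." because `.' is not in the character class.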

You might also want to exclude periods (`.') and other punctuation from
words, like dashes and ellipses. Note that PCRE provides an escape sequence
for ASCII non-word characters, `\W' [3], and that it supports Unicode
character properties such as `\p{L}', which can be used to differentiate
between letters and non-letters in various natural languages [4].
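The two approaches might look as follows. This is only a sketch under stated assumptions: the input strings are made up, the `u' modifier and PREG_SPLIT_NO_EMPTY flag are choices made for this example, `\u{...}' string escapes need PHP 7 or later, and `\p{L}' requires a PCRE build with Unicode property support.

```php
<?php
// ASCII case: split at runs of non-word characters.
// The underscore counts as a word character, so "_not_" stays intact.
$ascii = preg_split('/\\W+/', 'text, _not_ sufficient.', -1, PREG_SPLIT_NO_EMPTY);
var_dump($ascii);

// Unicode case: split at runs of anything that is not a letter.
// The sample contains u-umlaut and e-acute, written as escapes.
$text = "Words, dashes - and ellipses... \u{00FC}ber caf\u{00E9}";
$letters = preg_split('/[^\\p{L}]+/u', $text, -1, PREG_SPLIT_NO_EMPTY);
var_dump($letters);
```

The `u' modifier makes PCRE treat pattern and subject as UTF-8, which is what allows `\p{L}' to recognize the non-ASCII letters as letters.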

You might find [5] useful, in particular [6], for the next time that you
post (translations are available there).


PointedEars
___________
[1] <http://php.net/str_split>
[2] <http://php.net/preg_split>
[3] <http://php.net/manual/en/regexp.reference.escape.php>
[4] <http://php.net/manual/en/regexp.reference.unicode.php>
[5] <http://catb.org/esr/faqs/smart-questions.html>
[6] <http://catb.org/esr/faqs/smart-questions.html#goal>
--
realism: HTML 4.01 Strict
evangelism: XHTML 1.0 Strict
madness: XHTML 1.1 as application/xhtml+xml
-- Bjoern Hoehrmann