Re: What this means "(\w|-)+@\w"? [message #174786 is a reply to message #174784] Thu, 07 July 2011 18:23
Eli the Bearded
In comp.lang.php, zhang yun <bigzhangyun(at)gmail(dot)com> wrote:
> Hi,there
> I have some problems in regular expression, here is code :
> function is_email($e_name)
> {

That's a poorly named function. Lots and lots of valid email
addresses will be rejected by this crappy code.

> $arr = array("ac","com","net","org","edu","gov","mil","ac\.cn",
> "com\.cn","edu\.cn","net\.cn","org\.cn");

Have you heard about the new top level domains (TLDs)? .info is new as
of, oh, about year 2000. There are some others, too. And other country
codes.

> $str = implode("|",$arr);

This bit turns that array into a regular expression fragment.
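
For illustration, here is roughly what that fragment looks like once imploded. This is only a sketch reusing the variable names from the quoted code:

```php
<?php
// The TLD list from the quoted function.
$arr = array("ac","com","net","org","edu","gov","mil","ac\.cn",
             "com\.cn","edu\.cn","net\.cn","org\.cn");

// Join the entries with "|" so they become regex alternatives.
$str = implode("|", $arr);

echo $str, "\n";
// ac|com|net|org|edu|gov|mil|ac\.cn|com\.cn|edu\.cn|net\.cn|org\.cn
```

The backslashes survive because "\." is not an escape sequence PHP recognises inside double quotes.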

> $reg ='/^[0-9a-zA-z](\w|-)+@\w+\.('.$str.')$/';

This bit makes a larger "Perl-compatible" regular expression ("PCRE").

    ^              anchor to start of string
    [0-9a-zA-z]    broken attempt to match a number or letter;
                   will also match [ ] \ ^ _ `
    (\w|-)+        will match 1 or more ("+") of
                       letter, number or underscore ("\w")
                       or ("|")
                       hyphen
    @              at-sign
    \w+            will match 1 or more ("+") of
                       letter, number or underscore ("\w")
    \.             dot (period)
    ($str)         will match one of the top level domains
                   in the array
    $              anchor to end of string
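
If you want to see the "A-z" breakage for yourself, something like the following should do it. The helper name extra_matches() is made up for this sketch and is not part of the quoted code; it just walks the printable ASCII range and collects the non-alphanumeric characters a class accepts.

```php
<?php
// Hypothetical helper, not from the original post: lists the
// non-alphanumeric printable ASCII characters a character class accepts.
function extra_matches($class)
{
    $extras = array();
    for ($i = 33; $i < 127; $i++) {
        $ch = chr($i);
        if (preg_match('/^' . $class . '$/', $ch) && !ctype_alnum($ch)) {
            $extras[] = $ch;
        }
    }
    return $extras;
}

$broken = extra_matches('[0-9a-zA-z]');   // as written in the quoted code
$fixed  = extra_matches('[0-9a-zA-Z]');   // what was presumably intended

echo implode(' ', $broken), "\n";   // [ \ ] ^ _ `
echo count($fixed), "\n";           // 0
```

The six extras are exactly the ASCII characters that sit between "Z" and "a".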

Here are some problems:

The user name ("localpart" in RFC speak) of the email address
can contain many more characters than are allowed here. I'm
using a valid email address with no numbers or letters in it,
for example. Periods are very common in localparts but this
will not allow them.

The user name can be one character, which this does not allow.

Domain names can contain hyphens ("-") but not underscores
("_"), yet this allows underscores and rejects hyphens.

Domain names can contain more top level domains than that.

Domain names can contain more dot segments ("labels" in RFC
speak) than this allows.
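
A quick way to check the problems above is to run the quoted pattern against a few addresses. The sample addresses here are made up for the sketch; the pattern is rebuilt the same way the quoted function builds it.

```php
<?php
// Rebuild the pattern exactly as the quoted function does.
$arr = array("ac","com","net","org","edu","gov","mil","ac\.cn",
             "com\.cn","edu\.cn","net\.cn","org\.cn");
$reg = '/^[0-9a-zA-z](\w|-)+@\w+\.(' . implode("|", $arr) . ')$/';

// Made-up sample addresses, one per problem listed above.
$samples = array(
    'first.last@example.com',   // period in the localpart
    'a@example.com',            // one-character localpart
    'user@my-host.com',         // hyphen in the domain
    'user@example.info',        // TLD missing from the list
    'user@mail.example.com',    // more than two labels
    'user@my_host.com',         // underscore in the domain
);

foreach ($samples as $addr) {
    $verdict = preg_match($reg, $addr) ? 'accepted' : 'rejected';
    printf("%-24s %s\n", $addr, $verdict);
}
```

The first five are rejected and only the last one, with the underscore in the host name, is accepted.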

Here's the regular expression I used to "verify" email addresses:

$reg = "/.@./";

    .              any single character
    @              at-sign
    .              any single character

                   no anchoring, so just matching a substring,
                   not the whole email address

Sure, some bad ones slip through, but no good ones get rejected.
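
In PHP terms that check could look something like this. The wrapper name looks_like_email() is invented for the sketch, and the pattern is written with "/" delimiters because preg_match() needs them.

```php
<?php
// Hypothetical wrapper name; the pattern is the ".@." from the post,
// written with "/" delimiters so preg_match() accepts it.
function looks_like_email($addr)
{
    // True when some character, an at-sign and another character
    // appear next to each other anywhere in the string.
    return preg_match('/.@./', $addr) === 1;
}

var_dump(looks_like_email('first.last@my-host.example.info')); // bool(true)
var_dump(looks_like_email('a@b'));                             // bool(true)
var_dump(looks_like_email('@example.com'));                    // bool(false)
var_dump(looks_like_email('user@'));                           // bool(false)
var_dump(looks_like_email('not an address @ all'));            // bool(true)
```

The last line is one of the bad ones that slips through. The only real test of an address is sending mail to it.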

Elijah
------
uses this email address precisely because it gets rejected by bad verifiers