What this means "(\w|-)+@\w"? [message #174784] Thu, 07 July 2011 13:58
zhang yun
Hi there,
I have some problems with a regular expression. Here is the code:
function is_email($e_name)
{
    $arr = array("ac","com","net","org","edu","gov","mil","ac\.cn",
                 "com\.cn","edu\.cn","net\.cn","org\.cn");
    $str = implode("|",$arr);
    $reg ='/^[0-9a-zA-z](\w|-)+@\w+\.('.$str.')$/';
    //echo $reg;
    if(preg_match($reg,strtolower($e_name)))
        return strtolower($e_name);
    else
        return false;
}

I don't know the exact meaning of "(\w|-)+@\w". Could you help me
figure it out? I'm new to the POSIX standard.
Re: What this means "(\w|-)+@\w"? [message #174785 is a reply to message #174784] Thu, 07 July 2011 15:39
Michael Fesser
.oO(zhang yun)

> I have some problems in regular expression, here is code :
> […]
>
> I don't know the exactly meaning of "(\w|-)+@\w", could you help me
> figure out its meaning? I'm new to posix standard.

(\w|-)   any word char (see manual for details [1]) or a minus sign
+        any number of the above, but at least one
@        a literal '@'
\w       a single word char

So this matches at least one, but up to any number of word chars and
minus signs, followed by a literal '@', followed by a single word char.

An alternative way of writing (\w|-) would be [\w-], which replaces the
OR-operator by a character class, that includes word chars and the minus
sign.
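
To see that the two spellings behave the same, here is a minimal sketch. The sample strings are made up for illustration, and neither pattern is anchored, so each only has to match somewhere inside the input.

```php
<?php
// Two spellings of the same fragment: alternation vs. character class.
$alternation = '/(\w|-)+@\w/';
$charClass   = '/[\w-]+@\w/';

// Made-up sample inputs.
$samples = array('john-doe@example', 'a@b', '@b', 'john.doe@example');

foreach ($samples as $s) {
    $a = preg_match($alternation, $s);
    $c = preg_match($charClass, $s);
    // Both patterns give the same answer for every sample.
    printf("%-18s alternation=%d class=%d\n", $s, $a, $c);
}
```

Only '@b' fails, because at least one word char or minus sign is required before the '@'. The address with a period still matches, since the unanchored pattern finds "doe@e" inside it.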

HTH
Micha

[1] http://www.php.net/manual/en/regexp.reference.escape.php
Re: What this means "(\w|-)+@\w"? [message #174786 is a reply to message #174784] Thu, 07 July 2011 18:23
Eli the Bearded
In comp.lang.php, zhang yun <bigzhangyun(at)gmail(dot)com> wrote:
> Hi,there
> I have some problems in regular expression, here is code :
> function is_email($e_name)
> {

That's a poorly named function. Lots and lots of valid email
addresses will be rejected by this crappy code.

> $arr = array("ac","com","net","org","edu","gov","mil","ac\.cn",
> "com\.cn","edu\.cn","net\.cn","org\.cn");

Have you heard about the new top level domains (TLDs)? .info is new as
of, oh, about year 2000. There are some others, too. And other country
codes.

> $str = implode("|",$arr);

This bit turns that array into a regular expression fragment.

> $reg ='/^[0-9a-zA-z](\w|-)+@\w+\.('.$str.')$/';

This bit makes a larger "perl compatable" regular expression ("pcre").

^             anchor to start of string
[0-9a-zA-z]   broken attempt to match a number or letter
              will also match [ ] \ ^ _ `
(\w|-)+       will match 1 or more ("+") of
                letter, number or underscore ("\w")
                or ("|")
                hyphen
@             at-sign
\w+           will match 1 or more ("+") of
                letter, number or underscore ("\w")
\.            dot (period)
($str)        will match one of the top level domains
              in the array
$             anchor to end of string

Here are some problems:

The user name ("localpart" in RFC speak) of the email address
can contain many more characters than are allowed here. I'm
using a valid email address with no numbers or letters in it,
for example. Periods are very common in localparts but this
will not allow them.

The user name can be one character, which this does not allow.

Domain names can contain hyphens ("-") but not underscores
("_"), but this allows only underscores and not hyphens.

Domain names can contain more top level domains than that.

Domain names can contain more dot segments ("labels" in RFC
speak) than this allows.
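
The problems above can be checked with a short script. This is only a sketch: the pattern is assembled the same way as in the original post, while the helper name passes_original() and all sample addresses are made up for illustration.

```php
<?php
// The pattern from the original post, assembled the same way.
$tlds = array("ac","com","net","org","edu","gov","mil","ac\.cn",
              "com\.cn","edu\.cn","net\.cn","org\.cn");
$reg  = '/^[0-9a-zA-z](\w|-)+@\w+\.(' . implode("|", $tlds) . ')$/';

// Mirrors the original is_email(): lowercase first, then match.
function passes_original($address, $reg) {
    return preg_match($reg, strtolower($address)) === 1;
}

// Made-up sample addresses, one per problem listed above.
$samples = array(
    'john@example.com',       // plain address
    'j@example.com',          // one-character user name
    'john.doe@example.com',   // period in the user name
    'john@my-host.com',       // hyphen in the domain
    'john@my_host.com',       // underscore in the domain
    'john@example.info',      // TLD missing from the list
    'john@mail.example.com',  // more than one label before the TLD
    '_john@example.com',      // "_" slips through the A-z range
);

foreach ($samples as $s) {
    printf("%-24s %s\n", $s, passes_original($s, $reg) ? 'accepted' : 'rejected');
}
```

Of these samples, only the plain address, the underscore domain and the address starting with "_" are accepted.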

Here's the regular expression I used to "verify" email addresses:

$reg = ".@.";

.   any single character
@   at-sign
.   any single character

no anchoring, so just matching a substring,
not the whole email address

Sure, some bad ones slip through, but no good ones get rejected.
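
In PHP's preg functions that check might look like the sketch below. The "/" delimiters and the function name looks_like_email() are assumptions added for illustration; the sample addresses are either made up or taken from this thread.

```php
<?php
// Permissive check: some character, an at-sign, some character.
// No anchors, so the pattern only has to match somewhere in the string.
function looks_like_email($address) {
    return preg_match('/.@./', $address) === 1;
}

$samples = array(
    'a@b',
    '*@eli.users.panix.com',            // no letters or digits in the user name
    'john.johnson.jr.@jrsite.example',  // user name ending in a dot
    '@example.com',                     // nothing before the at-sign
    'user@',                            // nothing after the at-sign
    'no-at-sign',
);

foreach ($samples as $s) {
    printf("%-34s %s\n", $s, looks_like_email($s) ? 'accepted' : 'rejected');
}
```

The first three samples pass and the last three are rejected.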

Elijah
------
uses this email address precisely because it gets rejected by bad verifiers
Re: What this means "(\w|-)+@\w"? [message #174787 is a reply to message #174786] Thu, 07 July 2011 19:12
Michael Fesser
.oO(Eli the Bearded)

> In comp.lang.php, zhang yun <bigzhangyun(at)gmail(dot)com> wrote:
>> Hi,there
>> I have some problems in regular expression, here is code :
>> function is_email($e_name)
>> {
>
> That's a poorly named function. Lots and lots of valid email
> addresses will be rejected by this crappy code.
>
>> $arr = array("ac","com","net","org","edu","gov","mil","ac\.cn",
>> "com\.cn","edu\.cn","net\.cn","org\.cn");
>
> Have you heard about the new top level domains (TLDs)? .info is new as
> of, oh, about year 2000. There are some others, too. And other country
> codes.

ACK. And a lot more TLDs with arbitrary length are about to come.

In fact, checking an email address with regular expressions is almost
impossible. It would make more sense to just perform a rough check like
.+@.+ and then just take it as-is. If it's valid, the email will be
sent. If not, it will bounce, but who really cares?

Micha
Re: What this means "(\w|-)+@\w"? [message #174788 is a reply to message #174786] Thu, 07 July 2011 19:12
Thomas 'PointedEars'
Eli the Bearded wrote:

> In comp.lang.php, zhang yun <bigzhangyun(at)gmail(dot)com> wrote:
>> $reg ='/^[0-9a-zA-z](\w|-)+@\w+\.('.$str.')$/';
>
> This bit makes a larger "perl compatable" regular expression ("pcre").

_P_erl_-_compat_i_ble (_PCRE_). (You nitpick, I nitpick ;-))

> ^ anchor to start of string
> [0-9a-zA-z] broken attempt to match a number or letter
> will also match [ ] \ ^ _ `

Now come on, that is probably just a typo. Make the last `z' a `Z'
(uppercase) and it will not match those extra characters (it will
match too few addresses, though).

> Elijah
> ------
> uses this email address precisely because it gets rejected by bad
> verifiers

ACK. Good to see it passed mine (and was subsequently SMTP-checked as being
OK) :)

atext="[A-Za-z0-9!#\$%&'*+/=?^_\`{|}~-]"
dot_atom_text="$atext+(\\.$atext+)*"
dot_atom=$dot_atom_text

if [ -z "`echo "$i" | egrep -e "${dot_atom}@${dot_atom}"`" ]; then
  # not syntactically valid
  :
fi

It is a shell script, but you can probably see how it works. It is derived
directly from RFC 2822, section 3.4.1, which was current at the time it was
written. Do I need to update it to RFC 5322 in any way?
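
For readers following along in PHP, a rough port of the same dot-atom idea could look like the sketch below. It is not a translation of the full RFC grammar: quoted strings, comments and domain literals are left out. The function name is_dot_atom_address() is made up, and unlike the egrep call above, this sketch anchors the pattern to the whole string.

```php
<?php
// atext characters as listed in the shell script above.
$atext       = '[A-Za-z0-9!#$%&\'*+\/=?^_`{|}~-]';
// dot-atom-text: runs of atext separated by single dots.
$dotAtomText = $atext . '+(?:\.' . $atext . '+)*';
// Anchored on both ends (an assumption; the egrep version is not anchored).
$dotAtomAddr = '/^' . $dotAtomText . '@' . $dotAtomText . '$/D';

function is_dot_atom_address($address, $pattern) {
    return preg_match($pattern, $address) === 1;
}

$samples = array(
    'john.doe@example.com',
    '*@eli.users.panix.com',
    'john.johnson.jr.@jrsite.example',  // dot right before the at-sign
    'john..doe@example.com',            // two dots in a row
    'no-at-sign',
);

foreach ($samples as $s) {
    printf("%-34s %s\n", $s, is_dot_atom_address($s, $dotAtomAddr) ? 'valid' : 'invalid');
}
```

The first two samples are reported as valid. The other three are not, because each dot must be followed by at least one atext character and an at-sign is required.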


PointedEars
--
Anyone who slaps a 'this page is best viewed with Browser X' label on
a Web page appears to be yearning for the bad old days, before the Web,
when you had very little chance of reading a document written on another
computer, another word processor, or another network. -- Tim Berners-Lee
Re: What this means "(\w|-)+@\w"? [message #174789 is a reply to message #174788] Fri, 08 July 2011 06:05
Eli the Bearded
In comp.lang.php, Thomas 'PointedEars' Lahn <php(at)PointedEars(dot)de> wrote:
> Eli the Bearded wrote:
>> [0-9a-zA-z] broken attempt to match a number or letter
>> will also match [ ] \ ^ _ `
> Now come on, that is probably just a typo. Make the last `z' a `Z'
> (uppercase) and it will not match those extra characters (it will
> match too few addresses, though).

Yeah, but did you notice the strtolower() later on? It's just broken.

>> uses this email address precisely because it gets rejected by bad
>> verifiers
>
> ACK. Good to see it passed mine (and was subsequently SMTP-checked
> as being OK) :)
>
> atext="[A-Za-z0-9!#\$%&'*+/=?^_\`{|}~-]"
> dot_atom_text="$atext+(\\.$atext+)*"

RFC 822 had that same dot rule, but I've never encountered mail
software that would reject an address with a final dot, e.g.
something like: <john.johnson.jr.@jrsite.example>

> dot_atom=$dot_atom_text
>
> if [ -z "`echo "$i" | egrep -e "${dot_atom}@${dot_atom}"`" ]; then

Hostnames have a pretty rigid syntax that can easily be checked.
Even punycoded international domain names will validate with this:

labeltext="[A-Za-z0-9][A-Za-z0-9-]{,62}"
hosttext="$labeltext(\\.$labeltext)*\\.?"

(I ignore the overall hostname length limit there. I'm pretty sure
it is 255 with dots.)
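
A PHP rendering of the same label rule might look like this sketch. The function name is_hostname_syntax() is made up, and the quantifier is spelled {0,62} to be safe, since older PCRE versions read {,62} as literal text. Like the shell version, it ignores the overall length limit.

```php
<?php
// One label: a letter or digit, then up to 62 letters, digits or hyphens.
$label    = '[A-Za-z0-9][A-Za-z0-9-]{0,62}';
// Labels joined by dots, with an optional trailing anchor dot.
$hostText = '/^' . $label . '(?:\.' . $label . ')*\.?$/D';

function is_hostname_syntax($host, $pattern) {
    return preg_match($pattern, $host) === 1;
}

$samples = array(
    'example.com',
    'xn--lvaro-wqa.es',              // punycoded international name
    'mytld.',                        // trailing anchor dot
    'my_host.example',               // underscore is not allowed
    '-bad.example',                  // label may not start with a hyphen
    str_repeat('a', 64) . '.com',    // label longer than 63 characters
);

foreach ($samples as $s) {
    printf("%s => %s\n", $s, is_hostname_syntax($s, $hostText) ? 'ok' : 'bad');
}
```

The first three samples pass and the last three are rejected.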

Trailing dots are used to anchor a hostname, legal but rarely seen.
If you owned a TLD and wanted to use it directly for email, the
trailing anchor dot would be a good idea: <vanity@mytld.>

With wildcard DNS, you can have some resolvers get an IP address
for invalid names, most often this seems to happen with underscores
used instead of hyphens. I figure those people are screwed and don't
worry about them.

> It is a shell script, but you can probably see how it works. It is derived
> directly from RFC 2822, section 3.4.1, which was current at the time it was
> written. Do I need to update it to RFC 5322 in any way?

I haven't looked at that closely enough to tell you.

Elijah
------
non-punycoded international domains are just for show
Re: What this means "(\w|-)+@\w"? [message #174790 is a reply to message #174784] Fri, 08 July 2011 06:42
alvaro.NOSPAMTHANX
El 07/07/2011 15:58, zhang yun escribió/wrote:
> I have some problems in regular expression, here is code :
> function is_email($e_name)
> {
> $arr = array("ac","com","net","org","edu","gov","mil","ac\.cn",
> "com\.cn","edu\.cn","net\.cn","org\.cn");
> $str = implode("|",$arr);
> $reg ='/^[0-9a-zA-z](\w|-)+@\w+\.('.$str.')$/';
> //echo $reg;
> if(preg_match($reg,strtolower($e_name)))
> return strtolower($e_name);
> else
> return false;
> }

As already mentioned, it's a terrible way to validate e-mails. Among
other omissions, it discards like a hundred valid top level domains:

http://en.wikipedia.org/wiki/List_of_Internet_TLDs

If you are interested in the subject, I suggest you read this forum entry:

http://stackoverflow.com/questions/2514810/php-email-validation-question/2515058#2515058

I'd say the best option is to write your own simple and permissive
expression, although you can also use a reliable third-party library and
keep it updated. (I found
http://code.google.com/p/php-email-address-validation/ but I have no
references for it.)


> I don't know the exactly meaning of "(\w|-)+@\w", could you help me
> figure out its meaning? I'm new to posix standard.

Er... Sorry, this is a Perl-compatible regular expression. The Posix
regexp functions (ereg, eregi...) are deprecated in PHP.



--
-- http://alvaro.es - Álvaro G. Vicario - Burgos, Spain
-- Mi sitio sobre programación web: http://borrame.com
-- Mi web de humor satinado: http://www.demogracia.com
--
Re: What this means "(\w|-)+@\w"? [message #174995 is a reply to message #174790] Tue, 02 August 2011 08:06
Sandman
In article <iv68tf$fla$1(at)dont-email(dot)me>,
"Álvaro G. Vicario" <alvaro(dot)NOSPAMTHANX(at)demogracia(dot)com(dot)invalid>
wrote:

>> I have some problems in regular expression, here is code :
>> function is_email($e_name)
>> {
>> $arr = array("ac","com","net","org","edu","gov","mil","ac\.cn",
>> "com\.cn","edu\.cn","net\.cn","org\.cn");
>> $str = implode("|",$arr);
>> $reg ='/^[0-9a-zA-z](\w|-)+@\w+\.('.$str.')$/';
>> //echo $reg;
>> if(preg_match($reg,strtolower($e_name)))
>> return strtolower($e_name);
>> else
>> return false;
>> }
>
> As already mentioned, it's a terrible way to validate e-mails. Among
> other omissions, it discards like a hundred valid top level domains:
>
> http://en.wikipedia.org/wiki/List_of_Internet_TLDs
>
> If you are interested in the subject, I suggest you read this forum entry:
>
> http://stackoverflow.com/questions/2514810/php-email-validation-question/2515058#2515058
>
> I'd say the best options is to write your own simple and permissive
> expression, although you can also use a reliable third-party library and
> keep it updated. (I found
> http://code.google.com/p/php-email-address-validation/ but I have no
> references about it.)

What's wrong with the PHP built in email validator?

<?
if (filter_var($email, FILTER_VALIDATE_EMAIL)){
# Do fancy stuff, such as sending fancy email!
}
?>


--
Sandman[.net]
Re: What this means "(\w|-)+@\w"? [message #174998 is a reply to message #174995] Tue, 02 August 2011 09:50
alvaro.NOSPAMTHANX
El 02/08/2011 10:06, Sandman escribió/wrote:
> What's wrong with the PHP built in email validator?
>
> <?
> if (filter_var($email, FILTER_VALIDATE_EMAIL)){
> # Do fancy stuff, such as sending fancy email!
> }
> ?>

For instance:

var_dump(filter_var('webmaster@álvaro.es', FILTER_VALIDATE_EMAIL));
var_dump(filter_var('webmaster@xn--lvaro-wqa.es', FILTER_VALIDATE_EMAIL));

bool(false)
bool(false)



--
-- http://alvaro.es - Álvaro G. Vicario - Burgos, Spain
-- Mi sitio sobre programación web: http://borrame.com
-- Mi web de humor satinado: http://www.demogracia.com
--
Re: What this means "(\w|-)+@\w"? [message #175001 is a reply to message #174998] Tue, 02 August 2011 12:56
Peter H. Coffin
On Tue, 02 Aug 2011 11:50:09 +0200, Álvaro G. Vicario wrote:
> El 02/08/2011 10:06, Sandman escribió/wrote:
>> What's wrong with the PHP built in email validator?
>>
>> <?
>> if (filter_var($email, FILTER_VALIDATE_EMAIL)){
>> # Do fancy stuff, such as sending fancy email!
>> }
>> ?>
>
> For instance:
>
> var_dump(filter_var('webmaster@álvaro.es', FILTER_VALIDATE_EMAIL));
> var_dump(filter_var('webmaster@xn--lvaro-wqa.es', FILTER_VALIDATE_EMAIL));
>
> bool(false)
> bool(false)

Several years ago, I wrote an email validator that worked fairly well,
and handled address cases that hadn't even been thought of yet. It used
telnet to fake being an MTA to the MX of record for the domain through
enough steps to ask if the receiving MTA would actually accept mail for
the address, but stopping short of actually sending anything.

It was, however, really slow to start up. Once it had everything cached
up, it could average a couple of email addresses being validated per
second, but the first few hundred took a half hour or so.

I wonder what happened to that code....

--
The plural of datum is not "facts".
A collection of facts is not "knowledge".
Re: What this means "(\w|-)+@\w"? [message #175007 is a reply to message #175001] Tue, 02 August 2011 20:53
Eli the Bearded
In comp.lang.php, Peter H. Coffin <hellsop(at)ninehells(dot)com> wrote:
> Several years ago, I wrote an email validator that worked fairly will,
> and handled address cases that hadn't even been thought of yet. It used
> telnet to fake being a MTA to the MX of record for the domain through
> enough steps to ask if the receiving MTA would actually accept mail for
> the address, but stopping short of actually sending anything.

How did it handle temporary errors? Mail is supposed to be retried,
usually hours later. One common anti-spam technique is to exploit
that spammers often do not retry and to issue temporary "failure"
responses to every newly seen (sender,recipient) pair.

Elijah
------
not sure how often spammers get caught by that anymore
Re: What this means "(\w|-)+@\w"? [message #175008 is a reply to message #174998] Tue, 02 August 2011 22:21
Sandman
In article <j18h8j$eg3$1(at)dont-email(dot)me>,
"Álvaro G. Vicario" <alvaro(dot)NOSPAMTHANX(at)demogracia(dot)com(dot)invalid>
wrote:

> El 02/08/2011 10:06, Sandman escribió/wrote:
>> What's wrong with the PHP built in email validator?
>>
>> <?
>> if (filter_var($email, FILTER_VALIDATE_EMAIL)){
>> # Do fancy stuff, such as sending fancy email!
>> }
>> ?>
>
> For instance:
>
> var_dump(filter_var('webmaster@álvaro.es', FILTER_VALIDATE_EMAIL));
> var_dump(filter_var('webmaster@xn--lvaro-wqa.es', FILTER_VALIDATE_EMAIL));
>
> bool(false)
> bool(false)

Is SMTP supposed to support IDN? And even if it's supposed to, what
SMTP servers correctly parse IDN? I mean, percentage-wise?


--
Sandman[.net]
Re: What this means "(\w|-)+@\w"? [message #175010 is a reply to message #175008] Wed, 03 August 2011 07:45
Eli the Bearded
In comp.lang.php, Sandman <mr(at)sandman(dot)net> wrote:
> Is SMTP supposed to support IDN? And even if it's supposed to, what
> SMTP servers correctly parse IDN? I mean, percentage-wise?

IDN exists above the SMTP layer. A client that supports international
domain names is supposed to convert the name to and from
punycode, which is high-bit text encoded in plain ASCII
alphanumerics. The punycode name is valid for all email
purposes.

In this model, a user entering an email address with an IDN
host into a form should be allowed. But the web site needs to
convert that -- as needed -- to the traditional DNS form.

I know I've never done that in my code. I should probably fix
that.
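
For what it's worth, the conversion step could be sketched in PHP roughly as follows. It assumes the intl extension, which provides idn_to_ascii(), and UTF-8 input. The function name address_to_ascii() is made up, and returning null when intl is missing or the conversion fails is just one possible policy.

```php
<?php
// Convert the host part of an address to its ASCII (punycode) form.
// Returns the address unchanged when it has no "@", and null when the
// host is empty, the intl extension is missing, or conversion fails.
function address_to_ascii($address) {
    $at = strrpos($address, '@');
    if ($at === false) {
        return $address;
    }
    $local = substr($address, 0, $at);
    $host  = substr($address, $at + 1);

    if ($host === '' || !function_exists('idn_to_ascii')) {
        return null;
    }
    // UTS #46 processing; only the host is converted, never the user name.
    $ascii = idn_to_ascii($host, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
    return $ascii === false ? null : $local . '@' . $ascii;
}

var_dump(address_to_ascii('webmaster@álvaro.es'));
```

The converted address is what would then be handed to the syntax check and to the mail system, while the form can keep showing the name the user typed.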

Blog post from 2009 lamenting this:
http://www.circleid.com/posts/20091120_idn_and_email_the_harsh_reality/

Elijah
------
perhaps this is finally the excuse to get a vanity IDN hostname
Re: What this means "(\w|-)+@\w"? [message #175011 is a reply to message #175010] Wed, 03 August 2011 08:02
Sandman
In article <eli$1108030345(at)qz(dot)little-neck(dot)ny(dot)us>,
Eli the Bearded <*@eli.users.panix.com> wrote:

> In comp.lang.php, Sandman <mr(at)sandman(dot)net> wrote:
>> Is SMTP supposed to support IDN? And even if it's supposed to, what
>> SMTP servers correctly parse IDN? I mean, percentage-wise?
>
> IDN exists above the SMTP layer. An international domain name
> supporting client is supposed to convert the name to and from
> punycode, which is highbit text encoded in plain ASCII
> alphanumerics. The punycode name is valid at for all email
> purposes.
>
> In this model, a user entering an email address with an IDN
> host into a form should be allowed. But the web site needs to
> convert that -- as needed -- to the traditional DNS form.
>
> I know I've never done that in my code. I should probably fix
> that.

Yes, I know it is above the SMTP layer, but surely an SMTP server shouldn't
deliver mail to a client with the IDN as punycode? But I suppose the
punycode could be used only in the transmission layer, with the "To:"
header left intact as IDN.

I own a couple of IDNs, but I never accept mail to them. Perhaps
I should. :)

> Elijah
> ------
> perhaps this is finally the excuse to get a vanity IDN hostname

Or Emoji:

http://www.panic.com/blog/2011/07/the-worlds-first-emoji-domain/

:-D


--
Sandman[.net]
Re: What this means "(\w|-)+@\w"? [message #175012 is a reply to message #175001] Wed, 03 August 2011 08:55
alvaro.NOSPAMTHANX
El 02/08/2011 14:56, Peter H. Coffin escribió/wrote:
> Several years ago, I wrote an email validator that worked fairly will,
> and handled address cases that hadn't even been thought of yet. It used
> telnet to fake being a MTA to the MX of record for the domain through
> enough steps to ask if the receiving MTA would actually accept mail for
> the address, but stopping short of actually sending anything.
>
> It was, however, really slow, to start up. Once it had everything cached
> up, it could average a couple of email addresses being validated per
> second, but the first few hundred took a half hour or so.
>
> I wonder what happened to that code....

I've never understood the obsession with writing e-mail validation
routines that not only check for syntax errors but try to find out
whether the mailbox actually exists. Nobody would install a modem on the
server so it can be used to validate phone numbers...

Whatever, considering it as proof of concept, your code was a nice
exercise. Too bad it could not detect typos when the mistyped address
exists as well.


--
-- http://alvaro.es - Álvaro G. Vicario - Burgos, Spain
-- Mi sitio sobre programación web: http://borrame.com
-- Mi web de humor satinado: http://www.demogracia.com
--
Re: What this means "(\w|-)+@\w"? [message #175016 is a reply to message #175007] Wed, 03 August 2011 13:32
Peter H. Coffin
On Tue, 2 Aug 2011 20:53:00 +0000 (UTC), Eli the Bearded wrote:
> In comp.lang.php, Peter H. Coffin <hellsop(at)ninehells(dot)com> wrote:
>> Several years ago, I wrote an email validator that worked fairly will,
>> and handled address cases that hadn't even been thought of yet. It used
>> telnet to fake being a MTA to the MX of record for the domain through
>> enough steps to ask if the receiving MTA would actually accept mail for
>> the address, but stopping short of actually sending anything.
>
> How did it handle temporary errors? Mail is supposed to be retried,
> usually hours later. One common anti-spam technique is to exploit
> that spammers often do not retry and to issue temporary "failure"
> responses to every newly seen (sender,recipent) pair.

Temporary errors were good enough to pass for the "validation of email"
purposes, mostly. There were still some gaps, like "DNS responsible for
this domain is not answering (and will never be fixed)" getting treated
as okay, or the MTA being used later having its own out-of-standard
limitations, but the number of false "okay" results was pretty acceptable
in a few hundred thousand checks, and the number of false failures was so
small that I never heard anyone talk about any. And in any large collection
of email addresses, you're going to lose about 1% per month due to domains
being sold or not renewed, accounts closing, people deciding that they've
been getting too much spam and abandoning the account, etc., and the
false okays get lost in that quickly.
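
The original code is long gone, so the following is only a sketch of the policy described above, not a reconstruction. It classifies the reply to an SMTP "RCPT TO" command by its first digit and lets temporary failures pass. The function names and the sample reply lines are made up, and no network connection is attempted.

```php
<?php
// Classify an SMTP reply line by its first digit:
// 2xx = accepted, 4xx = temporary failure, 5xx = permanent failure.
function classify_smtp_reply($replyLine) {
    if (!preg_match('/^([2-5])\d\d(?:[ -]|$)/', $replyLine, $m)) {
        return 'unknown';
    }
    switch ($m[1]) {
        case '2': return 'accepted';
        case '4': return 'temporary';
        case '5': return 'rejected';
        default:  return 'unknown'; // 3xx is not expected after RCPT TO
    }
}

// Policy from the post: a temporary error is good enough to pass.
function passes_validation($replyLine) {
    $class = classify_smtp_reply($replyLine);
    return $class === 'accepted' || $class === 'temporary';
}

// Made-up reply lines for illustration.
$replies = array(
    '250 2.1.5 Ok',
    '450 4.2.0 Greylisted, try again later',
    '550 5.1.1 User unknown',
);

foreach ($replies as $r) {
    printf("%-40s %s\n", $r, passes_validation($r) ? 'pass' : 'fail');
}
```

Treating 4xx as a pass is what keeps greylisting servers from producing false failures, at the cost of some false "okay" results.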

--
Premature optimization is the root of all evil.
-- Sir Tony Hoare
Re: What this means "(\w|-)+@\w"? [message #175017 is a reply to message #175012] Wed, 03 August 2011 13:46
Peter H. Coffin
On Wed, 03 Aug 2011 10:55:32 +0200, Álvaro G. Vicario wrote:

> El 02/08/2011 14:56, Peter H. Coffin escribió/wrote:
>
>> Several years ago, I wrote an email validator that worked fairly
>> will, and handled address cases that hadn't even been thought of yet.
>> It used telnet to fake being a MTA to the MX of record for the domain
>> through enough steps to ask if the receiving MTA would actually
>> accept mail for the address, but stopping short of actually sending
>> anything.
>>
>> It was, however, really slow, to start up. Once it had everything
>> cached up, it could average a couple of email addresses being
>> validated per second, but the first few hundred took a half hour or
>> so.
>>
>> I wonder what happened to that code....
>
> I've never understood the obsession for writing e-mail validation
> routines that not only check for syntax errors but try to find out
> whether the mailbox actually exists. Nobody would install a modem on
> the server so it can be used to validate phone numbers...

Heh. In North America, there's a comparable data file to this kind of
thing available from nanpa.org that (while not getting down to the
actual phone number level) will tell you whether a given combination
of area code and exchange (the M and N parts of +1 MMM-NNN-XXXX) are
valid, in service, on hold, pending allocation, and who runs them so
you can get a good idea of whether they're a land line, a mobile, a
specialty service, and sometimes even a stab at whether it's a business
or residence. EG: An owning organization of "AT&T BUSINESS SVCS" is
probably not someone's residence, and "ACS WIRELESS DBA ALL-TEL" is
probably a mobile. So, yeah, you CAN do it with phone numbers and it
doesn't even require a modem.

> Whatever, considering it as proof of concept, your code was a nice
> exercise. Too bad it could not detect typos when the mistyped address
> exists as well.

Not if the typo led to another valid address, no. If the typo led to the
mail failing, it would be caught. The "fix commonly mistyped domains"
part never got written, either. The idea behind that is to process only
the failures and remap to big services (yaho.com to yahoo.com, etc.),
then re-test.
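
Since that part never got written, here is only a guess at its shape. The lookup table holds the single example named above, and the function name fix_mistyped_domain() is made up for illustration.

```php
<?php
// Hypothetical typo table; only yaho.com -> yahoo.com comes from the post.
$domainFixes = array(
    'yaho.com' => 'yahoo.com',
);

// Return the address with a known mistyped domain replaced,
// or the address unchanged when the domain is not in the table.
function fix_mistyped_domain($address, array $fixes) {
    $at = strrpos($address, '@');
    if ($at === false) {
        return $address;
    }
    $local = substr($address, 0, $at);
    $host  = strtolower(substr($address, $at + 1));
    return isset($fixes[$host]) ? $local . '@' . $fixes[$host] : $address;
}

echo fix_mistyped_domain('someone@yaho.com', $domainFixes), "\n";
```

Only addresses that failed the first check would go through this remapping, and the corrected address would then be tested again.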

--
"Only Irish coffee provides in a single glass all four essential food
groups: alcohol, caffeine, sugar, and fat."
-Alex Levine
Re: What this means "(\w|-)+@\w"? [message #175035 is a reply to message #175011] Fri, 05 August 2011 00:05
Eli the Bearded
In comp.lang.php, Sandman <mr(at)sandman(dot)net> wrote:
> Yes, I know it is above the smtp layer, but a SMTP surely shouldn't
> deliver mail to a client with IDN as punycode? But I suppose the
> punycode could be only in the transmission layer with the "To:" header
> intact as IDN.

Do you know the difference between envelope address and header address?
The headers get the IDN, the envelope gets the punycode.

> Or Emoji:
> http://www.panic.com/blog/2011/07/the-worlds-first-emoji-domain/
> :-D

'PILE OF POO' (U+1F4A9). Hmmm. Not my first choice of domain.

Elijah
------
better choices: CHERRIES (U+1F352), PINEAPPLE (U+1F34D), LOVE HOTEL (U+1F3E9)