Re: Zip Codes ctype? Pregmatch? RESOLVED [message #184249 is a reply to message #184246]
Mon, 16 December 2013 12:27
Doug Miller
Moon Elf <moonelf(at)moonelfsystem(dot)net> wrote in news:slrnlatjrr(dot)dlu(dot)moonelf(at)sigtrans(dot)org:
> On 2013-08-21, Twayne <nobody(at)spamcop(dot)net> wrote:
[...]
>> <?php
>> $country_code="US";
>> $zip_postal="11111";
>>
>> $ZIPREG=array(
>> "US"=>"^\d{5}([\-]?\d{4})?$",
[...]
>
> A faster algorithm would be to use regexes which use .+ .* with a unique
> fingerprint. The above code is grinding your system probably.
>
> I am sure tutorials such as Mastering Regular Expressions 2nd ed. would help
> out.
Never mind that -- the bigger problem is that it's just plain wrong. Comparing to a RegEx can
determine only whether a postal code has the correct *format*, not whether it is actually a valid code.
According to this,
>
>> if ($ZIPREG[$country_code]) {
>>
>> if (!preg_match("/".$ZIPREG[$country_code]."/i",$zip_postal)){
>> //Validation failed, provided zip/postal code is not valid.
>> } else {
>> //Validation passed, provided zip/postal code is valid.
>> }
00000-0000 is a "valid" US postal code. It's not. It's correctly *formatted*, but its contents do
not correspond to a valid ZIP+4.
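To make that concrete, here is a minimal sketch that runs the quoted US pattern against a well-formed but meaningless code. The variable name `$usPattern` is my own, not from the quoted script.

```php
<?php
// Same US pattern as in the quoted code, wrapped in "/" delimiters with
// the "i" flag, as the quoted preg_match() call does.
// $usPattern is an illustrative name, not part of the original script.
$usPattern = '/^\d{5}([\-]?\d{4})?$/i';

// preg_match() returns 1 when the subject matches and 0 when it does not.
var_dump(preg_match($usPattern, '00000-0000')); // int(1): well-formed, so it "passes"
var_dump(preg_match($usPattern, '0000-00000')); // int(0): malformed, so it is rejected
```

The pattern can only reject the second string, because that is the only one with a format error.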
The algorithm gets 99999 wrong too: it matches the RegEx, but is not a valid ZIP code. Same problem with
11111, 22222, 33333, and 54321 -- and thousands of others. Out of the 100,000 possible 5-digit
ZIP codes, fewer than 42,000 are actually in use, but this algorithm will say that all 100,000
of them are valid.
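One way around this is to follow the format check with a lookup against a list of codes that are actually assigned. What follows is only a rough sketch under assumptions: the function name `isKnownUsZip` and the three-entry `$knownZips` array are hypothetical placeholders, and a real application would need the complete list from a postal data source and would have to keep it up to date.

```php
<?php
// HYPOTHETICAL sample data: three real ZIP codes standing in for the
// full list of assigned codes, which must come from a postal data source.
$knownZips = ['10001' => true, '60601' => true, '90210' => true];

// Illustrative helper, not part of the quoted script.
function isKnownUsZip(string $zip, array $knownZips): bool
{
    // Step 1: format check (five digits, optional ZIP+4 suffix).
    if (!preg_match('/^(\d{5})(?:-?\d{4})?$/', $zip, $m)) {
        return false;
    }
    // Step 2: existence check on the five-digit part only.
    // The +4 suffix is NOT verified here.
    return isset($knownZips[$m[1]]);
}

var_dump(isKnownUsZip('90210', $knownZips)); // bool(true): in the sample list
var_dump(isKnownUsZip('99999', $knownZips)); // bool(false): well-formed, but not in the list
```

Note that this still accepts any +4 suffix on a known five-digit code, so it narrows the problem rather than solving it.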
And since fewer than 42K of the 100K 5-digit ZIP codes are actually in use, and each one has
10,000 possible +4 suffixes, *at most* 420 million of the one billion possible ZIP+4 codes can
be valid. That makes *at least* 580 million *more* invalid ZIP+4 codes that this algorithm will
incorrectly declare to be valid.
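For anyone who wants to check the arithmetic, here it is spelled out. The variable names are my own, and the 42,000 figure is the rounded upper bound used above, not an exact count.

```php
<?php
$fiveDigitTotal = 10 ** 5;  // 100,000 possible 5-digit codes
$suffixesPerZip = 10 ** 4;  // 10,000 possible +4 suffixes per 5-digit code
$inUse          = 42000;    // rounded upper bound on 5-digit codes in use

$zipPlus4Total = $fiveDigitTotal * $suffixesPerZip; // all possible ZIP+4 codes
$maxValid      = $inUse * $suffixesPerZip;          // upper bound on valid ZIP+4 codes
$minInvalid    = $zipPlus4Total - $maxValid;        // lower bound on invalid ZIP+4 codes

echo number_format($zipPlus4Total), "\n"; // 1,000,000,000
echo number_format($maxValid), "\n";      // 420,000,000
echo number_format($minInvalid), "\n";    // 580,000,000
```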
It's pretty likely that similar problems exist for the other 11 nations as well.