Reno, your email validation regex is still invalid. Email addresses can contain the "+" in the localpart.
e.g. david+something@domain.com
preg_match
(PHP 4, PHP 5)
preg_match — Perform a regular expression match
Description
Searches subject for a match to the regular expression given in pattern.
If the third parameter, matches, is provided, it is filled with the results of the search. $matches[0] will contain the text that matched the full pattern, $matches[1] will contain the text that matched the first parenthesized subpattern, $matches[2] the second, and so on.
The flags parameter can take the following values:
- PREG_OFFSET_CAPTURE
- If this flag is set, the string offset is returned for every match. Note that this changes the type of the values in the returned array: every element becomes an array consisting of the matched string at index 0 and its offset in the subject at index 1. This constant is available since PHP 4.3.0.
Normally the search starts from the beginning of the subject string. The optional offset parameter specifies where to start the search instead. This is similar, but not identical, to passing substr($subject, $offset) to preg_match() in place of subject, because the pattern can contain assertions such as ^, $ or (?<=x) that still see the whole original string. The offset parameter is available since PHP 4.3.3.
preg_match() returns the number of times pattern matches. That is either 0 (no match) or 1, because preg_match() stops searching after the first match. preg_match_all(), by contrast, continues until it reaches the end of subject. preg_match() returns FALSE if an error occurred.
Do not use preg_match() if you only want to check whether one string is contained in another. Use strpos() or strstr() instead, as they are faster.
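A minimal sketch of how the flags and offset parameters behave, using made-up subject strings (all variable names here are illustrative only). It also shows why an offset is not quite the same as cutting the string with substr() when the pattern is anchored with ^.

```php
<?php
// PREG_OFFSET_CAPTURE: every entry becomes array(matched text, byte offset).
preg_match('/(foo)(bar)/', 'xxfoobar', $withOffsets, PREG_OFFSET_CAPTURE);
// $withOffsets[0] is array('foobar', 2), $withOffsets[2] is array('bar', 5)

// offset: start searching at byte 3; reported offsets still refer to the full string.
preg_match('/def/', 'abcdef', $fromOffset, PREG_OFFSET_CAPTURE, 3);
// $fromOffset[0] is array('def', 3)

// ^ still means "start of the whole subject", so this call finds nothing ...
$anchored = preg_match('/^def/', 'abcdef', $unused1, 0, 3);
// ... whereas the substr() variant matches.
$cut = preg_match('/^def/', substr('abcdef', 3), $unused2);
```

Here $anchored ends up as 0 and $cut as 1, which is the practical difference between the two approaches.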
Example #1 Find the string of text "php"
<?php
// The "i" after the pattern delimiter indicates a case-insensitive search
if (preg_match("/php/i", "PHP is the language of choice.")) {
    echo "A match was found.";
} else {
    echo "A match was not found.";
}
?>
Example #2 Find the word "web"
<?php
// The \b in the pattern indicates a word boundary, so only the distinct word
// "web" is matched, and not part of a longer word such as "webbing" or "cobweb"
if (preg_match("/\bweb\b/i", "PHP is a web programming language chosen by many.")) {
    echo "A match was found.";
} else {
    echo "A match was not found.";
}
if (preg_match("/\bweb\b/i", "PHP is a programming language installed on many websites")) {
    echo "A match was found.";
} else {
    echo "A match was not found.";
}
?>
Example #3 Getting the domain name out of a URL
<?php
// get the host name from a URL
preg_match("/^(http:\/\/)?([^\/]+)/i",
    "http://www.php.net/index.html", $matches);
$NomeHost = $matches[2];
// get the last two segments of the host name
preg_match("/[^\.\/]+\.[^\.\/]+$/", $NomeHost, $matches);
echo "Domain name: {$matches[0]}\n";
?>
The above example will output:
Domain name: php.net
See also preg_match_all(), preg_replace() and preg_split().
18-May-2009 03:06
08-May-2009 10:07
To support large Unicode ranges (e.g. [\x{E000}-\x{FFFD}] or \x{10FFFF}) you must add the 'u' modifier at the end of your expression.
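As a rough illustration of the 'u' modifier, the sketch below matches a short Cyrillic string given as raw UTF-8 bytes. The subject and the variable names are assumptions chosen for the example.

```php
<?php
// UTF-8 bytes for three Cyrillic letters (U+041F, U+0440, U+0438).
$subject = "\xD0\x9F\xD1\x80\xD0\xB8";

// With /u the pattern and the subject are treated as UTF-8,
// so \x{...} code points above 0xFF can be used in a character class.
$isCyrillic = preg_match('/^[\x{0400}-\x{04FF}]+$/u', $subject);

// With /u a dot matches one character; without it, one byte.
$threeChars = preg_match('/^.{3}$/u', $subject);
$sixBytes   = preg_match('/^.{6}$/', $subject);
```

All three calls return 1 for this subject: it is three characters long but occupies six bytes.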
03-May-2009 03:09
Removing HTML tags with a regular expression, with optional exceptions
<?php
function removeHtmlTagsWithExceptions($html, $exceptions = null){
if(is_array($exceptions) && !empty($exceptions))
{
foreach($exceptions as $exception)
{
$openTagPattern = '/<(' . $exception . ')(\s.*?)?>/msi';
$closeTagPattern = '/<\/(' . $exception . ')>/msi';
$html = preg_replace(
array($openTagPattern, $closeTagPattern),
array('||l|\1\2|r||', '||l|/\1|r||'),
$html
);
}
}
$html = preg_replace('/<.*?>/msi', '', $html);
if(is_array($exceptions))
{
$html = str_replace('||l|', '<', $html);
$html = str_replace('|r||', '>', $html);
}
return $html;
}
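// example (the heredoc is assigned first, because older PHP versions
// require the closing EOF marker to stand alone on its line):
$html = <<<EOF
<h1>Whatsup?!</h1>
Enjoy <span style="text-color:blue;">that</span> script<br />
<br />
EOF;
print removeHtmlTagsWithExceptions($html, array('br'));
?>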
28-Apr-2009 04:53
With regard to the bug report for preg_match which leads to segfault errors in some cases, the solution is pretty simple: just split the string into smaller ones. E.g. on my XAMPP test server a length of 5000 is OK, but with 10000 characters it fails.
See this example, which solves an encoding problem in addition to the preg_match bug: http://mobile-website.mobi/php-utf8-vs-iso-8859-1-59
25-Apr-2009 05:52
I see a lot of people trying to put together phone regex's and struggling (hey, no worries...they're complicated). Here's one that we use that's pretty nifty. It's not perfect, but it should work for most non-idealists.
*** Note: Only matches U.S. phone numbers. ***
<?php
// all on one line...
$regex = '/^(?:1(?:[. -])?)?(?:\((?=\d{3}\)))?([2-9]\d{2})(?:(?<=\(\d{3})\))? ?(?:(?<=\d{3})[.-])?([2-9]\d{2})[. -]?(\d{4})(?: (?i:ext)\.? ?(\d{1,5}))?$/';
// or broken up
$regex = '/^(?:1(?:[. -])?)?(?:\((?=\d{3}\)))?([2-9]\d{2})'
.'(?:(?<=\(\d{3})\))? ?(?:(?<=\d{3})[.-])?([2-9]\d{2})'
.'[. -]?(\d{4})(?: (?i:ext)\.? ?(\d{1,5}))?$/';
?>
If you're wondering why there are so many non-capturing subpatterns (the ones that start with "(?:"), it's so that only the four parts we care about are captured and we can do this:
<?php
$formatted = preg_replace($regex, '($1) $2-$3 ext. $4', $phoneNumber);
// or, provided you use the $matches argument in preg_match
$formatted = "($matches[1]) $matches[2]-$matches[3]";
// $matches[4] is only set when an extension was matched
if (!empty($matches[4])) $formatted .= " $matches[4]";
?>
*** Results: ***
520-555-5542 :: MATCH
520.555.5542 :: MATCH
5205555542 :: MATCH
520 555 5542 :: MATCH
520) 555-5542 :: FAIL
(520 555-5542 :: FAIL
(520)555-5542 :: MATCH
(520) 555-5542 :: MATCH
(520) 555 5542 :: MATCH
520-555.5542 :: MATCH
520 555-0555 :: MATCH
(520)5555542 :: MATCH
520.555-4523 :: MATCH
19991114444 :: FAIL
19995554444 :: MATCH
514 555 1231 :: MATCH
1 555 555 5555 :: MATCH
1.555.555.5555 :: MATCH
1-555-555-5555 :: MATCH
520-555-5542 ext.123 :: MATCH
520.555.5542 EXT 123 :: MATCH
5205555542 Ext. 7712 :: MATCH
520 555 5542 ext 5 :: MATCH
520) 555-5542 :: FAIL
(520 555-5542 :: FAIL
(520)555-5542 ext .4 :: FAIL
(512) 555-1234 ext. 123 :: MATCH
1(555)555-5555 :: MATCH
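To make the pieces above easier to try out, here is a small sketch that wraps the pattern from this note in a function. The function name format_us_phone() and the choice to return null for non-matching input are assumptions made for illustration, not part of the original note.

```php
<?php
// Hypothetical helper built around the pattern from this note.
function format_us_phone($phoneNumber)
{
    $regex = '/^(?:1(?:[. -])?)?(?:\((?=\d{3}\)))?([2-9]\d{2})'
           . '(?:(?<=\(\d{3})\))? ?(?:(?<=\d{3})[.-])?([2-9]\d{2})'
           . '[. -]?(\d{4})(?: (?i:ext)\.? ?(\d{1,5}))?$/';

    if (preg_match($regex, $phoneNumber, $matches) !== 1) {
        return null; // not recognised as a U.S. phone number
    }

    // Groups 1-3 are always present on a match; group 4 only with an extension.
    $formatted = "($matches[1]) $matches[2]-$matches[3]";
    if (isset($matches[4]) && $matches[4] !== '') {
        $formatted .= " ext. $matches[4]";
    }
    return $formatted;
}

echo format_us_phone('520.555.5542 EXT 123'), "\n";
```

Inputs from the FAIL rows of the table, such as "520) 555-5542", come back as null.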
07-Mar-2009 12:18
I just learned about named groups from a Python friend today and was curious if PHP supported them, guess what -- it does!!!
http://www.regular-expressions.info/named.html
<?php
preg_match("/(?P<foo>abc)(.*)(?P<bar>xyz)/",
'abcdefghijklmnopqrstuvwxyz',
$matches);
print_r($matches);
?>
will produce:
Array
(
[0] => abcdefghijklmnopqrstuvwxyz
[foo] => abc
[1] => abc
[2] => defghijklmnopqrstuvw
[bar] => xyz
[3] => xyz
)
Note that you actually get the named group as well as the numerical key
value too, so if you do use them, and you're counting array elements, be
aware that your array might be bigger than you initially expect it to be.
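If the duplicate numeric keys get in the way, one possible approach is to copy only the string keys into a second array. This is just a sketch; the pattern, the subject and the variable names are made up for the example.

```php
<?php
preg_match('/(?P<year>\d{4})-(?P<month>\d{2})/', 'released 2009-03', $matches);

// $matches holds five entries here: 0, 'year', 1, 'month' and 2.
$total = count($matches);

// Keep only the named groups.
$named = array();
foreach ($matches as $key => $value) {
    if (is_string($key)) {
        $named[$key] = $value;
    }
}
// $named is now array('year' => '2009', 'month' => '03')
```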
28-Feb-2009 12:16
I recently encountered a problem trying to capture multiple instances of named subpatterns from filenames.
Therefore, I came up with this function.
The function allows you to pass through flags (in this version it applies to all expressions tested), and generates an array of search results.
Enjoy!
<?php
/**
* Allows multiple expressions to be tested on one string.
* This will return a boolean, however you may want to alter this.
*
* @author William Jaspers, IV <wjaspers4@gmail.com>
* @created 2009-02-27 17:00:00 +6:00:00 GMT
* @access public
*
* @param array $patterns An array of expressions to be tested.
* @param String $subject The data to test.
* @param array $findings Optional argument to store our results.
* @param mixed $flags Pass-thru argument to allow normal flags to apply to all tested expressions.
* @param array $errors A storage bin for errors
*
* @returns bool TRUE if no errors occurred, FALSE otherwise.
*/
function preg_match_multiple(
array $patterns=array(),
$subject=null,
&$findings=array(),
$flags=false,
&$errors=array()
) {
foreach( $patterns as $name => $pattern )
{
if( 1 <= preg_match_all( $pattern, $subject, $found, $flags ) )
{
$findings[$name] = $found;
} else
{
if( PREG_NO_ERROR !== ( $code = preg_last_error() ))
{
$errors[$name] = $code;
} else $findings[$name] = array();
}
}
return (0===sizeof($errors));
}
?>
19-Feb-2009 03:41
Here is a small tool for anyone learning to use regular expressions. It's very basic, and lets you try different patterns and combinations. I made it to help myself, because I like to try different things to get a good understanding of how they work.
<?php
$search = isset($_POST['search']) ? $_POST['search'] : "//";
$match = isset($_POST['match']) ? $_POST['match'] : "<>";
// escape the submitted values before echoing them back into the form,
// otherwise quotes or angle brackets in a pattern break the HTML
echo '<form method="post">';
echo 's: <input style="width:400px;" name="search" type="text" value="'.htmlspecialchars($search, ENT_QUOTES).'" /><br />';
echo 'm:<input style="width:400px;" name="match" type="text" value="'.htmlspecialchars($match, ENT_QUOTES).'" /><input type="submit" value="go" /></form><br />';
if (preg_match($search, $match)){echo "matches";}else{echo "no match";}
?>
10-Feb-2009 02:42
I have written a short introduction and a colorful cheat sheet for Perl Compatible Regular Expressions (PCRE):
http://www.bitcetera.com/en/techblog/2008/04/01/regex-in-a-nutshell/
30-Jan-2009 12:05
Bugs of preg_match (PHP-version 5.2.5)
In most cases, the following example will show one of two PHP-bugs discovered with preg_match depending on your PHP-version and configuration.
<?php
$text = "test=";
// creates a rather long text
for ($i = 0; $i++ < 100000;)
$text .= "%AB";
// a typical URL_query validity-checker (the pattern's function does not matter for this example)
$pattern = '/^(?:[;\/?:@&=+$,]|(?:[^\W_]|[-_.!~*\()\[\] ])|(?:%[\da-fA-F]{2}))*$/';
var_dump( preg_match( $pattern, $text ) );
?>
Possible bug (1):
=============
On one of our Linux-Servers the above example crashes PHP-execution with a C(?) Segmentation Fault(!). This seems to be a known bug (see http://bugs.php.net/bug.php?id=40909), but I don't know if it has been fixed, yet.
If you are looking for a work-around, the following code-snippet is what I found helpful. It wraps the possibly crashing preg_match call by decreasing the PCRE recursion limit in order to result in a Reg-Exp error instead of a PHP-crash.
<?php
[...]
// decrease the PCRE recursion limit for the (possibly dangerous) preg_match call
$former_recursion_limit = ini_set( "pcre.recursion_limit", 10000 );
// the wrapped preg_match call
$result = preg_match( $pattern, $text );
// reset the PCRE recursion limit to its original value
ini_set( "pcre.recursion_limit", $former_recursion_limit );
// if the reg-exp fails due to the decreased recursion limit we may not make any statement, but PHP-execution continues
if ( PREG_RECURSION_LIMIT_ERROR === preg_last_error() )
{
// react on the failed regular expression here
$result = [...];
// do logging or email-sending here
[...]
} //if
?>
Possible bug (2):
=============
On one of our Windows servers the above example does not crash PHP, but hits the recursion limit directly. Here, the problem is that preg_match does not return boolean(false), as the description in the manual above would lead you to expect.
In short, preg_match seems to return int(0) instead of the expected boolean(false) if the regular expression could not be executed because of the PCRE recursion limit. So, if preg_match returns int(0), you apparently have to check preg_last_error() to find out whether an error occurred.
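One way to act on this is to never trust the return value alone and always consult preg_last_error() as well. The helper below is a hypothetical sketch (the name preg_match_checked() and its string results are assumptions); the error case is demonstrated with an invalid UTF-8 subject, because that failure is easy to reproduce, not with the recursion limit itself.

```php
<?php
// Hypothetical helper: reports 'matched', 'no match' or 'error'.
function preg_match_checked($pattern, $subject)
{
    $result = preg_match($pattern, $subject);

    // Treat both a FALSE return value and a non-zero PCRE error code as an error,
    // so that an int(0) caused by a hit limit is not mistaken for "no match".
    if ($result === false || preg_last_error() !== PREG_NO_ERROR) {
        return 'error';
    }
    return $result === 1 ? 'matched' : 'no match';
}

echo preg_match_checked('/php/i', 'PHP rocks'), "\n";  // matched
echo preg_match_checked('/php/i', 'Perl rocks'), "\n"; // no match
echo preg_match_checked('/./u', "\xff"), "\n";         // error (invalid UTF-8 subject)
```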
06-Jan-2009 01:52
I modified your email validation pattern to solve these issues:
- the string MUST contain a TLD
- the TLD can be 2 letters long as well as 3 or more (e.g. .ca, .us, .uk, .fr, etc.)
- the domain name (TLD not included) must contain at least 2 characters
- the domain name can contain "-" if it is neither the first nor the last character.
<?php
$pattern = '/^([a-z0-9])(([-a-z0-9._])*([a-z0-9]))*\@([a-z0-9])' .
'(([a-z0-9-])*([a-z0-9]))+' . '(\.([a-z0-9])([-a-z0-9_-])?([a-z0-9])+)+$/i';
echo preg_match ($pattern, "email-address-to-validate@host.tld");
?>
25-Dec-2008 11:58
The patterns above have been tested, but they fail for the kinds of
addresses shown below. This pattern handles them correctly.
<?php
/**
 * Corrected pattern for email validation.
 * It is built by concatenation so that no line break ends up inside the pattern.
 */
$pattern = '/^([a-z0-9])(([-a-z0-9._])*([a-z0-9]))*'
         . '\@([a-z0-9])*(\.([a-z0-9])([-a-z0-9_-])([a-z0-9])+)*$/i';
// Valid email
echo preg_match($pattern, '09_az..AZ@host.dOMain.cOM'), "\n";
// Invalid emails
echo preg_match($pattern, '09_azAZ@ho...st...........domain.com'), "\n";
echo preg_match($pattern, '09_azAZ@host.do@main.com'), "\n";
?>
----------------------------
Output:
----------------------------
1 = valid
0 = invalid
0 = invalid
11-Dec-2008 03:15
If you need to check whether a string is a serialized representation of a variable, you can use this:
<?php
$string = "a:0:{}";
// the pattern is concatenated so that the line break does not become part of it
if(preg_match("/(a|O|s|b)\x3a[0-9]*?"
    . "((\x3a((\x7b?(.+)\x7d)|(\x22(.+)\x22\x3b)))|(\x3b))/", $string))
{
echo "Serialized.";
}
else
{
echo "Not serialized.";
}
?>
But don't forget that a serialized string can be VERY big,
so matching can be slow, even with the fast preg_* functions.
01-Dec-2008 08:36
@Ben:
Your pattern will match 1.1.255.299 (it matches the .29 at the end out of subpattern .299)
This pattern eliminates such false positives:
/^((1?\d{1,2}|2[0-4]\d|25[0-5])\.){3}(1?\d{1,2}|2[0-4]\d|25[0-5]){1}$/
Ronen
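A quick way to check such a pattern against the problem cases mentioned in these notes is a small test function. This is only a sketch: the function name is_ipv4_by_regex() is made up, and the pattern is the anchored one from this note with the redundant {1} dropped.

```php
<?php
// Hypothetical helper around the anchored pattern from this note.
function is_ipv4_by_regex($ip)
{
    // One octet: 0-199 (optionally with a leading 1), 200-249 or 250-255.
    $octet   = '(1?\d{1,2}|2[0-4]\d|25[0-5])';
    $pattern = '/^(' . $octet . '\.){3}' . $octet . '$/';
    return preg_match($pattern, $ip) === 1;
}

$candidates = array('192.168.0.1', '255.255.255.255', '1.1.255.299', '259.259.259.259', '10.0.0');
foreach ($candidates as $candidate) {
    echo $candidate, ' :: ', is_ipv4_by_regex($candidate) ? 'MATCH' : 'FAIL', "\n";
}
```

The first two candidates match and the last three fail, because the ^ and $ anchors force every octet to be checked in full.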
21-Nov-2008 06:35
When I was using the above example's syntax for named capturing groups, it worked fine on my development server (PHP 5.2.6), but then gave me a regex error on the live server (PHP 5.0.4).
By adding a 'P' in front of the parameter name, it seems to have resolved the issue (this is in accordance w/ the PCRE implementation).
To use the above example, here's the original:
<?php
preg_match('/(?<name>\w+): (?<digit>\d+)/', $str, $matches);
?>
And here's the fix:
<?php
preg_match('/(?P<name>\w+): (?P<digit>\d+)/', $str, $matches);
?>
25-Oct-2008 08:47
Marc your pattern will match 259.259.259.259
I think you're actually after something like this:
/((1?\d{1,2}|2[0-4]\d|25[0-5])\.){3}(1?\d{1,2}|2[0-4]\d|25[0-5])/
23-Oct-2008 02:01
If you need to check for .com.br, .com.au, .uk and all the other crazy domain endings, I found that the following expression works well for validating an email address. It's quite generous in what it allows.
<?php
$email_address = "phil.taylor@a_domain.tv";
if (preg_match("/^[^@]*@[^@]*\.[^@]*$/", $email_address)) {
    echo "E-mail address";
}
?>
16-Oct-2008 04:21
@ Marc
A little more work to do--your expression matched ...256... through ...259..., and will not match 1- or 2-digit numbers that do not start with 1. It could also be a little more concise, as in:
/^(1?\d{1,2}|2([0-4]\d|5[0-5]))(\.(1?\d{1,2}|2([0-4]\d|5[0-5]))){3}$/
Also, I put together a primitive regex tester at http://j-r.camenisch.net/regex/ -- to help someone find more flaws to correct. ;-)
06-Oct-2008 10:16
@ Steve Todorov:
Your regex will not only match 999.999... but also 9999.9999... etc.
I'd rather take this regex:
/^(1\d{0,2}|2(\d|[0-5]\d)?)\.(1\d{0,2}|2(\d|[0-5]\d)?)\.(1\d{0,2}|2(\d|[0-5]\d)?)\.(1\d{0,2}|2(\d|[0-5]\d)?)$/
This should represent any IP (v4). At least it did in a small test here ;)
03-Oct-2008 03:23
While reading the preg_match documentation I didn't find how to match an IP address.
Let's say you need a script that works with an IP or a host name, and you want to show the host name rather than the IP.
This is the way to go:
<?php
/* This is an ip that is "GET"/"POST" from somewhere */
$ip = $_POST['ipOrHost'];
if(preg_match('/^(\d+)\.(\d+)\.(\d+)\.(\d+)$/',$ip)) // dots escaped and pattern anchored
$host = gethostbyaddr($ip);
else
$host = gethostbyname($ip);
echo $host;
?>
This is a really simple script made for beginners!
If you like, you can add restrictions on the numbers.
The code above accepts any number in each position (999.999.999.999, for example), although a valid IP address cannot exceed 255.255.255.255.
Wish you luck!
Best wishes,
Steve
12-Sep-2008 05:18
If you need to match specific wildcards in IP address, you can use this regexp:
<?php
$ip = '10.1.66.22';
$cmp = '10.1.??.*';
$cnt = preg_match('/^'
.str_replace(
array('\*','\?'),
array('(.*?)','[0-9]'),
preg_quote($cmp)).'$/',
$ip);
echo $cnt;
?>
where '?' is exactly one digit and '*' is any number of any characters. The $cmp mask can be supplied by the user; $cnt is (int) 1 on a match and 0 otherwise.
28-Aug-2008 04:55
I found this rather useful for testing multiple strings when developing a regex pattern.
<?php
/**
* Runs preg_match on an array of strings and returns a result set.
* @author wjaspers4[at]gmail[dot]com
* @param String $expr The expression to match against
* @param Array $batch The array of strings to test.
* @return Array
*/
function preg_match_batch( $expr, $batch=array() )
{
// create a placeholder for our results
$returnMe = array();
// for every string in our batch ...
foreach( $batch as $str )
{
// test it, and dump our findings into $found
preg_match($expr, $str, $found);
// append our findings to the placeholder
$returnMe[$str] = $found;
}
return $returnMe;
}
?>
10-Aug-2008 11:12
For validation of email addresses, Cal Henderson's RFC 822 and RFC 2822 is_valid_email() functions rule all:
http://code.iamcal.com/php/rfc822/
09-Jul-2008 01:11
preg_match and preg_replace_callback do not match up in the structure of the array that they fill for a match.
preg_match, as the example shows, supports named patterns, whereas preg_replace_callback doesn't seem to support them at all. It seems to ignore any named pattern matched.
08-Jul-2008 05:01
I made a mistake in my previous post. Mail addresses may of course only be "exotic" in their local parts, not in the domain part. Therefore, an exotic mail address would be "exotic#%$mail@domain.com".
07-Jul-2008 11:51
For those not so familiar with regex's, I post my algorithmic email validation routine. It can more easily be changed for individual needs than regex's. My function does NOT recognize exotic email addresses as allowed by RFC. (For example, info@exotic%&$#mail.com is a legal email address but not allowed by my function.)
-Tim
<?php
function email_is_valid($email) {
if (substr_count($email, '@') != 1)
return false;
if ($email[0] == '@') // square brackets: the {} string offset syntax is deprecated
return false;
if (substr_count($email, '.') < 1)
return false;
if (strpos($email, '..') !== false)
return false;
$length = strlen($email);
for ($i = 0; $i < $length; $i++) {
$c = $email[$i];
if ($c >= 'A' && $c <= 'Z')
continue;
if ($c >= 'a' && $c <= 'z')
continue;
if ($c >= '0' && $c <= '9')
continue;
if ($c == '@' || $c == '.' || $c == '_' || $c == '-')
continue;
return false;
}
$TLD = array (
'COM', 'NET',
'ORG', 'MIL',
'EDU', 'GOV',
'BIZ', 'NAME',
'MOBI', 'INFO',
'AERO', 'JOBS',
'MUSEUM'
);
$tld = strtoupper(substr($email, strrpos($email, '.') + 1));
if (strlen($tld) != 2 && !in_array($tld, $TLD))
return false;
return true;
}
?>
03-Jul-2008 11:30
The regexp in the note below treats the e-mail address
'me@de.com' as invalid, although it is valid. My modified version:
'/^([a-z0-9])(([-a-z0-9._])*([a-z0-9]))*\@
([a-z0-9])([-a-z0-9_])+([a-z0-9])*
(\.([a-z0-9])([-a-z0-9_-])([a-z0-9])+)*$/i'
It seems to work for me in my limited tests.
YMMV.
26-Jun-2008 04:48
Paperweight, this pattern worked fine for me (even for intranet addresses, like "john@localhost", and also for subdomain emails, like "john@foo.bar.com"):
'/([a-z0-9])([-a-z0-9._])+([a-z0-9])\@
([a-z0-9])([-a-z0-9_])+([a-z0-9])
(\.([a-z0-9])([-a-z0-9_-])([a-z0-9])+)*/i'
Still, this is no substitute for an "activation link", which is the better way to check whether an e-mail address is valid.
26-May-2008 09:50
Because making a truly correct email validation function is harder than one may think, consider using this one which comes with PHP through the filter_var function (http://www.php.net/manual/en/function.filter-var.php):
<?php
$email = "someone@domain .local";
if(!filter_var($email, FILTER_VALIDATE_EMAIL)) {
echo "E-mail is not valid";
} else {
echo "E-mail is valid";
}
?>
04-Apr-2008 11:36
In addition to reiner-keller's comment about Umlaute using setlocale (LC_ALL, 'de_DE');
To enable 'de_DE' on my Debian 4 machine I first had to:
- uncomment 'de_DE' in file /etc/locale.gen and afterwards
- run locale-gen from the shell
