[Zend PHP5 Certification] Lectures -- 2. String (& Regular Expressions)
Strings are the most commonly used variable type in PHP, because they are both central to web development and the common method of data transmission from the user.
Within strings many characters take on a special meaning: matching quotes, backslashes, octal numbers, variables. If you wish to accept them as they are, you must escape them with the \ character.
In double-quoted strings, PHP accepts several escape sequences:
– \n linefeed (LF or 0x0A (10) in ASCII)
– \r carriage return (CR or 0x0D (13) in ASCII)
– \t horizontal tab (HT or 0x09 (9) in ASCII)
– \\ backslash
– \$ dollar sign
– \" double-quote
– \[0-7]{1,3} the sequence of characters matching the regular expression is a character in octal notation.
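As a quick check of the list above, here is a minimal sketch comparing double-quoted and single-quoted strings. The variable names are only illustrative.

```php
<?php
// Double-quoted strings interpret escape sequences;
// single-quoted strings keep the backslashes literally (except \\ and \').
$double = "A\tB\n";          // A, tab, B, linefeed
$single = 'A\tB\n';          // A, \, t, B, \, n
$octal  = "\101\102";        // octal 101 = 65 = "A", octal 102 = 66 = "B"
$price  = "Cost: \$total";   // escaped dollar sign: no variable is parsed

echo strlen($double), "\n";  // 4
echo strlen($single), "\n";  // 6
echo $octal, "\n";           // AB
echo $price, "\n";           // Cost: $total
```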
$string = "I am the very model of the modern major general";
echo $string[5]; // t, the t from "the"
echo $string[0]; // I
echo $string{2}; // a, the a from "am"
exit;
The {} syntax is deprecated (the course material dates this to PHP 5.1, with a warning under E_STRICT). It was removed entirely in PHP 8.0, so prefer [].
String variable parsing: This is important. Several exam questions I took involved this.
Read carefully on
http://cn.php.net/manual/en/language.types.string.php#language.types.string.parsing
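To make the parsing rules on that manual page concrete, here is a small sketch of the simple and the complex (curly) syntax. The variables $name, $items and $obj are made up for illustration.

```php
<?php
$name  = "Zend";
$items = array("fruit" => "apple", 3 => "three");
$obj   = new stdClass();
$obj->title = "Guide";

// Simple syntax: a variable, an array element (key written without quotes)
// and an object property are parsed directly inside double quotes.
$simple = "Hi $name, $items[fruit], $items[3], $obj->title";

// Complex syntax: braces delimit the expression, which allows quoted keys
// and lets text follow the variable name immediately.
$complex = "{$name}s like {$items['fruit']}s";

// Single quotes: no parsing at all.
$literal = 'Hi $name';

echo $simple, "\n";   // Hi Zend, apple, three, Guide
echo $complex, "\n";  // Zends like apples
echo $literal, "\n";  // Hi $name
```

Without the braces, "$names" would be read as the (undefined) variable $names rather than $name followed by an "s".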
Formatting
number_format() – By default formats a number with the comma as the thousands separator, and no decimals.
echo number_format("1234567.89");
// 1,234,568
echo number_format("9876543.698", 3, ",", " ");
// Shows 9 876 543,698
money_format()
Only available when the underlying C library function strfmon() is available (not on Windows). Note that money_format() was deprecated in PHP 7.4 and removed in PHP 8.0.
$number = 1234.56;
setlocale(LC_MONETARY, 'en_US');
echo money_format('%i', $number) . "\n";
// USD 1,234.56
setlocale(LC_MONETARY, 'it_IT');
echo money_format('%.2n', $number) . "\n";
// L. 1.234,56
Some other string functions you may need to be familiar with:
int strlen ( string $string )
string strtr ( string $str, string $from, string $to )
string strtr ( string $str, array $replace_pairs )
int strcmp ( string $str1, string $str2 )
int strcasecmp ( string $str1, string $str2 )
Note: with "==", PHP first transparently converts strings to their numeric equivalent where it can, so use "===" in the if logic (for example, strcmp($a, $b) === 0).
int strpos ( string $haystack, mixed $needle [, int $offset] )
int stripos ( string $haystack, string $needle [, int $offset] )
int strrpos ( string $haystack, string $needle [, int $offset] )
string strstr ( string $haystack, string $needle )
string stristr ( string $haystack, string $needle )
int strspn ( string $str1, string $str2 [, int $start [, int $length]] ) whitelist: length of the initial segment made up only of characters in $str2
int strcspn ( string $str1, string $str2 [, int $start [, int $length]] ) blacklist: length of the initial segment containing no characters from $str2
mixed str_replace ( mixed $search, mixed $replace, mixed $subject [, int &$count] )
mixed str_ireplace ( mixed $search, mixed $replace, mixed $subject [, int &$count] )
mixed substr_replace ( mixed $string, string $replacement, int $start [, int $length] )
The result string is returned. If $string is an array, then an array is returned.
string substr ( string $string, int $start [, int $length] )
string number_format ( float $number [, int $decimals [, string $dec_point, string $thousands_sep]] ) not locale-aware
string money_format ( string $format, float $number ) locale-aware
The money_format() function is not available on Windows, nor on some variants of UNIX.
string setlocale ( int $category, string $locale [, string $...] )
string setlocale ( int $category, array $locale )
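A few of the functions listed above are easier to remember with a concrete call. The following is a minimal sketch with made-up sample strings. It also shows why "===" matters when a function can legitimately return 0.

```php
<?php
$haystack = "abcabc";

// strpos() returns 0 for a match at the very start; "==" treats 0 as false.
$pos        = strpos($haystack, "abc");   // int 0
$looseMiss  = ($pos == false);            // true, which is misleading
$strictMiss = ($pos === false);           // false, which is correct
$last       = strrpos($haystack, "abc");  // 3

// strcmp()/strcasecmp() return 0 when the strings are equal.
$equal  = (strcmp("apple", "apple") === 0);      // true
$equalI = (strcasecmp("Apple", "aPPLE") === 0);  // true

// strspn(): length of the leading run made only of the listed characters.
$digits = strspn("2024-01-15", "0123456789");    // 4
// strcspn(): length of the leading run containing none of the listed characters.
$beforeDash = strcspn("2024-01-15", "-");        // 4

// strtr() with an array of replacement pairs.
$swapped = strtr("Hi all", array("Hi" => "Hello", "all" => "world")); // "Hello world"

// strstr() returns the rest of the string starting at the first match.
$rest = strstr("user@example.com", "@");         // "@example.com"

// substr_replace(): replace 5 characters starting at offset 0.
$replaced = substr_replace("Hello world", "HOWDY", 0, 5); // "HOWDY world"
```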
Printf family: printf(), sprintf(), vprintf(), sscanf(), fscanf()
------------------------------------------------------------------------------------------------------
int printf ( string $format [, mixed $args [, mixed $...]] )
Commonly used type specifiers:
% - a literal percent character. No argument is required.
b - the argument is treated as an integer, and presented as a binary number.
c - the argument is treated as an integer, and presented as the character with that ASCII value.
d - the argument is treated as an integer, and presented as a (signed) decimal number.
e - the argument is treated as scientific notation (e.g. 1.2e+2).
u - the argument is treated as an integer, and presented as an unsigned decimal number.
f - the argument is treated as a float, and presented as a floating-point number (locale aware).
F - the argument is treated as a float, and presented as a floating-point number (non-locale aware). Available since PHP 4.3.10 and PHP 5.0.3.
o - the argument is treated as an integer, and presented as an octal number.
s - the argument is treated as and presented as a string.
x - the argument is treated as an integer and presented as a hexadecimal number (with lowercase letters).
X - the argument is treated as an integer and presented as a hexadecimal number (with uppercase letters).
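The specifiers above can be tried out with sprintf(), which returns the formatted string instead of printing it. This is only a sketch with arbitrary sample values. The sscanf() call at the end shows the reverse direction, parsing by format.

```php
<?php
$bin   = sprintf("%b", 5);          // "101"
$char  = sprintf("%c", 65);         // "A"
$dec   = sprintf("%d", -42);        // "-42"
$float = sprintf("%.2F", 3.14159);  // "3.14" regardless of locale
$oct   = sprintf("%o", 8);          // "10"
$str   = sprintf("[%s]", "php");    // "[php]"
$hexLo = sprintf("%x", 255);        // "ff"
$hexUp = sprintf("%X", 255);        // "FF"
$pct   = sprintf("%d%%", 50);       // "50%"

// sscanf() reads values back out of a string using the same kind of format.
list($year, $month, $day) = sscanf("2009-03-14", "%d-%d-%d");
// $year = 2009, $month = 3, $day = 14 (all integers)
```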
Regular Expressions: POSIX Extended and PCRE
PCRE
Perl Compatible Regular Expressions, PHP’s most popular RegEx engine.
Technically you can turn it off, but no one does.
Exceedingly powerful, but not always the fastest choice. If you can use a built-in string function instead, do so.
Question:
If regular expressions must be used, in general which type of regular expression functions available to PHP is preferred for performance reasons?
To answer the above question, someone left me a comment on my blog:
I think the answer is: preg_*
I found it in the PHP manual:
Note: preg_match(), which uses a Perl-compatible regular expression syntax, is often a faster alternative to ereg().
Perl-compatible Regular Expressions
Delimiters: By convention, the forward slash is used
Metacharacters
\d Digits 0-9
\D Anything not a digit
\w Any alphanumeric character or an underscore (_)
\W Anything not an alphanumeric character or an underscore
\s Any whitespace (spaces, tabs, newlines)
\S Any non-whitespace character
. Any character except for a newline
Quantifiers
? Occurs 0 or 1 time
* Occurs 0 or more times
+ Occurs 1 or more times
{n} Occurs exactly n times
{0,n} Occurs at most n times (the PCRE versions of the PHP 5 era treat {,n} as literal text, not as a quantifier)
{m,} Occurs m or more times
{m,n} Occurs between m and n times
Sub-expressions: grouped with parentheses, e.g. (ab)+. Compare this to the "range" (character class), e.g. [1-9]
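The metacharacters, quantifiers and sub-expressions above can be combined as in the following sketch. The patterns and sample strings are illustrative only. preg_match() returns 1 on a match and 0 otherwise.

```php
<?php
$isZip    = preg_match('/^\d{5}$/', "90210");    // 1: exactly five digits
$notZip   = preg_match('/^\d{5}$/', "9021a");    // 0: "a" is not a digit
$optional = preg_match('/^colou?r$/', "color");  // 1: "u" occurs 0 or 1 time
$tooLong  = preg_match('/^\w{2,4}$/', "abcde");  // 0: five characters, max is 4
$atMost   = preg_match('/^a{0,2}$/', "aa");      // 1: at most two "a"
$words    = preg_match('/^\w+\s+\w+$/', "two words"); // 1

// Sub-expression: the group (ab) is repeated as a unit.
$group = preg_match('/^(ab)+$/', "ababab");      // 1
// Character class ("range"): any single character from 1 to 9.
$class = preg_match('/^[1-9]+$/', "1230");       // 0: "0" is outside [1-9]
```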
PCRE – Pattern Modifiers
– i – Case insensitive search
– m – Multiline, $ and ^ will match at newlines
– s – Makes the dot metacharacter match newlines
– x – Allows for commenting (whitespace in the pattern is ignored and # starts a comment)
– U – Makes the engine un-greedy
– u – Turns on UTF-8 support
– e – Used with preg_replace(), evaluates the replacement string as PHP code (deprecated in PHP 5.5, removed in PHP 7.0)
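A minimal sketch of how the modifiers change matching, using made-up subjects. The e modifier is left out because it no longer exists in current PHP.

```php
<?php
$text = "first\nsecond";

$i   = preg_match('/php/i', "I like PHP");    // 1: case-insensitive
$noM = preg_match('/^second$/', $text);       // 0: ^ only matches at the start
$m   = preg_match('/^second$/m', $text);      // 1: ^ and $ match at line breaks
$noS = preg_match('/first.second/', $text);   // 0: dot does not match "\n"
$s   = preg_match('/first.second/s', $text);  // 1: dot matches "\n"

// x: whitespace in the pattern is ignored and # starts a comment.
$x = preg_match('/ \d+    # first number
                   -      # separator
                   \d+    # second number
                 /x', "10-20");               // 1

// U: quantifiers become ungreedy by default.
preg_match('/<.+>/', "<b>bold</b>", $greedy);  // $greedy[0] is "<b>bold</b>"
preg_match('/<.+>/U', "<b>bold</b>", $lazy);   // $lazy[0] is "<b>"

// u: pattern and subject are treated as UTF-8 ("\xC3\xA9" is one character, two bytes).
$noU = preg_match('/^.$/', "\xC3\xA9");       // 0: two bytes
$u   = preg_match('/^.$/u', "\xC3\xA9");      // 1: one character
```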
Functions:
int preg_match ( string $pattern, string $subject [, array &$matches [, int $flags [, int $offset]]] )
If $matches is provided, then it is filled with the results of the search. $matches[0] will contain the text that matched the full pattern,
$matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.
int preg_match_all ( string $pattern, string $subject, array &$matches [, int $flags [, int $offset]] )
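The difference between the two matching functions shows up in how $matches is filled. The following sketch uses an illustrative date pattern. preg_match() stops at the first match, while preg_match_all() collects every match (grouped by subpattern by default).

```php
<?php
$subject = "2009-03-14 and 2010-12-25";
$pattern = '/(\d{4})-(\d{2})-(\d{2})/';

$found = preg_match($pattern, $subject, $matches);
// $found is 1
// $matches is array("2009-03-14", "2009", "03", "14")

$count = preg_match_all($pattern, $subject, $all);
// $count is 2
// $all[0] is array("2009-03-14", "2010-12-25")   full matches
// $all[1] is array("2009", "2010")               first subpattern

print_r($matches);
print_r($all[0]);
```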
mixed preg_replace ( mixed $pattern, mixed $replacement, mixed $subject [, int $limit [, int &$count]] )
preg_replace() lets you reuse captured subpatterns directly in the substitution string by prefixing their index with a dollar sign (e.g. $1).
Just like with str_replace(), we can pass arrays of search and replacement arguments, and we can also pass in an array of subjects on which to perform the search-and-replace operation. This can speed things up considerably, since the regular expression (or expressions) are compiled once and reused multiple times.
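A short sketch of both points, with made-up patterns and subjects: reusing captured subpatterns in the replacement, and passing arrays for patterns, replacements and subjects.

```php
<?php
// $1, $2 and $3 refer to the first, second and third captured subpattern.
$us = preg_replace('/(\d{4})-(\d{2})-(\d{2})/', '$2/$3/$1', "2009-03-14");
// $us is "03/14/2009"

// Arrays: each pattern is paired with the replacement at the same index,
// and every subject in the array is processed in turn.
$patterns     = array('/\s+/', '/colour/i');
$replacements = array(' ', 'color');
$subjects     = array("my   colour", "Colour  me");
$result = preg_replace($patterns, $replacements, $subjects);
// $result is array("my color", "color me")

// $limit caps the number of replacements; $n receives how many were made.
$limited = preg_replace('/a/', 'b', "aaaa", 2, $n);
// $limited is "bbaa", $n is 2
```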