[Zend PHP5 Certification] Lectures -- 2. String (& Regular Expressions)

Strings are the most commonly used variable type in PHP, because they are both central to web development and the common method of data transmission from the user.

 

Within strings, many characters take a special meaning: matching quotes, backslashes, octal numbers, and variables. If you wish to accept them as they are, you must escape them with the \ character.

PHP accepts several escape sequences (in double-quoted strings):

\n  linefeed (LF or 0x0A (10) in ASCII)
\r  carriage return (CR or 0x0D (13) in ASCII)
\t  horizontal tab (HT or 0x09 (9) in ASCII)
\\  backslash
\$  dollar sign
\"  double-quote
\[0-7]{1,3}  the sequence of characters matching the regular expression is a character in octal notation.
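A minimal sketch of how these escape sequences behave, assuming they are compared between double-quoted and single-quoted strings. The variable names are made up for illustration.

```php
<?php
// Double-quoted strings interpret escape sequences;
// single-quoted strings only understand \\ and \'.
$double = "Tab:\tEnd";   // \t becomes one real tab character
$single = 'Tab:\tEnd';   // stays as a backslash followed by "t"

$octal   = "\101";       // octal 101 = decimal 65 = "A"
$escaped = "\$name";     // \$ keeps the dollar sign literal

echo strlen($double), "\n"; // 8
echo strlen($single), "\n"; // 9
echo $octal, "\n";          // A
echo $escaped, "\n";        // $name
```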

 

$string = "I am the very model of the modern major general";
echo $string[5]; // t, the t from "the"
echo $string[0]; // I
echo $string{2}; // a, the a from "am" (legacy {} syntax)
exit;

 

The {} syntax is deprecated as of PHP 5.1 and will produce a warning under E_STRICT. It was removed entirely in PHP 8.0, so prefer [].

 

String variable parsing: This is important. Several exam questions I took involved this.

Read carefully:

http://cn.php.net/manual/en/language.types.string.php#language.types.string.parsing
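As a rough illustration of what that manual page covers, here is a small sketch of the simple and the complex (curly) parsing syntax. The variables $fruit and $basket are hypothetical.

```php
<?php
$fruit  = "apple";
$basket = array("kind" => "pear", 0 => "plum");

// Simple syntax: the parser takes the longest valid variable name after $.
$simple = "I ate an $fruit.";         // I ate an apple.
$index  = "First: $basket[0]";        // First: plum
$key    = "Kind: $basket[kind]";      // unquoted key is allowed here: Kind: pear

// Complex (curly) syntax: needed when text follows the name directly,
// or when the array key is quoted.
$plural = "Two {$fruit}s";            // Two apples
$quoted = "Kind: {$basket['kind']}";  // Kind: pear

// Single-quoted strings are never parsed for variables.
$literal = 'No $fruit here';

echo $simple, "\n", $plural, "\n", $literal, "\n";
```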

 

Formatting

number_format() – By default formats a number with the comma as the thousands separator, and no decimals.

echo number_format("1234567.89");
// 1,234,568
echo number_format("9876543.698", 3, ",", " ");
// Shows 9 876 543,698

 

money_format()

Only available when the underlying C library function strfmon() is available (not Windows). Note that money_format() was deprecated in PHP 7.4 and removed in PHP 8.0.

$number = 1234.56;
setlocale(LC_MONETARY, 'en_US');
echo money_format('%i', $number) . "\n";
// USD 1,234.56
setlocale(LC_MONETARY, 'it_IT');
echo money_format('%.2n', $number) . "\n";
// L. 1.234,56

 

 

Some other string functions you may need to be familiar with:

int strlen ( string $string )

string strtr ( string $str, string $from, string $to )
string strtr ( string $str, array $replace_pairs )
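The two strtr() forms behave quite differently, so a small sketch may help. The input strings are invented for illustration.

```php
<?php
// Three-argument form: translates character by character (a -> e, i -> o).
$chars = strtr("Hi all", "ai", "eo");   // "Ho ell"

// Two-argument form: replaces whole substrings from an array of pairs.
$pairs = strtr("Hi all", array("Hi" => "Hello", "all" => "world")); // "Hello world"

echo strlen($pairs), "\n";  // 11
echo $chars, "\n", $pairs, "\n";
```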

 

int strcmp ( string $str1, string $str2 )
int strcasecmp ( string $str1, string $str2 )

Note: with the == operator, PHP first transparently converts numeric strings to their numeric equivalent; use "===" in the if logic.
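A short sketch of the comparison behaviour described in the note, assuming two numeric strings are compared. Only the sign of the strcmp() result is relied on, since the exact value differs between PHP versions.

```php
<?php
// strcmp() returns <0, 0 or >0; only the sign is meaningful.
$less = strcmp("apple", "banana") < 0;        // true: "apple" sorts first
$same = strcasecmp("HELLO", "hello") === 0;   // true: case is ignored

// Two numeric strings are compared as numbers by ==.
$loose  = ("1e3" == "1000");                  // true: both convert to 1000
$strict = ("1e3" === "1000");                 // false: the strings differ
$byCmp  = (strcmp("1e3", "1000") === 0);      // false: byte-wise comparison

var_dump($less, $same, $loose, $strict, $byCmp);
```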

 

int strpos ( string $haystack, mixed $needle [, int $offset] )
int stripos ( string $haystack, string $needle [, int $offset] )
int strrpos ( string $haystack, string $needle [, int $offset] )
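These functions return the position of the match, or false when nothing is found. Since position 0 is a valid result, the check should use ===. A minimal sketch with a made-up haystack:

```php
<?php
$haystack = "hello world, hello PHP";

$first = strpos($haystack, "hello");     // 0: found at the very start
$last  = strrpos($haystack, "hello");    // 13: last occurrence
$ci    = stripos($haystack, "WORLD");    // 6: case-insensitive
$none  = strpos($haystack, "xyz");       // false: not found
$from  = strpos($haystack, "hello", 1);  // 13: search starts at offset 1

// A loose check confuses position 0 with "not found".
$wrong = ($first == false);    // true, which is misleading
$right = ($first === false);   // false, the needle was found

var_dump($first, $last, $ci, $none, $from);
```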

 

string strstr ( string $haystack, string $needle )
string stristr ( string $haystack, string $needle )
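Unlike strpos(), these two return the rest of the haystack starting at the first match. A small sketch with an invented e-mail address:

```php
<?php
$domain  = strstr("user@example.com", "@");          // "@example.com"
$nocase  = stristr("USER@EXAMPLE.COM", "example");   // "EXAMPLE.COM"
$missing = strstr("abc", "x");                       // false: needle not found

var_dump($domain, $nocase, $missing);
```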

 

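int strspn ( string $str1, string $str2 [, int $start [, int $length]] )  whitelist: length of the initial segment containing only characters from $str2
int strcspn ( string $str1, string $str2 [, int $start [, int $length]] )  blacklist: length of the initial segment containing no characters from $str2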
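One way to read the whitelist/blacklist labels is as a quick validation check: compare the returned length with strlen(). A sketch under that assumption:

```php
<?php
$digits = "0123456789";

// Whitelist: how many leading characters are all digits?
$lead  = strspn("42 apples", $digits);                     // 2
$allOk = (strspn("12345", $digits) === strlen("12345"));   // true: only digits

// Blacklist: how many leading characters avoid the listed ones?
$until = strcspn("hello world", " ");                            // 5
$clean = (strcspn("abc<def", "<>") === strlen("abc<def"));       // false: "<" at 3

var_dump($lead, $allOk, $until, $clean);
```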

 

mixed str_replace ( mixed $search, mixed $replace, mixed $subject [, int &$count] )
mixed str_ireplace ( mixed $search, mixed $replace, mixed $subject [, int &$count] )
mixed substr_replace ( mixed $string, string $replacement, int $start [, int $length] )

The result string is returned. If string is an array, then an array is returned.
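A brief sketch of the three replace functions, including the array forms. All input strings are invented for illustration.

```php
<?php
// $count receives the number of replacements performed.
$one = str_replace("cat", "dog", "cat and cat", $count);   // "dog and dog", $count = 2

// Case-insensitive variant.
$ci = str_ireplace("CAT", "dog", "Cat and cat");           // "dog and dog"

// Arrays of search and replace values are applied pairwise.
$multi = str_replace(array("a", "b"), array("1", "2"), "aabbc"); // "1122c"

// substr_replace() works by position: replace 5 characters from offset 6.
$sub = substr_replace("Hello world", "PHP", 6, 5);         // "Hello PHP"

// With an array as the first argument, an array comes back.
$arr = substr_replace(array("aaa", "bbb"), "X", 0, 1);     // array("Xaa", "Xbb")

var_dump($one, $count, $ci, $multi, $sub, $arr);
```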

 

string substr ( string $string, int $start [, int $length] )
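Negative values for $start and $length count from the end of the string, which is easy to forget. A minimal sketch:

```php
<?php
$mid  = substr("abcdef", 1, 3);    // "bcd": 3 characters from offset 1
$rest = substr("abcdef", 2);       // "cdef": everything from offset 2
$tail = substr("abcdef", -2);      // "ef": last two characters
$trim = substr("abcdef", 0, -1);   // "abcde": drop the last character

echo $mid, "\n", $rest, "\n", $tail, "\n", $trim, "\n";
```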

 

string number_format ( float $number [, int $decimals [, string $dec_point, string $thousands_sep]] )  not locale-aware
string money_format ( string $format, float $number )  locale-aware

The money_format() function is not available on Windows or on some variants of UNIX.

string setlocale ( int $category, string $locale [, string $...] )
string setlocale ( int $category, array $locale )

 

Printf family: printf(), sprintf(), vprintf(), sscanf(), fscanf()

------------------------------------------------------------------------------------------------------

int printf ( string $format [, mixed $args [, mixed $...]] )

Commonly used type specifiers:

% - a literal percent character. No argument is required.
b - the argument is treated as an integer, and presented as a binary number.
c - the argument is treated as an integer, and presented as the character with that ASCII value.
d - the argument is treated as an integer, and presented as a (signed) decimal number.
e - the argument is treated as scientific notation (e.g. 1.2e+2).
u - the argument is treated as an integer, and presented as an unsigned decimal number.
f - the argument is treated as a float, and presented as a floating-point number (locale aware).
F - the argument is treated as a float, and presented as a floating-point number (non-locale aware). Available since PHP 4.3.10 and PHP 5.0.3.
o - the argument is treated as an integer, and presented as an octal number.
s - the argument is treated as and presented as a string.
x - the argument is treated as an integer and presented as a hexadecimal number (with lowercase letters).
X - the argument is treated as an integer and presented as a hexadecimal number (with uppercase letters).
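A quick sketch of several of these specifiers, using sprintf() so the results can be inspected, plus sscanf() going the other way. The values are arbitrary.

```php
<?php
$bases = sprintf("%b %o %x %X", 255, 255, 255, 255);  // "11111111 377 ff FF"
$chr   = sprintf("%c", 65);                            // "A"
$pct   = sprintf("%d%%", 50);                          // "50%"
$flt   = sprintf("%.2F", 3.14159);                     // "3.14" regardless of locale
$pad   = sprintf("%05d", 42);                          // "00042": zero-padded to width 5
$str   = sprintf("%s has %d items", "cart", 3);        // "cart has 3 items"

// sscanf() parses input according to a format and returns the values as an array.
$parsed = sscanf("age: 25", "age: %d");                // array(25)

echo $bases, "\n", $chr, "\n", $pct, "\n", $flt, "\n", $pad, "\n", $str, "\n";
```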

 

 

Regular Expressions: POSIX Extended and PCRE

PCRE

Perl Compatible Regular Expressions, PHP’s most popular RegEx engine.

Technically you can turn it off, but no one does.

Exceedingly powerful, but not always the fastest choice; if you can use a built-in function, do so.

 

Question:
If regular expressions must be used, in general which type of regular expression functions available to PHP is preferred for performance reasons?

To answer the above question, someone left me a comment on my blog:

I think the answer is: preg_*

I found it in the PHP manual:

Note: preg_match(), which uses a Perl-compatible regular expression syntax, is often a faster alternative to ereg().

http://uk3.php.net/ereg

 

Perl-Compatible Regular Expressions

Delimiters: By convention, the forward slash is used.

Metacharacters

\d  Digits 0-9
\D  Anything not a digit
\w  Any alphanumeric character or an underscore (_)
\W  Anything not an alphanumeric character or an underscore
\s  Any whitespace (spaces, tabs, newlines)
\S  Any non-whitespace character
.   Any character except for a newline

Quantifiers

?      Occurs 0 or 1 time
*      Occurs 0 or more times
+      Occurs 1 or more times
{n}    Occurs exactly n times
{0,n}  Occurs at most n times (most PCRE versions treat {,n} as literal text, so write the 0)
{m,}   Occurs m or more times
{m,n}  Occurs between m and n times

Sub-expressions: grouped with parentheses; compare this to the "range", e.g. [1-9]
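To make the tables above concrete, here is a small sketch that runs a few patterns through preg_match(), which returns 1 on a match and 0 otherwise. The patterns and subjects are invented for illustration.

```php
<?php
// \d{3}-\d{4}: three digits, a dash, four digits.
$phone = preg_match('/^\d{3}-\d{4}$/', '555-1234');      // 1
$bad   = preg_match('/^\d{3}-\d{4}$/', '55-1234');       // 0: only two leading digits

// \w+ is one or more word characters, \s+ one or more whitespace characters.
$words = preg_match('/^\w+\s+\w+$/', "hello   world");   // 1

// ? makes the preceding "u" optional.
$color = preg_match('/^colou?r$/', 'color');             // 1

// (ab)+ repeats the whole sub-expression; [1-9] is a one-character range.
$group = preg_match('/^(ab)+[1-9]$/', 'ababab7');        // 1

// {0,2} allows at most two repetitions.
$atMost = preg_match('/^a{0,2}$/', 'aaa');               // 0: three is too many

var_dump($phone, $bad, $words, $color, $group, $atMost);
```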

PCRE – Pattern Modifiers

i – Case insensitive search
m – Multiline, $ and ^ will match at newlines
s – Makes the dot metacharacter match newlines
x – Allows for commenting
U – Makes the engine un-greedy
u – Turns on UTF8 support
e – Matched with preg_replace() allows you to call
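A sketch of most of these modifiers in action, with made-up subjects. The e modifier is left out here because it no longer exists in PHP 7 and later.

```php
<?php
// i: case-insensitive.
$i = preg_match('/php/i', 'I like PHP');                  // 1

// m: ^ and $ match at every line, not just the whole subject.
$m = preg_match_all('/^\d+$/m', "10\n20\nabc", $lines);   // 2 ("10" and "20")

// s: the dot also matches a newline.
$withS    = preg_match('/a.b/s', "a\nb");                 // 1
$withoutS = preg_match('/a.b/', "a\nb");                  // 0

// x: whitespace in the pattern is ignored and # starts a comment.
$x = preg_match('/
        \d{4}   # year
        -
        \d{2}   # month
    /x', '2010-06');                                      // 1

// U: quantifiers become un-greedy, so .+ stops at the first ">".
preg_match('/<.+>/U', '<b>bold</b>', $lazy);              // $lazy[0] is "<b>"

// u: the subject is treated as UTF-8, so a two-byte "é" is one character.
$utf8  = preg_match('/^.$/u', "\xC3\xA9");                // 1
$bytes = preg_match('/^.$/', "\xC3\xA9");                 // 0: seen as two bytes

var_dump($i, $m, $withS, $withoutS, $x, $lazy[0], $utf8, $bytes);
```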

 

Functions:

int preg_match ( string $pattern, string $subject [, array &$matches [, int $flags [, int $offset]]] )

If matches is provided, then it is filled with the results of the search. $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.
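A minimal sketch of how $matches gets filled, using an invented date pattern with three parenthesized subpatterns:

```php
<?php
$found = preg_match(
    '/(\d{4})-(\d{2})-(\d{2})/',
    'posted 2010-06-29 17:22',
    $matches
);

// $found is 1 (preg_match stops after the first match).
// $matches[0] is the full match, $matches[1..3] are the subpatterns.
print_r($matches);
// $matches[0] = "2010-06-29", [1] = "2010", [2] = "06", [3] = "29"
```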

 

int preg_match_all ( string $pattern, string $subject, array &$matches [, int $flags [, int $offset]] )
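preg_match_all() collects every match instead of stopping at the first one, and $flags controls how $matches is laid out. A sketch comparing the default PREG_PATTERN_ORDER with PREG_SET_ORDER, on a made-up subject:

```php
<?php
// Default (PREG_PATTERN_ORDER): $byPattern[0] holds all full matches,
// $byPattern[1] all first subpatterns, and so on.
$count = preg_match_all('/(\w)(\d)/', 'a1 b2 c3', $byPattern);

// PREG_SET_ORDER: one array per match instead.
preg_match_all('/(\w)(\d)/', 'a1 b2 c3', $bySet, PREG_SET_ORDER);

print_r($byPattern); // [0] = a1,b2,c3  [1] = a,b,c  [2] = 1,2,3
print_r($bySet[0]);  // a1, a, 1
```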

 

mixed preg_replace ( mixed $pattern, mixed $replacement, mixed $subject [, int $limit [, int &$count]] )

You can reuse captured subpatterns directly in the substitution string by prefixing their index with a dollar sign (e.g. $1).

 

Just like with str_replace(), we can pass arrays of search and replacement arguments, and we can also pass in an array of subjects on which to perform the search-and-replace operation. This can speed things up considerably, since the regular expression (or expressions) are compiled once and reused multiple times.
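A short sketch of preg_replace() with subpattern references, arrays, and the optional $limit and $count arguments. The patterns and subjects are invented for illustration.

```php
<?php
// $1 and $2 refer to the first and second captured subpatterns.
$swapped = preg_replace('/(\w+) (\w+)/', '$2 $1', 'hello world');   // "world hello"

// Reorder a dd/mm/yyyy date into yyyy-mm-dd.
$iso = preg_replace('/(\d{2})\/(\d{2})\/(\d{4})/', '$3-$2-$1', '29/06/2010'); // "2010-06-29"

// Arrays of patterns, replacements and subjects in a single call.
$many = preg_replace(
    array('/\s+/', '/!+/'),          // collapse whitespace, collapse "!"
    array(' ', '!'),
    array("a   b", "wow!!!")
);                                   // array("a b", "wow!")

// $limit caps the replacements per subject; $count reports how many were done.
$limited = preg_replace('/a/', 'b', 'aaa', 2, $count);              // "bba", $count = 2

var_dump($swapped, $iso, $many, $limited, $count);
```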

 

posted @ 2010-06-29 17:22 DavidHHuan