[PHP] - Truncating a string by length (for UTF-8)

I put this code together by reading the spec, then copying and adapting existing code. Hopefully it is correct...

 

/*********************************************************************
 *
 * Func Name : SubStrEx
 *
 * Description :
 *     Truncate a string to a given length
 *
 *
 * Parameters :
 *     $sString[string] : the string to truncate
 *     $nLength[int] : the length to keep
 *     $sDot[string="..."] : truncation marker
 *     $sCharset[string="UTF-8"] : character encoding
 *
 * Returns : [string] the truncated string followed by $sDot, or the
 *     original string if it already fits
 *
 *********************************************************************/
function SubStrEx ($sString, $nLength, $sDot = "...", $sCharset = "UTF-8") {
    if (strlen($sString) <= $nLength) {
        return $sString;
    }

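    // Decode HTML entities first so that each one counts as a single character.
    $sString = str_replace(array("&amp;", "&quot;", "&lt;", "&gt;"), array("&", "\"", "<", ">"), $sString);

    $sStrcut = "";
    $nTotal = strlen($sString);
    if (strtoupper($sCharset) == "UTF-8") {

        $nPos = 0;
        $nWordLen = 0;
        $noc = 0;  // Width so far: +1 per single-byte character, +2 per multi-byte character (a rough approximation of visual width)
        while ($nPos < $nTotal) {

            $t = ord($sString[$nPos]);
            if ($t == 9 || $t == 10 || (32 <= $t && $t <= 126)) {
                $nWordLen = 1;
                $noc++;
            } elseif (194 <= $t && $t <= 223) {
                $nWordLen = 2;
                $noc += 2;
            } elseif (224 <= $t && $t <= 239) {
                $nWordLen = 3;
                $noc += 2;
            } elseif (240 <= $t && $t <= 247) {
                $nWordLen = 4;
                $noc += 2;
            } elseif (248 <= $t && $t <= 251) {
                $nWordLen = 5;
                $noc += 2;
            } elseif ($t == 252 || $t == 253) {
                $nWordLen = 6;
                $noc += 2;
            } else {
                $nWordLen = 0;  // Other control, continuation or invalid byte: skipped without adding width
            }

            $nPos += ($nWordLen > 0) ? $nWordLen : 1;

            if ($noc >= $nLength) {
                break;
            }
        }
        // The last character overshot the limit: step back so it is dropped whole.
        if ($noc > $nLength) {
            $nPos -= $nWordLen;
        }
        $sStrcut = substr($sString, 0, $nPos);

    } else {
        // Double-byte encodings: keep a lead byte together with the byte that follows it.
        for ($i = 0; $i < $nLength && $i < $nTotal; $i++) {
            $sStrcut .= (ord($sString[$i]) > 127 && $i + 1 < $nTotal) ? $sString[$i] . $sString[++$i] : $sString[$i];
        }
    }

    // Append the truncation marker only if something was actually cut off.
    $bCut = strlen($sStrcut) < $nTotal;

    // Re-encode the special characters for HTML output.
    $sStrcut = str_replace(array("&", "\"", "<", ">"), array("&amp;", "&quot;", "&lt;", "&gt;"), $sStrcut);

    return $bCut ? $sStrcut . $sDot : $sStrcut;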
}
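
To make the counting rule in the UTF-8 branch easier to check by hand, here is a minimal sketch of just that part: mapping a lead byte to the length of its sequence, then summing the width (1 for a single-byte character, 2 for a multi-byte one). The helper names Utf8SeqLen and DisplayWidth are hypothetical and are not part of the function above. The byte ranges mirror the ones in the post, except that the 5- and 6-byte forms are left out, since RFC 3629 limits UTF-8 to four bytes.

```php
<?php
// Hypothetical helper (not from the original post): byte length of the UTF-8
// sequence starting with lead byte $t, or 0 if $t does not start a counted character.
function Utf8SeqLen($t) {
    if ($t == 9 || $t == 10 || (32 <= $t && $t <= 126)) {
        return 1;  // tab, line feed, printable ASCII
    }
    if (194 <= $t && $t <= 223) {
        return 2;
    }
    if (224 <= $t && $t <= 239) {
        return 3;
    }
    if (240 <= $t && $t <= 247) {
        return 4;
    }
    return 0;      // other control, continuation or invalid byte
}

// Hypothetical helper: width of $s under the post's rule
// (1 per single-byte character, 2 per multi-byte character).
function DisplayWidth($s) {
    $nWidth = 0;
    $nPos = 0;
    $nTotal = strlen($s);
    while ($nPos < $nTotal) {
        $nLen = Utf8SeqLen(ord($s[$nPos]));
        if ($nLen == 1) {
            $nWidth += 1;
        } elseif ($nLen > 1) {
            $nWidth += 2;
        }
        // Uncounted bytes are skipped one at a time without adding width.
        $nPos += ($nLen > 0) ? $nLen : 1;
    }
    return $nWidth;
}

echo DisplayWidth("abc"), "\n";      // 3
echo DisplayWidth("中文"), "\n";     // 4
echo DisplayWidth("PHP中文"), "\n";  // 7
```

Under this rule, passing 5 as $nLength for "PHP中文" leaves room for "PHP" plus one Chinese character; the second one would bring the width to 7, so it is dropped whole rather than split in the middle of its bytes.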
posted @ 2012-08-03 15:55  炎峰森林影