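Many PHP beginners cut strings with substr() or mb_substr(). The first works on bytes, so it garbles Chinese text whenever the cut lands inside a multi-byte character; the second performs poorly. Below I have collected a few custom functions that cut Chinese strings without garbling them.
Example 1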
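To make the difference concrete, here is a minimal sketch. It assumes the file is saved as UTF-8 and the mbstring extension is loaded; the variable names are only for illustration.

```php
<?php
// Assumption: this file is saved as UTF-8 and mbstring is available.
$text = "等wait你back";

// substr() counts bytes. "等" takes 3 bytes in UTF-8, so the first
// 2 bytes are only a fragment of that character.
$broken = substr($text, 0, 2);
var_dump(mb_check_encoding($broken, 'UTF-8')); // bool(false)

// mb_substr() counts characters once it is told the encoding.
$ok = mb_substr($text, 0, 2, 'UTF-8');
echo $ok, "\n"; // 等w
```

The broken fragment is what shows up as garbled output in a browser.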
function msubstr($str, $start, $length, $charset = "utf-8", $suffix = true)
{
    // Prefer the multibyte extensions when they are available
    if (function_exists("mb_substr")) {
        $slice = mb_substr($str, $start, $length, $charset);
    } elseif (function_exists('iconv_substr')) {
        $slice = iconv_substr($str, $start, $length, $charset);
    } else {
        // Fallback: split the string into characters with a byte-range pattern per charset
        $re['utf-8'] = '/[\x01-\x7f]|[\xc2-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf]{2}|[\xf0-\xff][\x80-\xbf]{3}/';
        $re['gb2312'] = '/[\x01-\x7f]|[\xb0-\xf7][\xa0-\xfe]/';
        $re['gbk'] = '/[\x01-\x7f]|[\x81-\xfe][\x40-\xfe]/';
        $re['big5'] = '/[\x01-\x7f]|[\x81-\xfe]([\x40-\x7e]|[\xa1-\xfe])/';
        preg_match_all($re[strtolower($charset)], $str, $match);
        $slice = join("", array_slice($match[0], $start, $length));
    }
    // Append the ellipsis in every branch, not only in the fallback
    if ($suffix) return $slice . "…";
    return $slice;
}
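The regular-expression fallback in Example 1 can be tried on its own. The sketch below is not the author's function: `utf8_chars` is a hypothetical helper that applies a UTF-8 byte-range pattern and returns the characters as an array, assuming the file is saved as UTF-8.

```php
<?php
// Hypothetical helper for illustration: split a UTF-8 string into characters
// with a byte-range pattern (no /u modifier, so PCRE matches raw bytes).
function utf8_chars(string $str): array
{
    // 1-byte ASCII | 2-byte | 3-byte | 4-byte sequences (lead bytes f0-f4)
    $pattern = '/[\x01-\x7f]|[\xc2-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf]{2}|[\xf0-\xf4][\x80-\xbf]{3}/';
    preg_match_all($pattern, $str, $match);
    return $match[0];
}

$chars = utf8_chars("等wait你back");
echo count($chars), "\n";                          // 10
echo implode('', array_slice($chars, 0, 6)), "\n"; // 等wait你
```

Because array_slice() works on whole array elements, a cut can never land in the middle of a character.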
Example 2
<?php
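// $start: byte offset where the cut begins; $length: number of bytes to cut
// Note: this assumes a two-byte encoding such as GB2312, not UTF-8
function substr2($string, $start, $length)
{
    $len = strlen($string);
    if($len > $length)
    {
        $str = '';
        // End offset in the original string, never past the end of the string
        $len1 = min($start + $length, $len);
        for($i=$start; $i<$len1; $i++)
        {
            // In GB2312, a byte above 0xa0 is the first byte of a two-byte Chinese character
            if(ord(substr($string, $i, 1)) > 0xa0)
            {
                // Copy both bytes together and skip the second one
                $str.=substr($string, $i, 2);
                $i++;
            }
            else
            {
                $str.=substr($string, $i, 1);
            }
        }
        return $str.'...';
    }
    else
    {
        return $string;
    }
}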
?>
Here is one more, simpler version based on the same idea (2010-5-31)
<?php
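// Assumes a two-byte encoding such as GB2312; $start and $len are byte counts
function chinesesubstr($str, $start, $len){
    $tmpstr = '';
    // Never read past the end of the string
    $strlen = min($start + $len, strlen($str));
    // Walk from byte 0 so two-byte characters stay aligned; keep output from $start on
    for($i=0; $i<$strlen; $i++){
        if(ord(substr($str, $i, 1)) > 0xa0){
            if($i >= $start) $tmpstr .= substr($str, $i, 2);
            $i++;
        }else{
            if($i >= $start) $tmpstr .= substr($str, $i, 1);
        }
    }
    return $tmpstr;
}
// Save this file as GB2312/GBK for the byte counts below to apply
$str = "waiting for you 等wait你back";
echo chinesesubstr($str, 0, 19);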
?>
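The byte-walking idea in the last two examples depends on the string being in a two-byte encoding such as GB2312. The sketch below shows this in a way that can be run from a UTF-8 file. It assumes the iconv extension is available; `gb_cut` is a hypothetical helper that mirrors the same loop and is not one of the functions above.

```php
<?php
// Hypothetical helper for illustration: cut a GB2312 byte string after at
// most $maxBytes bytes without splitting a two-byte character.
function gb_cut(string $bytes, int $maxBytes): string
{
    $out = '';
    $end = min($maxBytes, strlen($bytes));
    for ($i = 0; $i < $end; $i++) {
        if (ord($bytes[$i]) > 0xa0) {
            // Lead byte of a two-byte character: copy both bytes together.
            $out .= substr($bytes, $i, 2);
            $i++;
        } else {
            $out .= $bytes[$i];
        }
    }
    return $out;
}

// Assumption: this file is saved as UTF-8, so convert the literal to GB2312 first.
$gb = iconv('UTF-8', 'GB2312', "等wait你back"); // 12 bytes in GB2312

$cut = gb_cut($gb, 7);
echo strlen($cut), "\n";                     // 8
echo iconv('GB2312', 'UTF-8', $cut), "\n";   // 等wait你
```

A plain substr($gb, 0, 7) would stop after the first byte of "你". The loop instead copies the whole character, so the result can be one byte longer than requested but never ends in half a character.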