This method comes essentially from the ThinkPHP source code, with a few modifications of my own.
The code is as follows:
<?php
/*
 * @Description: Strip HTML tags to obtain plain text. Handles nested tags.
 */
class deleteHtmlTags{
    // Path of the file whose HTML content will be stripped.
    private $filename;
    function __construct($filename='C:/AppServ/www/text.txt'){
        $this->filename = $filename;
    }
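    /**
     * Strip HTML tags and return plain text. Handles nested tags; the
     * limitation is that attribute values inside tags are removed as well.
     * The HTML is read from the file passed to the constructor.
     * @access public
     * @return string
     */
    public function deletehtmltags(){
        $content = $this->contentGet();
        // Loop while a '<' remains; stop if it has no closing '>' after it.
        while(($currentBegin = strpos($content, '<')) !== false){
            // Search for '>' only after the '<', so a stray '>' earlier in the text is ignored.
            $currentEnd = strpos($content, '>', $currentBegin);
            if($currentEnd === false){
                break;
            }
            // Length of the tag body; only needed by the commented-out line below.
            $cha = $currentEnd - $currentBegin - 1;
            $tmpStringBegin = substr($content, 0, $currentBegin);
            // $tmpStringMiddle = @ substr($content, $currentBegin + 1, $cha);
            $tmpStringEnd = substr($content, $currentEnd + 1);
            // $content = $tmpStringBegin.$tmpStringMiddle.$tmpStringEnd;
            $content = $tmpStringBegin.$tmpStringEnd;
        }
        return $content;
    }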
    // Read the whole file into a string.
    private function contentGet(){
        $content = file_get_contents($this->filename);
        // file_get_contents() returns false on failure; fall back to an empty string.
        return $content === false ? '' : $content;
    }
}
$deleteHtml = new deleteHtmlTags();
$content = $deleteHtml->deletehtmltags();
echo $content;
?>
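The parts I changed are also shown above; they are simply commented out. Personally, I think this approach is better than regular expressions and similar methods.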
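To make that comparison easier to try out, here is a minimal sketch that runs the same scan on a string and puts it side by side with a regex and with PHP's built-in strip_tags(). The function name stripTagsByScan and the sample input are my own assumptions for illustration; they are not part of the class above or of ThinkPHP.

```php
<?php
// Hypothetical helper (not part of the original class): the same
// '<' ... '>' scan, applied to a string instead of a file.
function stripTagsByScan(string $content): string {
    while (($begin = strpos($content, '<')) !== false) {
        // Look for '>' only after the '<' that was just found.
        $end = strpos($content, '>', $begin);
        if ($end === false) {
            break; // unterminated tag: leave the rest untouched
        }
        // Drop everything from '<' through '>' inclusive.
        $content = substr($content, 0, $begin) . substr($content, $end + 1);
    }
    return $content;
}

// Sample input chosen for illustration only.
$html = '<div class="a"><p>Hello <b>world</b></p></div>';

$byScan    = stripTagsByScan($html);               // position-based scan
$byRegex   = preg_replace('/<[^>]*>/', '', $html); // a simple regex alternative
$byBuiltin = strip_tags($html);                    // PHP's built-in function

echo $byScan, "\n"; // Hello world
var_dump($byScan === $byRegex, $byScan === $byBuiltin);
```

On this simple input all three approaches return the same text. They may differ on trickier markup, such as a `>` inside a quoted attribute value or HTML comments, so it is worth running the comparison on your own content before settling on one.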