
Some Methods for Parsing XML in PHP

Editor: About PHP Programming

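First, a word about encoding: if the XML file and the page use different encodings, the output will be garbled. To fix garbled Chinese text, convert the string when printing it, for example: echo iconv("UTF-8","GBK",$Song_Url);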

Encoding of PHP pages

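The encoding of the PHP file itself should match the encoding declared for the page. To use gb2312, have PHP send the header header("Content-Type: text/html; charset=gb2312"), add <meta http-equiv="Content-Type" content="text/html; charset=gb2312"> to static pages, and save every file as ANSI (on a Simplified Chinese Windows system, ANSI means the GBK code page): open the file in Notepad, choose Save As, select ANSI as the encoding, and overwrite the original file.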

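To use utf-8 instead, have PHP send the header header("Content-Type: text/html; charset=utf-8"), add <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> to static pages, and save every file as utf-8. Saving as utf-8 can be a little troublesome: many editors write a BOM at the start of a utf-8 file, and because those bytes are sent as output before any headers, sessions break. EditPlus can save without it: under Tools -> Preferences -> Files -> UTF-8 signature, choose "Always remove", then save the file again to strip the BOM.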
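A BOM can also turn up in data read at runtime, such as an XML file saved by such an editor. The following is a minimal sketch, not part of the original article; the helper name strip_utf8_bom is hypothetical. It removes the three-byte UTF-8 signature (EF BB BF) from the start of a string:

```php
<?php
// Hypothetical helper (an assumption, not from the article):
// remove a leading UTF-8 BOM from a string if one is present.
function strip_utf8_bom(string $text): string
{
    $bom = "\xEF\xBB\xBF"; // the three bytes of the UTF-8 signature
    if (strncmp($text, $bom, 3) === 0) {
        return substr($text, 3);
    }
    return $text;
}

// Example: content read from a file that was saved with a BOM.
$clean = strip_utf8_bom("\xEF\xBB\xBF<books/>");
echo $clean, "\n";
```

This only helps with data the script reads; a BOM in the PHP source file itself still has to be removed in the editor.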

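PHP strings are not Unicode-aware, so byte-oriented functions such as substr should be replaced with mb_substr (which requires the mbstring extension), or the text should be converted with iconv.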
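The difference between byte-oriented and character-oriented functions can be sketched as follows. This assumes the mbstring extension is loaded; the sample string is written with byte escapes so the result does not depend on how the script file itself is saved:

```php
<?php
// "中文XML" in UTF-8: two Chinese characters (3 bytes each) followed by "XML".
$text = "\xE4\xB8\xAD\xE6\x96\x87XML";

$byteCount = strlen($text);               // counts bytes
$charCount = mb_strlen($text, 'UTF-8');   // counts characters

$brokenCut = substr($text, 0, 2);           // 2 bytes: cuts the first character in half
$cleanCut  = mb_substr($text, 0, 2, 'UTF-8'); // 2 characters: both Chinese characters intact

echo $byteCount, " bytes, ", $charCount, " characters\n";
echo $cleanCut, "\n";
```

Passing the encoding explicitly to the mb_* functions avoids relying on the configured internal encoding.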

Data exchange between PHP and MySQL

PHP and the database should use the same encoding.

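Edit the MySQL configuration file my.ini or my.cnf; utf8 is the recommended encoding for MySQL:

[mysql]
default-character-set=utf8
[mysqld]
default-character-set=utf8
default-storage-engine=MyISAM

Also add the following under [mysqld]:

default-collation=utf8_bin
init_connect='SET NAMES utf8'

Note that MySQL 5.5 and later no longer accept default-character-set and default-collation under [mysqld]; use character-set-server=utf8 and collation-server=utf8_bin there instead.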

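Before any database operation in a PHP program, run mysql_query("set names 'encoding'");, where the encoding matches the one PHP uses: if the PHP encoding is gb2312, set MySQL to gb2312; if it is utf-8, set MySQL to utf8. Data can then be inserted and retrieved without garbling. (The mysql_* functions were removed in PHP 7; with mysqli the equivalent call is mysqli_set_charset().)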

PHP and the operating system

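Windows and Linux use different encodings for file names. On Windows, PHP functions that receive a utf-8 encoded argument can fail, for example move_uploaded_file(), filesize(), and readfile(). These functions are used often when handling uploads and downloads, and a call may produce errors such as: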

Warning: move_uploaded_file()[function.move-uploaded-file]:failed to open stream: Invalid argument in ...
Warning: move_uploaded_file()[function.move-uploaded-file]:Unable to move '' to '' in ...
Warning: filesize() [function.filesize]: stat failed for ... in ...
Warning: readfile() [function.readfile]: failed to open stream: Invalid argument in ..

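On Linux, using gb2312 does not trigger these errors, but the saved file name is garbled and the file can no longer be read. In either case, first convert the argument to the encoding the operating system recognizes, using mb_convert_encoding(string, new encoding, original encoding) or iconv(original encoding, new encoding, string). After that, saved file names are no longer garbled and the files can be read normally, so files with Chinese names can be uploaded and downloaded.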
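The conversion step can be sketched as below. The helper name to_fs_name and the choice of GBK as the file system encoding are assumptions for illustration (GBK is what a Simplified Chinese Windows system would typically expect); the iconv extension is required:

```php
<?php
// Hypothetical helper: convert a UTF-8 file name to the encoding
// that the operating system expects for file names.
function to_fs_name(string $utf8Name, string $fsEncoding): string
{
    $converted = iconv('UTF-8', $fsEncoding, $utf8Name);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert file name to $fsEncoding");
    }
    return $converted;
}

$name   = "\xE4\xB8\xAD\xE6\x96\x87.txt"; // "中文.txt" in UTF-8
$fsName = to_fs_name($name, 'GBK');       // assumed target encoding

// $fsName is what would be used in the path given to
// move_uploaded_file(), filesize(), readfile() and similar functions.
echo strlen($name), " bytes in UTF-8, ", strlen($fsName), " bytes in GBK\n";
```

Note the argument order: iconv takes the source encoding first, while mb_convert_encoding takes the target encoding first.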

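There is a better solution that decouples the code from the system entirely, so the system encoding no longer matters. Generate a sequence of letters and digits to use as the stored file name, and keep the original Chinese name in the database. move_uploaded_file() then works without trouble, and at download time you only need to present the file under its original name. The download code is as follows: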

header("Pragma: public");
header("Expires: 0");
header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
header("Content-Type: $file_type");
header("Content-Length: $file_size");
// The inner quotes are escaped so the file name is sent as a quoted string
header("Content-Disposition: attachment; filename=\"$file_name\"");
header("Content-Transfer-Encoding: binary");
readfile($file_path);

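$file_type is the file's MIME type, $file_size is its size in bytes, $file_name is the original name, and $file_path is the location of the file on the server.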
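Generating the letters-and-digits name could look like the following sketch. The helper name make_stored_name is hypothetical, and random_bytes() requires PHP 7 or later; any other source of unique ASCII names would serve the same purpose:

```php
<?php
// Hypothetical helper: build an ASCII-only name under which an upload is stored.
function make_stored_name(string $originalName): string
{
    // Take whatever follows the last dot as the extension.
    $dot = strrpos($originalName, '.');
    $ext = ($dot === false) ? '' : strtolower(substr($originalName, $dot + 1));

    // Keep the extension only if it consists of plain letters and digits.
    $suffix = preg_match('/^[a-z0-9]+$/', $ext) ? '.' . $ext : '';

    return bin2hex(random_bytes(16)) . $suffix; // 32 hex characters plus optional extension
}

$original = "\xE4\xB8\xAD\xE6\x96\x87.txt"; // "中文.txt" in UTF-8
$stored   = make_stored_name($original);

// Store the file on disk as $stored and save the pair ($stored, $original)
// in the database; at download time send $original as the file name.
echo $stored, "\n";
```

Because the stored name never contains non-ASCII bytes, the same code behaves identically on Windows and Linux.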

books.xml

<books>
	<book>
		<author>Jack Herrington</author>
		<title>PHP Hacks</title>
		<publisher>O'Reilly</publisher>
	</book>
	<book>
		<author>Jack Herrington</author>
		<title>Podcasting Hacks</title>
		<publisher>O'Reilly</publisher>
	</book>
</books>

Reading XML with the DOM library:

<?php
$doc = new DOMDocument();
$doc->load( 'books.xml' );
$books = $doc->getElementsByTagName( "book" );
foreach( $books as $book )
{
	$authors = $book->getElementsByTagName( "author" );
	$author = $authors->item(0)->nodeValue;
	$publishers = $book->getElementsByTagName( "publisher" );
	$publisher = $publishers->item(0)->nodeValue;
	$titles = $book->getElementsByTagName( "title" );
	$title = $titles->item(0)->nodeValue;
	echo "$title - $author - $publisher\n";
}
?>
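The encoding advice at the start of the article applies directly here: the DOM extension returns node values as UTF-8 regardless of the encoding declared in the XML file, so a page served as GBK needs a conversion on output. A minimal sketch, assuming an inline XML string instead of books.xml and a GBK page:

```php
<?php
// Sample document with a Chinese title ("PHP 编程"), written with byte escapes as UTF-8.
$xml = "<books><book><title>PHP \xE7\xBC\x96\xE7\xA8\x8B</title></book></books>";

$doc = new DOMDocument();
$doc->loadXML($xml);

// nodeValue is a UTF-8 string.
$title = $doc->getElementsByTagName('title')->item(0)->nodeValue;

// Convert only if the page charset is GBK/gb2312; a utf-8 page can echo $title as is.
$forGbkPage = iconv('UTF-8', 'GBK', $title);

echo strlen($title), " bytes as UTF-8, ", strlen($forGbkPage), " bytes as GBK\n";
```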

Reading XML with the SAX parser:

<?php
$g_books = array();
$g_elem = null;

// Element names arrive in upper case because case folding is on by default.
function startElement( $parser, $name, $attrs )
{
	global $g_books, $g_elem;
	if ( $name == 'BOOK' ) $g_books []= array();
	$g_elem = $name;
}

function endElement( $parser, $name )
{
	global $g_elem;
	$g_elem = null;
}

function textData( $parser, $text )
{
	global $g_books, $g_elem;
	if ( $g_elem == 'AUTHOR' ||
		$g_elem == 'PUBLISHER' ||
		$g_elem == 'TITLE' )
	{
		// Append: the parser may deliver one element's text in several pieces.
		$last = count( $g_books ) - 1;
		if ( !isset( $g_books[ $last ][ $g_elem ] ) ) $g_books[ $last ][ $g_elem ] = '';
		$g_books[ $last ][ $g_elem ] .= $text;
	}
}

$parser = xml_parser_create();
xml_set_element_handler( $parser, "startElement", "endElement" );
xml_set_character_data_handler( $parser, "textData" );
$f = fopen( 'books.xml', 'r' );
while( $data = fread( $f, 4096 ) )
{
	xml_parse( $parser, $data );
}
fclose( $f );
xml_parser_free( $parser );

foreach( $g_books as $book )
{
	echo $book['TITLE']." - ".$book['AUTHOR']." - ";
	echo $book['PUBLISHER']."\n";
}
?>
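The handlers above compare against upper-case names such as 'BOOK' because the parser folds element names to upper case by default. If the names should arrive as written in the file, case folding can be switched off. A small sketch using an inline XML string and closures as handlers (both are choices made for this illustration, not part of the original listing):

```php
<?php
$names = array();

$parser = xml_parser_create();
// Turn off case folding so handlers see "book", not "BOOK".
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    function ($parser, $name, $attrs) use (&$names) { $names[] = $name; }, // start tag: record the name
    function ($parser, $name) {}                                           // end tag: nothing to do
);

// The third argument marks this as the final (and only) chunk of data.
xml_parse($parser, '<books><book><title>PHP Hacks</title></book></books>', true);
unset($parser);

echo implode(',', $names), "\n"; // prints "books,book,title"
```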

Parsing XML with regular expressions:

<?php
$xml = "";
$f = fopen( 'books.xml', 'r' );
while( $data = fread( $f, 4096 ) ) { $xml .= $data; }
fclose( $f );
// The "/" in each closing tag is escaped so it does not end the pattern.
preg_match_all( "/<book>(.*?)<\/book>/s",
	$xml, $bookblocks );
foreach( $bookblocks[1] as $block )
{
	preg_match_all( "/<author>(.*?)<\/author>/",
		$block, $author );
	preg_match_all( "/<title>(.*?)<\/title>/",
		$block, $title );
	preg_match_all( "/<publisher>(.*?)<\/publisher>/",
		$block, $publisher );
	echo( $title[1][0]." - ".$author[1][0]." - ".
		$publisher[1][0]."\n" );
}
?>
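One limitation of the regular expression approach is that it returns the raw markup between the tags, so XML entities such as &amp; are not decoded the way the DOM and SAX parsers decode them. A hedged sketch of decoding by hand, using an inline sample string instead of books.xml:

```php
<?php
$block = '<title>Tips &amp; Tricks</title>';

preg_match('/<title>(.*?)<\/title>/', $block, $m);
$raw = $m[1]; // the entity is still encoded here

// ENT_XML1 limits decoding to XML's predefined entities and numeric references.
$decoded = html_entity_decode($raw, ENT_QUOTES | ENT_XML1, 'UTF-8');

echo $raw, "\n";
echo $decoded, "\n";
```

Attributes on the tags, CDATA sections, and nested elements with the same name would also defeat these simple patterns, which is why the parser-based methods are generally the safer choice.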