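How Non-English Websites Can Use MySQL Character Sets

Non-English websites often have to deal with character sets when they write data to or read data from a database in a language other than English. A character set tells the database which character encoding scheme to use when storing and reading data. A collation belongs to a character set and tells the database how to compare and sort that data.

Today we look at working with character sets in MySQL. To store Chinese, Japanese, or other non-English text in MySQL, we need to apply a suitable character set to the database, its tables, and their columns. The connection to MySQL needs a character set as well. Below I summarize some commands for checking which character set and collation our data uses and for changing them as needed. At the command prompt, first log in to the mysql client:

    mysql -u [name] -p

Next, we check the character set variables of the client and the server, such as the connection character set:

    SHOW VARIABLES LIKE 'char%';
    SHOW VARIABLES LIKE 'collation%';

The first command returns output like the following:

    +--------------------------+---------------------------------------------------------+
    | Variable_name            | Value                                                   |
    +--------------------------+---------------------------------------------------------+
    | character_set_client     | latin1                                                  |
    | character_set_connection | latin1                                                  |
    | character_set_database   | latin1                                                  |
    | character_set_filesystem | binary                                                  |
    | character_set_results    | latin1                                                  |
    | character_set_server     | latin1                                                  |
    | character_set_system     | utf8                                                    |
    | character_sets_dir       | C:\Program Files\MySQL\MySQL Server 5.1\share\charsets\ |
    +--------------------------+---------------------------------------------------------+

This shows at a glance which character sets the database engine is using. We can change these variables with:

    SET variable_name=value /* SET character_set_connection=utf8; */

For the character set of a database, run:

    SHOW CREATE DATABASE database_name;

The default character set appears in the comment in the output. To change the character set of the database, run:

    ALTER DATABASE database_name CHARACTER SET charset_name COLLATE collation_name;

We can also set the character set when creating a new database:

    CREATE DATABASE database_name CHARACTER SET charset_name COLLATE collation_name;

The commands for tables are similar. Run:

    SHOW CREATE TABLE table_name;

DEFAULT CHARSET or COLLATE appears at the end of the output. To change them, run:

    ALTER TABLE table_name CONVERT TO CHARACTER SET charset_name COLLATE collation_name;

We can also set the character set when creating a new table:

    CREATE TABLE table_name (column_list) CHARACTER SET charset_name COLLATE collation_name;

For columns, run:

    SHOW FULL COLUMNS IN table_name;

The third column of the output is the collation. To change it, run:

    ALTER TABLE table_name MODIFY col_name data_type CHARACTER SET charset_name COLLATE collation_name;

With the commands above, you can manage MySQL character sets and collations.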
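To make the difference between a character set and a collation more concrete, here is a small sketch in Python rather than MySQL. It is only an analogy: the encodings stand in for character sets, and the two sort orders stand in for a binary collation and a case-insensitive one. The sample strings and variable names are made up for illustration, and MySQL's real collations apply more detailed rules than these.

```python
# Analogy only: Python encodings and sort keys stand in for
# MySQL character sets and collations.

text = "中文"  # hypothetical sample value

# Character set: the same text becomes different bytes under different encodings.
utf8_bytes = text.encode("utf-8")  # 3 bytes per character here
gbk_bytes = text.encode("gbk")     # 2 bytes per character here

# latin-1 has no code for these characters, so they cannot be encoded at all.
try:
    text.encode("latin-1")
    latin1_can_store = True
except UnicodeEncodeError:
    latin1_can_store = False

# Collation: the same stored values compare and sort differently
# depending on the comparison rules.
words = ["banana", "Apple", "cherry", "apple", "Banana"]

# Roughly like a binary collation: compare by code point, uppercase first.
binary_order = sorted(words)

# Roughly like a case-insensitive collation: ignore case when comparing.
ci_order = sorted(words, key=str.casefold)

print(len(utf8_bytes), len(gbk_bytes), latin1_can_store)
print(binary_order)
print(ci_order)
```

The point of the sketch is that the character set decides whether and how text can be stored, while the collation only decides how stored text is compared and ordered.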
If you connect to MySQL from a programming language to store and read data, you also need to set the character encoding in that language, for example in PHP.

Tip: if you store Chinese or other non-English data in MySQL, you may sometimes find that it is displayed as question marks in the command console. Try exporting the data to an external sql file and opening it in a text editor, and you may be surprised to see your Chinese data again. This means the data is stored correctly but the command console cannot display it correctly.

Translator's note: I have also run into the situation described at the end of the tip. I use MySQL 5.1. At first my console used the UTF8 character set, and the Chinese characters in the table were displayed as garbled text (the table's character set is UTF8). After I ran the charset gbk; command, the text was still garbled. When I ran charset gbk; a second time, the Chinese was displayed correctly. In MySQL 5.0, however, this approach did not make the Chinese display correctly.

Work with MySQL character set and collation

Source: Peter

Non-English websites often have to deal with character sets and collations when they store data in and read data from databases in other languages. A character set tells the database which character encoding scheme to use to store or read data. A collation is a set of rules, tied to a character set, that tells the database how to compare and sort data.

Today we talk about working with MySQL character sets and collations. In MySQL, if we want to store Chinese, Japanese, or other languages besides English, we may need to set the appropriate character set for the database, tables, and columns. Also, when we connect to MySQL, we may need to set the character set for the connection. Here I summarize some commands for seeing which character set and collation our database uses and for changing them as needed. At the command prompt, we first need to log in to the mysql client with the mysql -u [username] -p command.

Now we may want to check some character set and collation variables for our database client and server, for example the connection character set.
We can type the following commands:

    SHOW VARIABLES LIKE 'char%';
    SHOW VARIABLES LIKE 'collation%';

The first command gives us output like

    +--------------------------+---------------------------------------------------------+
    | Variable_name            | Value                                                   |
    +--------------------------+---------------------------------------------------------+
    | character_set_client     | latin1                                                  |
    | character_set_connection | latin1                                                  |
    | character_set_database   | latin1                                                  |
    | character_set_filesystem | binary                                                  |
    | character_set_results    | latin1                                                  |
    | character_set_server     | latin1                                                  |
    | character_set_system     | utf8                                                    |
    | character_sets_dir       | C:\Program Files\MySQL\MySQL Server 5.1\share\charsets\ |
    +--------------------------+---------------------------------------------------------+

From this we can easily see which character sets the database engine is using. We can also change these variables with

    SET variable_name=value /* SET character_set_connection=utf8; */

Next comes the database character set and collation. We run

    SHOW CREATE DATABASE database_name;

The default character set appears in the comment in the output. If we want to change the character set and collation of the database, we run

    ALTER DATABASE database_name CHARACTER SET charset_name COLLATE collation_name;

We can also set the character set and collation when we create a new database:

    CREATE DATABASE database_name CHARACTER SET charset_name COLLATE collation_name;

For database tables, the commands are similar. We run

    SHOW CREATE TABLE table_name;

At the end of the output we may find DEFAULT CHARSET or COLLATE. If we want to change them, we run

    ALTER TABLE table_name CONVERT TO CHARACTER SET charset_name COLLATE collation_name;

We can also set the character set and collation when we create a table:

    CREATE TABLE table_name (column_list) CHARACTER SET charset_name COLLATE collation_name;

For columns, we run

    SHOW FULL COLUMNS IN table_name;

The third column of the output is the collation.
We can change it with

    ALTER TABLE table_name MODIFY col_name data_type CHARACTER SET charset_name COLLATE collation_name;

With the commands above, you should be able to manage MySQL character sets and collations. If you connect to MySQL from a programming language to store and read data, you may also need to set the character encoding in that language, for example in PHP.

Finally, one tip: if you store Chinese or other non-English data in a MySQL database, you may sometimes find that it is displayed as question marks in the command console. Try exporting the data to an external sql file and opening that file in a text editor, and you may be surprised to see your Chinese again. This means the data is stored properly but the command console cannot display it correctly.
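The tip above can be illustrated without a database. The following is a minimal sketch in Python, assuming the data was stored as UTF-8 and the console is limited to latin1. It does not call MySQL, and the sample string and variable names are hypothetical. It only shows how correctly stored bytes can still appear as question marks or garbled text when they are displayed with the wrong character set.

```python
# Simulation only: no MySQL involved. Assumes the stored data is UTF-8
# and the console can only handle latin1.

text = "中文"  # hypothetical sample value

# What ends up on disk when the column uses a UTF-8 character set.
stored = text.encode("utf-8")

# Converting the text for a latin1-only console: characters that latin1
# cannot represent are replaced, so each one becomes "?".
shown_as_question_marks = text.encode("latin-1", errors="replace").decode("latin-1")

# Reading the stored UTF-8 bytes as if they were latin1: every byte turns
# into an unrelated character, which looks like garbled text.
shown_as_garbled = stored.decode("latin-1")

# A text editor that opens the exported file as UTF-8 sees the original text.
shown_in_editor = stored.decode("utf-8")

print(shown_as_question_marks)   # ??
print(len(shown_as_garbled))     # one character per stored byte
print(shown_in_editor == text)   # the stored data was never damaged
```

In each case the stored bytes are unchanged and only the display step differs, which is why the exported file still shows the original text.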