A2-05-01.MySQL Character Set

Reposted from: http://www.mysqltutorial.org/mysql-character-set/

MySQL Character Set

 

Summary: in this tutorial, you will learn about MySQL character sets. After the tutorial, you will know how to list all character sets in MySQL, how to convert strings between character sets, and how to configure the proper character set for client connections.

Introduction to MySQL character set

A MySQL character set is a set of characters that are legal in a string. For example, take an alphabet with the letters a to z. We assign each letter a number, for example, a = 1, b = 2, and so on. The letter a is a symbol, and the number 1 associated with the letter a is its encoding. The combination of all letters from a to z and their corresponding encodings is a character set.
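As a small illustration with real character sets (a sketch, assuming an ASCII-compatible connection character set such as latin1 or utf8), the letter a is encoded as 97 rather than 1. The built-in ASCII and HEX functions show the encoding:

```sql
-- Numeric code of the leftmost character of each string
SELECT ASCII('a'), ASCII('b');   -- 97, 98

-- The same encodings as hexadecimal bytes
SELECT HEX('a'), HEX('b');       -- 61, 62
```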

Each character set has one or more collations, which define the rules for comparing characters within the character set. Check out the MySQL collation tutorial to learn about collations in MySQL.

MySQL supports various character sets that allow you to store almost every character in a string. To get all available character sets in the MySQL database server, you use the SHOW CHARACTER SET statement as follows:
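A minimal sketch of the statement; the LIKE filter in the second query is an optional addition that the text does not mention:

```sql
-- List every character set the server supports
SHOW CHARACTER SET;

-- Optionally narrow the list by name pattern
SHOW CHARACTER SET LIKE 'latin%';
```

The result includes the columns Charset, Description, Default collation, and Maxlen.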

[Figure: mysql character set]

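The default character set in MySQL 5.7 and earlier is latin1; as of MySQL 8.0, the default is utf8mb4. If you want to store characters from multiple languages in a single column, you can use a Unicode character set such as utf8 or ucs2. Note that utf8 in MySQL is an alias for utf8mb3, which stores at most 3 bytes per character, whereas utf8mb4 covers the full Unicode range.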

The values in the Maxlen column specify the maximum number of bytes that a character in the character set occupies. Some character sets contain only single-byte characters, e.g., latin1, latin2, and cp850, whereas other character sets contain multi-byte characters.
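The same Maxlen values can also be read from the information_schema.CHARACTER_SETS table. The sketch below picks three character sets purely as an illustration:

```sql
-- Compare the maximum bytes per character of a few character sets
SELECT CHARACTER_SET_NAME, MAXLEN
FROM information_schema.CHARACTER_SETS
WHERE CHARACTER_SET_NAME IN ('latin1', 'ucs2', 'utf8mb4')
ORDER BY MAXLEN;
-- latin1 -> 1, ucs2 -> 2, utf8mb4 -> 4
```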

MySQL provides the LENGTH function to get the length of a string in bytes, and the CHAR_LENGTH function to get the length of a string in characters. If a string contains multi-byte characters, the result of the LENGTH function is greater than the result of the CHAR_LENGTH function. See the following example:
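A minimal sketch of such a comparison, assuming the @str user variable and the ucs2 conversion that the text describes:

```sql
-- Convert the literal to ucs2 and keep it in a user variable
SET @str = CONVERT('MySQL Character Set' USING ucs2);

-- Byte length versus character length
SELECT LENGTH(@str), CHAR_LENGTH(@str);  -- 38, 19
```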

[Figure: mysql convert character set]

The CONVERT function converts a string into a specific character set. In this example, it converts the MySQL Character Set string into the ucs2 character set. Because every character in ucs2 occupies 2 bytes, the length of the @str string in bytes is twice its length in characters.

Notice that a string in a multi-byte character set such as utf8 may still contain only single-byte characters, as shown in the following statements:
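A minimal sketch, assuming the same @str variable and string literal as in the ucs2 case:

```sql
-- Every character in this string is plain ASCII, so utf8 stores one byte each
SET @str = CONVERT('MySQL Character Set' USING utf8);

SELECT LENGTH(@str), CHAR_LENGTH(@str);  -- 19, 19
```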

[Figure: single-byte character set]

However, if a utf8 string contains a special character, e.g., ü in the pingüino string, its length in bytes differs from its length in characters. See the following example:
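A minimal sketch of this case. It assumes the client encoding matches the connection character set, so that the ü reaches the server intact:

```sql
-- In utf8, ü takes 2 bytes and the other 7 letters take 1 byte each
SET @str = CONVERT('pingüino' USING utf8);

SELECT LENGTH(@str), CHAR_LENGTH(@str);  -- expected: 9, 8
```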

 

[Figure: unicode character set]

Converting between different character sets

MySQL provides two functions that allow you to convert strings between different character sets: CONVERT and CAST. We have used the CONVERT function several times in the above examples.

The syntax of the CONVERT function is CONVERT(expression USING character_set_name).
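A short sketch of CONVERT with the USING clause. The string literal is only an illustration, and the built-in CHARSET function is used to confirm the character set of the result:

```sql
-- Convert a literal to ucs2
SELECT CONVERT('MySQL Character Set' USING ucs2) AS converted;

-- CHARSET() reports the character set of its argument
SELECT CHARSET(CONVERT('MySQL Character Set' USING ucs2));  -- ucs2
```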

The CAST function is similar to the CONVERT function. It converts a string to a different character set using the form CAST(string AS character_type CHARACTER SET character_set_name).

Take a look at the following example of using the CAST  function:
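A minimal sketch of CAST. The _latin1 introducer marks the literal as a latin1 string, and the literal itself is only an illustration:

```sql
-- Cast a latin1 literal to a utf8 character string
SELECT CAST(_latin1'MySQL character set' AS CHAR CHARACTER SET utf8) AS casted;

-- Confirm the character set of the result
-- (servers where utf8 is an alias report it as utf8mb3)
SELECT CHARSET(CAST(_latin1'MySQL character set' AS CHAR CHARACTER SET utf8));
```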

 

Setting character sets for client connections

When an application exchanges data with a MySQL database server, the default connection character set is latin1 in MySQL 5.7 and earlier (MySQL 8.0 defaults to utf8mb4). However, if the database stores Unicode strings in the utf8 character set, using the latin1 character set in the application is not sufficient. Therefore, the application needs to specify the proper character set when it connects to the MySQL database server.

To configure a character set for a client connection, you can use one of the following approaches:

  • Issue the SET NAMES statement after the client has connected to the MySQL database server. For example, to use the Unicode character set utf8, issue SET NAMES 'utf8'.

  • If the application supports the --default-character-set option, you can use it to set the character set. For example, the mysql client tool supports --default-character-set, and you can set it in the configuration file by adding default-character-set=utf8 under the [mysql] section.

  • Some MySQL connectors allow you to set the character set. For example, if you use PHP PDO, you can set the character set in the data source name with the charset parameter, e.g., mysql:host=$host;dbname=$db;charset=utf8, where $host and $db are placeholders for your own server and database.
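For the first option in the list above, a minimal sketch of the statement followed by a check of the resulting session variables:

```sql
-- Set character_set_client, character_set_connection and character_set_results
SET NAMES 'utf8';

-- Inspect the character set variables of the current session
SHOW VARIABLES LIKE 'character_set_%';
```

After SET NAMES, the client, connection, and results variables should all report the chosen character set, while server-level and database-level settings are unchanged.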

 

Whichever approach you use, make sure that the character set used by the application matches the character set of the data stored in the MySQL database server.

In this tutorial, you have learned about MySQL character sets, how to convert strings between character sets, and how to configure the proper character set for client connections.

posted @ 2018-08-24 18:31  zhuntidaoren