﻿<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/"><channel><title>博客园-…蠢蛋进化论≯ </title><link>http://www.cnblogs.com/Alacky/</link><description>不要让自己感到时间充裕,否则只有到死的时候才会发现生命的短暂.</description><language>zh-cn</language><lastBuildDate>Sat, 11 Oct 2008 16:16:05 GMT</lastBuildDate><pubDate>Sat, 11 Oct 2008 16:16:05 GMT</pubDate><ttl>60</ttl><item><title>Dead End！！</title><link>http://www.cnblogs.com/Alacky/archive/2008/09/24/1298314.html</link><dc:creator>*Alacky</dc:creator><author>*Alacky</author><pubDate>Wed, 24 Sep 2008 12:37:00 GMT</pubDate><guid>http://www.cnblogs.com/Alacky/archive/2008/09/24/1298314.html</guid><wfw:comment>http://www.cnblogs.com/Alacky/comments/1298314.html</wfw:comment><comments>http://www.cnblogs.com/Alacky/archive/2008/09/24/1298314.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cnblogs.com/Alacky/comments/commentRss/1298314.html</wfw:commentRss><trackback:ping>http://www.cnblogs.com/Alacky/services/trackbacks/1298314.html</trackback:ping><description><![CDATA[&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;凭着自己所谓的信念一路飘到了这里，才发现已经失去错过了太多太多，固执的耳朵从未听进去路人的呼喊，直到碰壁：<img src="http://www.cnblogs.com/images/cnblogs_com/Alacky/108641/o_dead-end-sign_ie.jpg" border="0"  alt="" />
]]></description></item><item><title>[Repost] An Implementation of Double-Array Trie</title><link>http://www.cnblogs.com/Alacky/articles/1224005.html</link><dc:creator>*Alacky</dc:creator><author>*Alacky</author><pubDate>Tue, 17 Jun 2008 07:12:00 GMT</pubDate><guid>http://www.cnblogs.com/Alacky/articles/1224005.html</guid><wfw:comment>http://www.cnblogs.com/Alacky/comments/1224005.html</wfw:comment><comments>http://www.cnblogs.com/Alacky/articles/1224005.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cnblogs.com/Alacky/comments/commentRss/1224005.html</wfw:commentRss><trackback:ping>http://www.cnblogs.com/Alacky/services/trackbacks/1224005.html</trackback:ping><description><![CDATA[<span style="color: red"><strong>Reposted from: </strong><font face="Verdana"><strong>http://linux.thai.net/~thep/datrie/datrie.html<br />
</strong><span style="color: #000000">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; An efficient way to build a word dictionary, which I used in a recent project.</span></font></span> I will translate it when I have time<img src="http://www.cnblogs.com/Emoticons/xd/003.gif"  alt="" />.
<hr />
<h2>Contents</h2>
<ol>
    <li>What is Trie?
    <li>What Does It Take to Implement a Trie?
    <li>Triple-Array Trie
    <li>Double-Array Trie
    <li>Suffix Compression
    <li>Key Insertion
    <li>Key Deletion
    <li>Double-Array Pool Allocation
    <li>An Implementation
    <li>Download
    <li>References </li>
</ol>
<a name="What">
<h2>What is Trie?</h2>
</a>
<p><strong>Trie</strong> is a kind of digital search tree. (See <a href="http://linux.thai.net/~thep/datrie/datrie.html#Ref_Knuth1972">[Knuth1972]</a> for details on digital search trees.) <a href="http://linux.thai.net/~thep/datrie/datrie.html#Ref_Fredkin1960">[Fredkin1960]</a> introduced the term <cite>trie</cite>, which is taken from "Re<em>trie</em>val".</p>
<p><img alt="Trie Example" src="http://linux.thai.net/~thep/datrie/trie1.gif" /></p>
<p>Trie is an efficient indexing method. It is also a kind of deterministic finite automaton (DFA) (see <a href="http://linux.thai.net/~thep/datrie/datrie.html#Ref_Cohen">[Cohen1990]</a>, for example, for the definition of DFA). Within the tree structure, each node corresponds to a DFA state, and each (directed) labeled edge from a parent node to a child node corresponds to a DFA transition. The traversal starts at the root node. Then the characters of the key string are taken one at a time, from head to tail, to determine the next state. At each step, the edge labeled with the current character is followed. Notice that each step consumes one character from the key and descends one level down the tree. If the key is exhausted and a leaf node is reached, then we arrive at the exit for that key. If we get stuck at some node, either because there is no branch labeled with the current character or because the key is exhausted at an internal node, then the key is not recognized by the trie.</p>
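<p>The walk described above can be sketched with a plain nested-dictionary trie. This is only an illustration, not the representation developed later in this article; the names <cite>build_trie</cite> and <cite>accepts</cite> are hypothetical, and the "#" end-of-key marker is an assumption borrowed from the key deletion example.</p>

```python
# Minimal nested-dict trie; "#" is an assumed end-of-key marker.
END = "#"

def build_trie(keys):
    root = {}
    for key in keys:
        node = root
        for ch in key + END:
            node = node.setdefault(ch, {})   # follow or create the edge labeled ch
    return root

def accepts(root, key):
    node = root                              # traversal starts at the root
    for ch in key + END:
        if ch not in node:                   # stuck: no edge labeled ch
            return False
        node = node[ch]                      # consume one character, descend one level
    return not node                          # key exhausted at a leaf

trie = build_trie(["pool", "prize", "produce", "producer"])
```

<p>Each lookup touches one node per character of the key, regardless of how many keys are stored.</p>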
<p>Notice that the time needed to traverse from the root to the leaf is not dependent on the size of the database, but is proportional to the length of the key. Therefore, it is usually much faster than B-tree or any comparison-based indexing method in general cases. Its time complexity is comparable with hashing techniques.</p>
<p>In addition to efficiency, a trie also provides flexibility in searching for the closest path when the key is misspelled. For example, by skipping a certain character in the key while walking, we can fix an insertion typo. By walking toward all the immediate children of a node without consuming a character from the key, we can fix a deletion typo, or even a substitution typo if we drop the key character that has no branch to follow and descend to all the immediate children of the current node.</p>
<a name="WhatTake">
<h2>What Does It Take to Implement a Trie?</h2>
</a>
<p>In general, a DFA is represented with a <cite>transition table</cite>, in which the rows correspond to the states, and the columns correspond to the transition labels. The data kept in each cell is then the next state to go for a given state when the input is equal to the label.</p>
<p>This is an efficient method for the traversal, because every transition can be calculated by two-dimensional array indexing. However, in terms of space usage, this is rather extravagant, because, in the case of a trie, most nodes have only a few branches, leaving the majority of the table cells blank.</p>
<p>A more compact scheme is to use a linked list to store the transitions out of each state, but this results in slower access because of the linear search.</p>
<p>Hence, table compression techniques that still allow fast access have been devised to solve the problem.</p>
<ol>
    <li><strong><a href="http://linux.thai.net/~thep/datrie/datrie.html#Ref_Johnson1975">[Johnson1975]</a></strong> (also explained in <a href="http://linux.thai.net/~thep/datrie/datrie.html#Ref_Aho+1985">[Aho+1985]</a> pp. 144-146) represented a DFA with four arrays, which can be reduced to three in the case of a trie. The transition table rows are allocated in an overlapping manner, allowing the free cells of one row to be used by other rows.
    <li><strong><a href="http://linux.thai.net/~thep/datrie/datrie.html#Ref_Aoe1989">[Aoe1989]</a></strong> proposed an improvement on the three-array structure by reducing the arrays to two. </li>
</ol>
<a name="Tripple">
<h2>Triple-Array Trie</h2>
</a>
<p>As explained in <a href="http://linux.thai.net/~thep/datrie/datrie.html#Ref_Aho+1985">[Aho+1985]</a> pp. 144-146, a DFA can be compressed using four linear arrays, namely <cite>default</cite>, <cite>base</cite>, <cite>next</cite>, and <cite>check</cite>. However, in a case simpler than a lexical analyzer, such as a plain trie for information retrieval, the <cite>default</cite> array can be omitted. Thus, a trie can be implemented using three arrays according to this scheme.</p>
<h3>Structure</h3>
<p>The triple-array structure is composed of:</p>
<ol>
    <li><strong><em>base</em></strong>. Each element in <cite>base</cite> corresponds to a node of the trie. For a trie node <cite>s</cite>, <cite>base</cite>[<cite>s</cite>] is the starting index within the <cite>next</cite> and <cite>check</cite> pool (to be explained later) for the row of the node <cite>s</cite> in the transition table.
    <li><strong><em>next</em></strong>. This array, in coordination with <cite>check</cite>, provides a pool for the allocation of the sparse vectors for the rows in the trie transition table. The vector data, that is, the vector of transitions from every node, would be stored in this array.
    <li><strong><em>check</em></strong>. This array works in parallel to <cite>next</cite>. It marks the owner of every cell in <cite>next</cite>. This allows the cells next to one another to be allocated to different trie nodes. That means the sparse vectors of transitions from more than one node are allowed to be overlapped. </li>
</ol>
<div class="definition">
<p><strong>Definition 1.</strong> For a transition from state <cite>s</cite> to <cite>t</cite> which takes character <cite>c</cite> as the input, the condition maintained in the triple-array trie is:</p>
<blockquote><cite>check</cite>[<cite>base</cite>[<cite>s</cite>] + <cite>c</cite>] = <cite>s</cite><br />
<cite>next</cite>[<cite>base</cite>[<cite>s</cite>] + <cite>c</cite>] = <cite>t</cite> </blockquote></div>
<p><img alt="Triple-Array Structure" src="http://linux.thai.net/~thep/datrie/tripple.gif" /></p>
<h3>Walking</h3>
<p>According to <strong>definition 1</strong>, the walking algorithm for a given state <cite>s</cite> and the input character <cite>c</cite> is:</p>
<div class="pseudocode"><cite>t</cite> := <cite>base</cite>[<cite>s</cite>] + <cite>c</cite>;<br />
<strong><em>if</em></strong> <cite>check</cite>[<cite>t</cite>] = <cite>s</cite> <strong><em>then</em></strong> <cite>next state</cite> := <cite>next</cite>[<cite>t</cite>] <strong><em>else</em></strong> <cite>fail</cite> <strong><em>endif</em></strong> </div>
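<p>As a rough illustration of Definition 1, the following sketch hand-builds the three arrays for a tiny hypothetical trie and walks it. The state numbers, the character codes (a=1, b=2, c=3) and the name <cite>nxt</cite> (used in place of <cite>next</cite>) are assumptions made for this example only. Note how the single transition of state 1 occupies cell 2, which lies between the two cells of state 0.</p>

```python
NONE = -1                      # marks a pool cell that no state owns

# Hypothetical trie: 0 -a-> 1, 0 -c-> 2, 1 -b-> 3, with a=1, b=2, c=3.
base  = [0, 0, 0, 0]           # base[s]: start of the row of state s in the pool
check = [NONE, 0, 1, 0]        # check[i]: state that owns pool cell i
nxt   = [NONE, 1, 3, 2]        # nxt[i]: target state stored in pool cell i

def walk(s, c):
    """Return the next state from s on input c, or None on failure."""
    t = base[s] + c
    if 0 <= t < len(check) and check[t] == s:
        return nxt[t]
    return None
```

<p>Here <cite>walk(0, 2)</cite> fails even though cell 2 is occupied, because <cite>check</cite> records that the cell belongs to state 1.</p>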
<h3>Construction</h3>
<p>To insert a transition that takes character <cite>c</cite> to traverse from a state <cite>s</cite> to another state <cite>t</cite>, the cell <cite>next</cite>[<cite>base</cite>[<cite>s</cite>] + <cite>c</cite>] must be made available. If it is already vacant, we are lucky. Otherwise, either the entire transition vector of the current owner of the cell or that of the state <cite>s</cite> itself must be relocated. The estimated cost of each case can determine which one to move. After finding free slots to place the vector, the transition vector must be recalculated as follows. Assuming the new place begins at <cite>b</cite>, the procedure for the relocation is:</p>
<div class="pseudocode"><strong><em>Procedure</em></strong> <cite>Relocate</cite>(<cite>s</cite> : <strong><em>state</em></strong>; <cite>b</cite> : <strong><em>base_index</em></strong>)<br />
<span class="comment">{ Move base for state <cite>s</cite> to a new place beginning at <cite>b</cite> }</span><br />
<strong><em>begin</em></strong><br />
&nbsp;&nbsp;&nbsp;&nbsp;<strong><em>foreach</em></strong> input character <cite>c</cite> for the state <cite>s</cite><br />
&nbsp;&nbsp;&nbsp;&nbsp;<span class="comment">{ i.e. foreach <cite>c</cite> such that <cite>check</cite>[<cite>base</cite>[<cite>s</cite>] + <cite>c</cite>] = <cite>s</cite> }</span><br />
&nbsp;&nbsp;&nbsp;&nbsp;<strong><em>begin</em></strong><br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<cite>check</cite>[<cite>b</cite> + <cite>c</cite>] := <cite>s</cite>; <span class="comment">{ mark owner }</span><br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<cite>next</cite>[<cite>b</cite> + <cite>c</cite>] := <cite>next</cite>[<cite>base</cite>[<cite>s</cite>] + <cite>c</cite>]; <span class="comment">{ copy data }</span><br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<cite>check</cite>[<cite>base</cite>[<cite>s</cite>] + <cite>c</cite>] := <strong><em>none</em></strong> <span class="comment">{ free the cell }</span><br />
&nbsp;&nbsp;&nbsp;&nbsp;<strong><em>end</em></strong>;<br />
&nbsp;&nbsp;&nbsp;&nbsp;<cite>base</cite>[<cite>s</cite>] := <cite>b</cite><br />
<strong><em>end</em></strong> </div>
<p><img alt="Triple-Array Relocation" src="http://linux.thai.net/~thep/datrie/tripreloc.gif" /></p>
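<p>A minimal Python sketch of this relocation follows, assuming the caller has already verified that every target cell <cite>b</cite> + <cite>c</cite> is free. The function signature, the <cite>alphabet</cite> parameter and the example arrays are hypothetical. One deliberate difference from the pseudocode: the labels are collected before any cell is written, so that a freshly written cell cannot be mistaken for an old one when the old and new rows interleave.</p>

```python
NONE = -1                                   # marks a free pool cell

def relocate(base, check, nxt, s, b, alphabet):
    """Move the transition row of state s so that it starts at index b."""
    old = base[s]
    # Collect the labels of s first; the loop below modifies check.
    labels = [c for c in alphabet
              if old + c < len(check) and check[old + c] == s]
    for c in labels:
        check[b + c] = s                    # mark owner
        nxt[b + c] = nxt[old + c]           # copy data
        check[old + c] = NONE               # free the old cell
    base[s] = b

# Hypothetical trie: 0 -a-> 1, 0 -c-> 2, 1 -b-> 3, with a=1, b=2, c=3.
base  = [0, 0, 0, 0]
check = [NONE, 0, 1, 0, NONE, NONE, NONE, NONE]
nxt   = [NONE, 1, 3, 2, NONE, NONE, NONE, NONE]
relocate(base, check, nxt, s=0, b=3, alphabet=range(1, 4))
```

<p>After the call, the row of state 0 lives in cells 4 and 6, while the transition of state 1 in cell 2 is untouched.</p>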
<a name="Double">
<h2>Double-Array Trie</h2>
</a>
<p>The triple-array structure for implementing a trie appears to be well defined, but it is still not practical to keep in a single file. The <strong><em>next/check</em></strong> pool can be kept in a single array of integer pairs, but the <strong><em>base</em></strong> array does not grow in parallel with the pool, and is therefore usually kept separately.</p>
<p>To solve this problem, <a href="http://linux.thai.net/~thep/datrie/datrie.html#Ref_Aoe1989">[Aoe1989]</a> reduced the structure to two parallel arrays. In the double-array structure, the <strong><em>base</em></strong> and <strong><em>next</em></strong> arrays are merged, resulting in only two parallel arrays, namely <strong><em>base</em></strong> and <strong><em>check</em></strong>.</p>
<h3>Structure</h3>
<p>Instead of referencing indirectly through <cite>state numbers</cite> as in the triple-array trie, nodes in the double-array trie are linked directly within the <strong><em>base/check</em></strong> pool.</p>
<div class="definition">
<p><strong>Definition 2.</strong> For a transition from state <cite>s</cite> to <cite>t</cite> which takes character <cite>c</cite> as the input, the condition maintained in the double-array trie is:</p>
<blockquote><cite>check</cite>[<cite>base</cite>[<cite>s</cite>] + <cite>c</cite>] = <cite>s</cite><br />
<cite>base</cite>[<cite>s</cite>] + <cite>c</cite> = <cite>t</cite> </blockquote></div>
<p><img alt="Double-Array Structure" src="http://linux.thai.net/~thep/datrie/double.gif" /></p>
<h3>Walking</h3>
<p>According to <strong>definition 2</strong>, the walking algorithm for a given state <cite>s</cite> and the input character <cite>c</cite> is:</p>
<div class="pseudocode"><cite>t</cite> := <cite>base</cite>[<cite>s</cite>] + <cite>c</cite>;<br />
<strong><em>if</em></strong> <cite>check</cite>[<cite>t</cite>] = <cite>s</cite> <strong><em>then</em></strong> <cite>next state</cite> := <cite>t</cite> <strong><em>else</em></strong> <cite>fail</cite> <strong><em>endif</em></strong> </div>
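<p>The same tiny trie can be sketched in double-array form, where the cell index itself is the state. The arrays below are hand-built for illustration; the character codes (a=1, b=2, c=3), the <cite>FREE</cite> marker and the helper <cite>lookup</cite> are assumptions of this example. The root is placed in cell 1, and cell 0 is left unused.</p>

```python
FREE = -1                       # marks a free pool cell
ROOT = 1                        # the root is placed in cell 1; cell 0 is not a state

# Hypothetical trie: 1 -a-> 2, 1 -c-> 4, 2 -b-> 3, with a=1, b=2, c=3.
base  = [0, 1, 1, 0, 0]         # base[s] + c is the cell of the child reached on c
check = [FREE, 0, 1, 2, 1]      # check[t]: parent state of cell t

def walk(s, c):
    """Return the next state from s on input c, or None on failure."""
    t = base[s] + c
    if 0 <= t < len(check) and check[t] == s:
        return t
    return None

def lookup(codes):
    """Walk a whole sequence of character codes from the root."""
    s = ROOT
    for c in codes:
        s = walk(s, c)
        if s is None:
            return None
    return s
```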
<h3>Construction</h3>
<p>The construction of the double-array trie is in principle the same as that of the triple-array trie. The difference is the base relocation:</p>
<div class="pseudocode"><strong><em>Procedure</em></strong> <cite>Relocate</cite>(<cite>s</cite> : <strong><em>state</em></strong>; <cite>b</cite> : <strong><em>base_index</em></strong>)<br />
<span class="comment">{ Move base for state <cite>s</cite> to a new place beginning at <cite>b</cite> }</span><br />
<strong><em>begin</em></strong><br />
&nbsp;&nbsp;&nbsp;&nbsp;<strong><em>foreach</em></strong> input character <cite>c</cite> for the state <cite>s</cite><br />
&nbsp;&nbsp;&nbsp;&nbsp;<span class="comment">{ i.e. foreach <cite>c</cite> such that <cite>check</cite>[<cite>base</cite>[<cite>s</cite>] + <cite>c</cite>] = <cite>s</cite> }</span><br />
&nbsp;&nbsp;&nbsp;&nbsp;<strong><em>begin</em></strong><br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<cite>check</cite>[<cite>b</cite> + <cite>c</cite>] := <cite>s</cite>; <span class="comment">{ mark owner }</span><br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<cite>base</cite>[<cite>b</cite> + <cite>c</cite>] := <cite>base</cite>[<cite>base</cite>[<cite>s</cite>] + <cite>c</cite>]; <span class="comment">{ copy data }</span><br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="comment">{ the node <cite>base</cite>[<cite>s</cite>] + <cite>c</cite> is to be moved to <cite>b</cite> + <cite>c</cite>; hence, for any <cite>i</cite> for which <cite>check</cite>[<cite>i</cite>] = <cite>base</cite>[<cite>s</cite>] + <cite>c</cite>, update <cite>check</cite>[<cite>i</cite>] to <cite>b</cite> + <cite>c</cite> }</span><br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<strong><em>foreach</em></strong> input character <cite>d</cite> for the node <cite>base</cite>[<cite>s</cite>] + <cite>c</cite><br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<strong><em>begin</em></strong><br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<cite>check</cite>[<cite>base</cite>[<cite>base</cite>[<cite>s</cite>] + <cite>c</cite>] + <cite>d</cite>] := <cite>b</cite> + <cite>c</cite><br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<strong><em>end</em></strong>;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<cite>check</cite>[<cite>base</cite>[<cite>s</cite>] + <cite>c</cite>] := <strong><em>none</em></strong> <span class="comment">{ free the cell }</span><br />
&nbsp;&nbsp;&nbsp;&nbsp;<strong><em>end</em></strong>;<br />
&nbsp;&nbsp;&nbsp;&nbsp;<cite>base</cite>[<cite>s</cite>] := <cite>b</cite><br />
<strong><em>end</em></strong> </div>
<p><img alt="Double-Array Relocation" src="http://linux.thai.net/~thep/datrie/doubreloc.gif" /></p>
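<p>The double-array relocation can be sketched in Python as below. As before, the signature, the <cite>alphabet</cite> parameter and the sample arrays are hypothetical, and the caller is assumed to have checked that all target cells are free. The extra inner loop is the part that differs from the triple-array case: because a child's cell index is its state number, moving a child means re-parenting its own children.</p>

```python
FREE = -1                                    # marks a free pool cell

def relocate(base, check, s, b, alphabet):
    """Move the children of state s so that they start at base index b."""
    old = base[s]
    # Collect the labels of s first; the loop below modifies check.
    labels = [c for c in alphabet
              if old + c < len(check) and check[old + c] == s]
    for c in labels:
        src, dst = old + c, b + c
        check[dst] = s                       # mark owner
        base[dst] = base[src]                # copy data
        for d in alphabet:                   # re-parent the children of src
            g = base[src] + d
            if 0 <= g < len(check) and check[g] == src:
                check[g] = dst
        check[src] = FREE                    # free the old cell
        base[src] = 0
    base[s] = b

# Hypothetical trie: 1 -a-> 2, 1 -c-> 4, 2 -b-> 3 (a=1, b=2, c=3; root is cell 1).
base  = [0, 1, 1, 0, 0, 0, 0, 0]
check = [FREE, 0, 1, 2, 1, FREE, FREE, FREE]
relocate(base, check, s=1, b=4, alphabet=range(1, 4))
```

<p>After the call, the children of the root sit in cells 5 and 7, and the former grandchild in cell 3 now names cell 5 as its parent.</p>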
<a name="Suffix">
<h2>Suffix Compression</h2>
</a>
<p><a href="http://linux.thai.net/~thep/datrie/datrie.html#Ref_Aoe1989">[Aoe1989]</a> also suggested a storage compression strategy: non-branching suffixes are moved into a separate string store, called the <strong><em>tail</em></strong>, so that the remaining non-branching steps are reduced to a mere string comparison.</p>
<p>With the two separate data structures, double-array branches and suffix-spool tail, key insertion and deletion algorithms must be modified accordingly.</p>
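<p>To make the branch/tail split concrete, here is a rough lookup sketch over hand-built arrays for the two keys "pool#" and "prize#". Encoding a tail pointer as a negative <cite>base</cite> value is an assumption of this sketch, as are the character codes, the pool size and the name <cite>contains</cite>; the text above does not prescribe any of them.</p>

```python
SIZE = 22
FREE = -1
ROOT = 1

def code(ch):
    """Assumed character codes: a=1 ... z=26, '#'=27."""
    return 27 if ch == "#" else ord(ch) - ord("a") + 1

base = [0] * SIZE
check = [FREE] * SIZE
tail = [None, "ol#", "ize#"]                # tail blocks; index 0 is unused

base[ROOT] = 1                              # root -p-> cell 17
check[17] = ROOT; base[17] = 3              # cell 17 branches on 'o' and 'r'
check[18] = 17;   base[18] = -1             # separate node: rest of "pool#" is tail[1]
check[21] = 17;   base[21] = -2             # separate node: rest of "prize#" is tail[2]

def contains(key):
    key += "#"
    s = ROOT
    for i, ch in enumerate(key):
        if base[s] < 0:                     # separate node: finish by string comparison
            return tail[-base[s]] == key[i:]
        t = base[s] + code(ch)
        if t >= SIZE or check[t] != s:
            return False
        s = t
    return base[s] < 0 and tail[-base[s]] == ""
```

<p>Only the branching prefix ("p", then "o" or "r") occupies double-array cells; everything after the last branch is compared as an ordinary string.</p>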
<a name="Insert">
<h2>Key Insertion</h2>
</a>
<p>To insert a new key, the branching position can be found by traversing the trie with the key, character by character, until it gets stuck. The state where there is no branch to follow is the very place to insert a new edge, labeled with the failing character. However, with the branch-tail structure, the insertion point can be either in the branch or in the tail.</p>
<h3>1. When the branching point is in the double-array structure</h3>
<p>Suppose that the new key is a string a<sub>1</sub>a<sub>2</sub>...a<sub>h-1</sub>a<sub>h</sub>a<sub>h+1</sub>...a<sub>n</sub>, where a<sub>1</sub>a<sub>2</sub>...a<sub>h-1</sub> traverses the trie from the root to a node s<sub>r</sub> in the double-array structure, and there is no edge labeled a<sub>h</sub> that goes out of s<sub>r</sub>. The algorithm called <strong><em>A_INSERT</em></strong> in <a href="http://linux.thai.net/~thep/datrie/datrie.html#Ref_Aoe1989">[Aoe1989]</a> proceeds as follows: </p>
<div class="pseudocode">From s<sub>r</sub>, insert edge labeled a<sub>h</sub> to new node s<sub>t</sub>;<br />
Let s<sub>t</sub> be a separate node pointing to a string a<sub>h+1</sub>...a<sub>n</sub> in tail pool. </div>
<p><img alt="A_INSERT algorithm" src="http://linux.thai.net/~thep/datrie/a_insert.gif" /></p>
<h3>2. When the branching point is in the tail pool</h3>
<p>Since the path through a tail string has no branch, and therefore corresponds to exactly one key, suppose that the key corresponding to the tail is</p>
<p>a<sub>1</sub>a<sub>2</sub>...a<sub>h-1</sub>a<sub>h</sub>...a<sub>h+k-1</sub>b<sub>1</sub>...b<sub>m</sub>,<br />
<br />
where a<sub>1</sub>a<sub>2</sub>...a<sub>h-1</sub> is in double-array structure, and a<sub>h</sub>...a<sub>h+k-1</sub>b<sub>1</sub>...b<sub>m</sub> is in tail. Suppose that the substring a<sub>1</sub>a<sub>2</sub>...a<sub>h-1</sub> traverses the trie from the root to a node s<sub>r</sub>. </p>
<p>And suppose that the new key is in the form</p>
<p>a<sub>1</sub>a<sub>2</sub>...a<sub>h-1</sub>a<sub>h</sub>...a<sub>h+k-1</sub>a<sub>h+k</sub>...a<sub>n</sub>,<br />
<br />
where a<sub>h+k</sub> &lt;&gt; b<sub>1</sub>. The algorithm called <strong><em>B_INSERT</em></strong> in <a href="http://linux.thai.net/~thep/datrie/datrie.html#Ref_Aoe1989">[Aoe1989]</a> proceeds as follows: </p>
<div class="pseudocode">From s<sub>r</sub>, insert straight path with a<sub>h</sub>...a<sub>h+k-1</sub>, ending at a new node s<sub>t</sub>;<br />
From s<sub>t</sub>, insert edge labeled b<sub>1</sub> to new node s<sub>u</sub>;<br />
Let s<sub>u</sub> be separate node pointing to a string b<sub>2</sub>...b<sub>m</sub> in tail pool;<br />
From s<sub>t</sub>, insert edge labeled a<sub>h+k</sub> to new node s<sub>v</sub>;<br />
Let s<sub>v</sub> be separate node pointing to a string a<sub>h+k+1</sub>...a<sub>n</sub> in tail pool. </div>
<p><img alt="B_INSERT algorithm" src="http://linux.thai.net/~thep/datrie/b_insert.gif" /></p>
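<p>The string bookkeeping behind B_INSERT can be illustrated separately from the array manipulation. The hypothetical helper below only computes the five pieces named in the steps above from the existing tail string and the unmatched remainder of the new key; it does not touch a double array. It assumes that both strings end with the "#" terminator, so that neither can be a proper prefix of the other.</p>

```python
def split_for_b_insert(tail_str, rest):
    """Split an existing tail and the remainder of a new key at their first difference.

    Returns (common, b1, old_tail, a, new_tail):
      common   -- shared prefix that becomes a straight path in the double array
      b1       -- first differing character of the existing key
      old_tail -- what stays in the tail pool for the existing key
      a        -- first differing character of the new key
      new_tail -- what goes into the tail pool for the new key
    """
    k = 0
    while k < min(len(tail_str), len(rest)) and tail_str[k] == rest[k]:
        k += 1
    if k == len(tail_str) or k == len(rest):
        raise ValueError("key already present")
    return (tail_str[:k], tail_str[k], tail_str[k + 1:],
            rest[k], rest[k + 1:])
```

<p>For example, with "pool#" stored as the branch "po" plus the tail "ol#", inserting "poor#" leaves "or#" unmatched. The helper reports a common path "o", then the two new edges "l" and "r", each followed by the tail "#".</p>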
<a name="Delete">
<h2>Key Deletion</h2>
</a>
<p>To delete a key from the trie, all we need to do is delete the tail block occupied by the key, and all double-array nodes belonging exclusively to the key, without touching any node belonging to other keys.</p>
<p>Consider a trie which accepts a language K = {pool#, prepare#, preview#, prize#, produce#, producer#, progress#} : </p>
<p><img alt="example trie" src="http://linux.thai.net/~thep/datrie/trie2.gif" /></p>
<p>The key "pool#" can be deleted by removing the tail string "ol#" from the tail pool, and node 3 from the double-array structure. This is the simplest case.</p>
<p>To remove the key "produce#", it is sufficient to delete node 14 from the double-array structure. But the resulting trie will not obey the convention that every node in the double-array structure, except the separate nodes which point to tail blocks, must belong to more than one key. The path from node 10 on will belong solely to the key "producer#".</p>
<p>But there is no harm in violating this rule. The only drawback is that the trie becomes less compact. Traversal, insertion and deletion algorithms remain intact. Therefore, this rule should be relaxed, for the sake of simplicity and efficiency of the deletion algorithm. Otherwise, there must be extra steps to examine other keys in the same subtree ("producer#" for the deletion of "produce#") if any node needs to be moved from the double-array structure to the tail pool.</p>
<p>Suppose further that having removed "produce#" as such (by removing only node 14), we also need to remove "producer#" from the trie. What we have to do is remove string "#" from tail, and remove nodes 15, 13, 12, 11, 10 (which now belong solely to the key "producer#") from the double-array structure.</p>
<p>We can thus summarize the algorithm to delete a key k = a<sub>1</sub>a<sub>2</sub>...a<sub>h-1</sub>a<sub>h</sub>...a<sub>n</sub>, where a<sub>1</sub>a<sub>2</sub>...a<sub>h-1</sub> is in double-array structure, and a<sub>h</sub>...a<sub>n</sub> is in tail pool, as follows : </p>
<div class="pseudocode">Let <cite>s<sub>r</sub></cite> := the node reached by a<sub>1</sub>a<sub>2</sub>...a<sub>h-1</sub>;<br />
Delete a<sub>h</sub>...a<sub>n</sub> from tail;<br />
<cite>s</cite> := <cite>s<sub>r</sub></cite>;<br />
<strong><em>repeat</em></strong><br />
&nbsp;&nbsp;&nbsp;&nbsp;<cite>p</cite> := parent of <cite>s</cite>;<br />
&nbsp;&nbsp;&nbsp;&nbsp;Delete node <cite>s</cite> from double-array structure;<br />
&nbsp;&nbsp;&nbsp;&nbsp;<cite>s</cite> := <cite>p</cite><br />
<strong><em>until</em></strong> <cite>s</cite> = root <strong><em>or</em></strong> <cite>outdegree</cite>(<cite>s</cite>) &gt; 0. </div>
<p>Here <cite>outdegree</cite>(<cite>s</cite>) is the number of child nodes of <cite>s</cite>.</p> <a name="Alloc">
<h2>Double-Array Pool Allocation</h2>
</a>
<p>When inserting a new branch for a node, it is possible that the array element for the new branch has already been allocated to another node. In that case, relocation is needed. The efficiency-critical part then turns out to be the search for a new place. A brute force algorithm iterates along the <cite>check</cite> array to find an empty cell to place the first branch, and then ensures that there are empty cells for all other branches as well. The time used is therefore proportional to the size of the double-array pool and the size of the alphabet.</p>
<p>Suppose that there are <cite>n</cite> nodes in the trie, and the alphabet is of size <cite>m</cite>. The size of the double-array structure would be <cite>n</cite> + <cite>cm</cite>, where <cite>c</cite> is a coefficient which is dependent on the characteristic of the trie. And the time complexity of the brute force algorithm would be O(<cite>nm</cite> + <cite>cm<sup>2</sup></cite>).</p>
<p><a href="http://linux.thai.net/~thep/datrie/datrie.html#Ref_Aoe1989">[Aoe1989]</a> proposed a free-space list in the double-array structure to make the time complexity independent of the size of the trie, and dependent only on the number of free cells. The <cite>check</cite> array for the free cells is redefined to keep a pointer to the next free cell (called G-link) : </p>
<div class="definition">
<p><strong>Definition 3.</strong> Let r<sub>1</sub>, r<sub>2</sub>, ... , r<sub>cm</sub> be the free cells in the double-array structure, ordered by position. G-link is defined as follows :
<blockquote><cite>check</cite>[0] = -r<sub>1</sub><br />
<cite>check</cite>[r<sub>i</sub>] = -r<sub>i+1</sub> ; 1 &lt;= i &lt;= cm-1<br />
<cite>check</cite>[r<sub>cm</sub>] = -1 </blockquote></div>
<p>By this definition, negative <cite>check</cite> means unoccupied in the same sense as that for "none" <cite>check</cite> in the ordinary algorithm. This encoding scheme forms a singly-linked list of free cells. When searching for an empty cell, only <cite>cm</cite> free cells are visited, instead of all <cite>n</cite> + <cite>cm</cite> cells as in the brute force algorithm.</p>
<p>This, however, can still be improved. Notice that for those cells with negative <cite>check</cite>, the corresponding <cite>base</cite> values are not given any definition. Therefore, in our implementation, Aoe's G-link is modified to be a doubly-linked list by letting <cite>base</cite> of every free cell point to the previous free cell. This can speed up the insertion and deletion processes. And, for convenience in referencing the list head and tail, we let the list be circular. The zeroth node is dedicated to be the entry point of the list. And the root node of the trie will begin with cell number one.</p>
<div class="definition">
<p><strong>Definition 4.</strong> Let r<sub>1</sub>, r<sub>2</sub>, ... , r<sub>cm</sub> be the free cells in the double-array structure, ordered by position. G-link is defined as follows :
<blockquote><cite>check</cite>[0] = -r<sub>1</sub><br />
<cite>check</cite>[r<sub>i</sub>] = -r<sub>i+1</sub> ; 1 &lt;= i &lt;= cm-1<br />
<cite>check</cite>[r<sub>cm</sub>] = 0<br />
<cite>base</cite>[0] = -r<sub>cm</sub><br />
<cite>base</cite>[r<sub>1</sub>] = 0<br />
<cite>base</cite>[r<sub>i+1</sub>] = -r<sub>i</sub> ; 1 &lt;= i &lt;= cm-1 </blockquote></div>
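<p>Definition 4 can be checked with a short sketch that threads a given set of free cells into the circular doubly-linked list and then walks it in both directions. The function names are hypothetical, and the arrays hold only the list links (occupied cells are left at 0 here for brevity).</p>

```python
def build_free_list(base, check, free_cells):
    """Link the free cells into a circular list whose head is cell 0."""
    ring = [0] + sorted(free_cells)
    n = len(ring)
    for i, r in enumerate(ring):
        check[r] = -ring[(i + 1) % n]       # link to the next free cell
        base[r] = -ring[(i - 1) % n]        # link to the previous free cell

def free_cells_forward(check):
    r = -check[0]
    while r != 0:                           # back at the head: list exhausted
        yield r
        r = -check[r]

def free_cells_backward(base):
    r = -base[0]
    while r != 0:
        yield r
        r = -base[r]

base = [0] * 8
check = [0] * 8
build_free_list(base, check, [5, 3, 6])
```

<p>With free cells 3, 5 and 6, the forward links run 0, 3, 5, 6 and back to 0, and the backward links run the other way round.</p>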
<p>Then, the searching for the slots for a node with input symbol set P = {c<sub>1</sub>, c<sub>2</sub>, ..., c<sub>p</sub>} needs to iterate only the cells with negative <cite>check</cite> : </p>
<div class="pseudocode"><span class="comment">{find least free cell s such that s &gt; c<sub>1</sub>}</span><br />
s := -<cite>check</cite>[0];<br />
<strong><em>while</em></strong> s &lt;&gt; 0 <strong><em>and</em></strong> s &lt;= c<sub>1</sub> <strong><em>do</em></strong><br />
&nbsp;&nbsp;&nbsp;&nbsp;s := -<cite>check</cite>[s]<br />
<strong><em>end</em></strong>;<br />
<strong><em>if</em></strong> s = 0 <strong><em>then</em></strong> <strong><em>return</em></strong> <em>FAIL</em>; <span class="comment">{or reserve some additional space}</span><br />
<br />
<span class="comment">{continue searching for the row, given that s matches c<sub>1</sub>}</span><br />
<strong><em>while</em></strong> s &lt;&gt; 0 <strong><em>do</em></strong><br />
&nbsp;&nbsp;&nbsp;&nbsp;i := 2;<br />
&nbsp;&nbsp;&nbsp;&nbsp;<strong><em>while</em></strong> i &lt;= p <strong><em>and</em></strong> <cite>check</cite>[s + c<sub>i</sub> - c<sub>1</sub>] &lt; 0 <strong><em>do</em></strong><br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;i := i + 1<br />
&nbsp;&nbsp;&nbsp;&nbsp;<strong><em>end</em></strong>;<br />
&nbsp;&nbsp;&nbsp;&nbsp;<strong><em>if</em></strong> i = p + 1 <strong><em>then</em></strong> <strong><em>return</em></strong> s - c<sub>1</sub>; <span class="comment">{all cells required are free, so return it}</span><br />
&nbsp;&nbsp;&nbsp;&nbsp;s := -<cite>check</cite>[s]<br />
<strong><em>end</em></strong>;<br />
<strong><em>return</em></strong> <cite>FAIL</cite>; <span class="comment">{or reserve some additional space}</span> </div>
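<p>A Python sketch of this search is given below, under the assumption that the free list follows Definition 4 and that occupied cells carry a non-negative <cite>check</cite>. The name <cite>find_free_base</cite> and the sample pool are hypothetical. The list is advanced with s := -<cite>check</cite>[s], and the function returns <cite>None</cite> where the pseudocode returns FAIL.</p>

```python
def find_free_base(check, symbols):
    """Return a base b such that b + c is a free cell for every c in symbols."""
    cs = sorted(symbols)
    c1 = cs[0]
    s = -check[0]
    while s != 0 and s <= c1:               # need s > c1 so that the base is positive
        s = -check[s]
    while s != 0:
        b = s - c1                          # candidate base: s hosts the first symbol
        # Under Definition 4 the last free cell has check == 0, so this test
        # conservatively treats it as unavailable for the remaining symbols.
        if all(b + c < len(check) and check[b + c] < 0 for c in cs[1:]):
            return b
        s = -check[s]
    return None

# Hypothetical pool: cells 2, 4, 5, 7, 9 are free; the others belong to state 1.
check = [-2, 1, -4, 1, -5, -7, 1, -9, 1, 0, 1, 1]
```

<p>Only the free cells are visited; the occupied cells in between are never inspected as candidates for the first symbol.</p>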
<p>The time complexity for free slot searching is reduced to O(<cite>cm<sup>2</sup></cite>). The relocation stage takes O(<cite>m<sup>2</sup></cite>). The total time complexity is therefore O(<cite>cm<sup>2</sup></cite> + <cite>m<sup>2</sup></cite>) = O(<cite>cm<sup>2</sup></cite>).</p>
<p>It is useful to keep the free list ordered by position, so that the access through the array becomes more sequential. This would be beneficial when the trie is stored in a disk file or virtual memory, because the disk caching or page swapping would be used more efficiently. So, the free cell reusing should maintain this strategy :</p>
<div class="pseudocode">t := -<cite>check</cite>[0];<br />
<strong><em>while</em></strong> t &lt;&gt; 0 <strong><em>and</em></strong> t &lt; s <strong><em>do</em></strong><br />
&nbsp;&nbsp;&nbsp;&nbsp;t := -<cite>check</cite>[t]<br />
<strong><em>end</em></strong>;<br />
<span class="comment">{t now points to the cell after s' place, or to the list head 0 if s goes last}</span><br />
<cite>check</cite>[s] := -t;<br />
<cite>check</cite>[-<cite>base</cite>[t]] := -s;<br />
<cite>base</cite>[s] := <cite>base</cite>[t];<br />
<cite>base</cite>[t] := -s; </div>
<p>Time complexity of freeing a cell is thus O(<cite>cm</cite>).</p>
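<p>Freeing a cell while keeping the list ordered can be sketched as follows. The function name is hypothetical. In this sketch the scan stops when it wraps around to the head cell 0, so a cell that lies beyond the last free cell is appended at the tail of the list and the position order is preserved.</p>

```python
def free_cell(base, check, s):
    """Insert cell s into the circular free list, keeping it ordered by position."""
    t = -check[0]
    while t != 0 and t < s:                 # find the first free cell after s
        t = -check[t]
    # t is now the successor of s, or 0 (the list head) if s goes last.
    check[s] = -t
    check[-base[t]] = -s                    # the predecessor of t now points to s
    base[s] = base[t]
    base[t] = -s

base = [0] * 8                              # empty free list: check[0] = base[0] = 0
check = [0] * 8
for cell in (5, 3, 6):
    free_cell(base, check, cell)
```

<p>Freeing cells 5, 3 and 6 in that order yields the same links that Definition 4 prescribes for the free cells 3, 5 and 6.</p>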
<a name="AnImp">
<h2>An Implementation</h2>
</a>
<p>In my implementation, I exploit the concept of virtual memory to manage a large and permanent trie. A base class called <strong>VirtualMem</strong> divides the data file into pages, and swaps them in as needed. Memory economy and disk caching are thus achieved. Different memory accessing methods are provided so that the page swapping mechanism is hidden from the class users. Meanwhile, byte order is internally managed on-the-fly inside the methods to achieve data portability.</p>
<p>Double-array structure and tail pool are then built upon the <strong>VirtualMem</strong> base class. <strong>Trie</strong> class is then created to contain the two data structures, with basic interfaces for trie manipulation.</p>
<a name="Download">
<h2>Download</h2>
</a>
<p><strong>Update:</strong> The double-array trie implementation has been simplified and rewritten from scratch in C, and is now named <tt>datrie</tt>. It is now available under the terms of <a href="http://www.gnu.org/licenses/lgpl.html">GNU Lesser General Public License (LGPL)</a>:</p>
<ul>
    <li><a href="ftp://linux.thai.net/pub/ThaiLinux/software/libthai/datrie-0.1.0.tar.gz">datrie-0.1.0</a> (18 September 2006) </li>
</ul>
<p>The old C++ source code below is under the terms of <a href="http://www.gnu.org/licenses/lgpl.html">GNU Lesser General Public License (LGPL)</a>:</p>
<ul>
    <li><a href="http://linux.thai.net/~thep/datrie/download/midatrie-0.3.4.tar.gz">midatrie-0.3.4</a> (2 October 2001)
    <li><a href="http://linux.thai.net/~thep/datrie/download/midatrie-0.3.3.tar.gz">midatrie-0.3.3</a> (16 July 2001)
    <li><a href="http://linux.thai.net/~thep/datrie/download/midatrie-0.3.2.tar.gz">midatrie-0.3.2</a> (21 May 2001)
    <li><a href="http://linux.thai.net/~thep/datrie/download/midatrie-0.3.1.tar.gz">midatrie-0.3.1</a> (8 May 2001)
    <li><a href="http://linux.thai.net/~thep/datrie/download/midatrie-0.3.0.tar.gz">midatrie-0.3.0</a> (23 Mar 2001) </li>
</ul>
<a name="References">
<h2>References</h2>
</a>
<ol>
    <li><a name="Ref_Knuth1972"><strong>[Knuth1972]</strong> Knuth, D. E. <strong>The Art of Computer Programming Vol. 3, Sorting and Searching.</strong> Addison-Wesley. 1972.</a>
    <li><a name="Ref_Fredkin1960"><strong>[Fredkin1960]</strong> Fredkin, E. <cite>Trie Memory.</cite> <strong>Communications of the ACM.</strong> Vol. 3:9 (Sep 1960). pp. 490-499.</a>
    <li><a name="Ref_Cohen1990"><strong>[Cohen1990]</strong> Cohen, D. <strong>Introduction to Theory of Computing.</strong> John Wiley &amp; Sons. 1990.</a>
    <li><a name="Ref_Johnson1975"><strong>[Johnson1975]</strong> Johnson, S. C. <strong>YACC-Yet another compiler-compiler.</strong> Bell Lab. NJ. Computing Science Technical Report 32. pp.1-34. 1975.</a>
    <li><a name="Ref_Aho+1985"><strong>[Aho+1985]</strong> Aho, A. V., Sethi, R., Ullman, J. D. <strong>Compilers : Principles, Techniques, and Tools.</strong> Addison-Wesley. 1985.
    <li><a name="Ref_Aoe1989"><strong>[Aoe1989]</strong> Aoe, J. <cite>An Efficient Digital Search Algorithm by Using a Double-Array Structure.</cite> <strong>IEEE Transactions on Software Engineering.</strong> Vol. 15, 9 (Sep 1989). pp. 1066-1077.
    <li><a name="Ref_Virach1993"><strong>[Virach+1993]</strong> Virach Sornlertlamvanich, Apichit Pittayaratsophon, Kriangchai Chansaenwilai. <cite>Thai Dictionary Data Base Manipulation using Multi-indexed Double Array Trie.</cite> <strong>5th Annual Conference.</strong> National Electronics and Computer Technology Center. Bangkok. 1993. pp 197-206. (in Thai) </li>
</ol>
<hr />
<p class="footer"><em>Theppitak Karoonboonyanan</em><br />
Created: 1999-06-13<br />
Last Updated: 2006-09-18<br />
<em><strong><a href="http://linux.thai.net/~thep/">Back to Theppitak's Homepage</a></strong></em><br />
<p>&nbsp;</p>
<p class="copyright">Copyright &#169; 1999 by Theppitak Karoonboonyanan, Software and Language Engineering Laboratory, National Electronics and Computer Technology Center. All rights reserved. </p>
<p class="copyright">Copyright &#169; 2003 by Theppitak Karoonboonyanan. All rights reserved. </p>
<p class="copyright">Copyright &#169; 2006 by Theppitak Karoonboonyanan. All rights reserved. </p>
]]></description></item><item><title>Build-mode mismatch when linking a lib (Debug/Release)</title><link>http://www.cnblogs.com/Alacky/archive/2008/01/24/1051712.html</link><dc:creator>*Alacky</dc:creator><author>*Alacky</author><pubDate>Thu, 24 Jan 2008 07:44:00 GMT</pubDate><guid>http://www.cnblogs.com/Alacky/archive/2008/01/24/1051712.html</guid><wfw:comment>http://www.cnblogs.com/Alacky/comments/1051712.html</wfw:comment><comments>http://www.cnblogs.com/Alacky/archive/2008/01/24/1051712.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cnblogs.com/Alacky/comments/commentRss/1051712.html</wfw:commentRss><trackback:ping>http://www.cnblogs.com/Alacky/services/trackbacks/1051712.html</trackback:ping><description><![CDATA[&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; When a C++ project links against a lib, make sure the lib was built in the same mode as the project. If the modes differ (for example, the lib is a Debug build and the project is a Release build), the linker emits a warning: LINK : warning LNK4098: defaultlib 'MSVCRT' conflicts with use of other libs; use /NODEFAULTLIB:library<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; You can of course ignore this warning, as you might any other, but the program will then report a memory error at run time, of the truly frustrating kind =.= So never mix up the build modes of your libraries.<img src="http://www.cnblogs.com/Emoticons/tusiji/20333097.gif"  alt="" />
]]></description></item><item><title>[Repost] Regular Expression Reference</title><link>http://www.cnblogs.com/Alacky/articles/1009391.html</link><dc:creator>*Alacky</dc:creator><author>*Alacky</author><pubDate>Fri, 21 Dec 2007 08:57:00 GMT</pubDate><guid>http://www.cnblogs.com/Alacky/articles/1009391.html</guid><description><![CDATA[<p><font size="1">[Original article; when reposting, please keep or cite the source: <a href="http://www.regexlab.com/zh/regref.htm">http://www.regexlab.com/zh/regref.htm</a>]</font></p>
<h4><a name="intro"></a><strong>Introduction</strong></h4>
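<p>&nbsp;&nbsp;&nbsp; A regular expression uses one &#8220;string&#8221; to describe a pattern and then checks whether another &#8220;string&#8221; fits that pattern. For example, the expression &#8220;ab+&#8221; describes &#8220;one 'a' followed by one or more 'b's&#8221;, so 'ab', 'abb', and 'abbbbbbbbbb' all fit it.<br />
<br />
&nbsp;&nbsp;&nbsp; Regular expressions can be used to: (1) validate that a string fits a given pattern, for example checking whether it is a valid email address; (2) search a long text for the substrings that fit a given pattern, which is more flexible than searching for a fixed string; (3) replace text, which is more powerful than plain replacement.<br />
<br />
&nbsp;&nbsp;&nbsp; Regular expressions are actually quite easy to learn, and the few fairly abstract concepts involved are easy to grasp. Many people find them complicated for two reasons. First, most documentation does not build up from simple to advanced and introduces concepts in no particular order, which makes them hard to follow. Second, the documentation shipped with each engine tends to focus on that engine's own special features, which are not what a beginner needs to understand first.<br />
<br />
&nbsp;&nbsp;&nbsp; Every example in this article links to a test page where you can try it out. Enough preamble; let's begin.</p>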
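<p>&nbsp;&nbsp;&nbsp; As a rough illustration of the three uses listed above, here is a minimal sketch using Python's standard <code>re</code> module. Python is an assumption of this sketch (the article itself is engine-neutral), and the sample strings are made up for the demonstration.</p>

```python
import re

pattern = r"ab+"  # one "a" followed by one or more "b"

# (1) Validate: does the whole string fit the pattern?
valid = [bool(re.fullmatch(pattern, s)) for s in ["ab", "abb", "a", "abc"]]
print(valid)  # [True, True, False, False]

# (2) Search: find every substring that fits the pattern.
found = re.findall(pattern, "xxabbbyyabzz")
print(found)  # ['abbb', 'ab']

# (3) Replace: substitute every match with another string.
replaced = re.sub(pattern, "#", "xxabbbyyabzz")
print(replaced)  # xx#yy#zz
```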
<blockquote>
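<p><font size="4">[<a href="http://www.regexlab.com/download/?/deelx/deelx_zh.chm"><img height="16" src="http://www.regexlab.com/images/chm.gif" width="15" border="0"  alt="" /> Download the chm version</a>]</font> - the DEELX regular expression syntax in chm format, which also covers other advanced syntax.</p>
<p><font color="#999999">(<strong>Note</strong>: the downloadable chm version is suited to lookup, while this article is suited to learning; we recommend that you keep reading.)</font></p>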
</blockquote>
<hr color="#fea089" size="1" />
<h4>1. Regular Expression Rules</h4>
<h5><a name="common"></a>1.1 Ordinary Characters</h5>
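<p>&nbsp;&nbsp;&nbsp; Letters, digits, Chinese characters, underscores, and any punctuation not given a special meaning in later sections are all "ordinary characters". An ordinary character in an expression matches one identical character in the string being matched.<br />
<br />
&nbsp;&nbsp;&nbsp; <a href="http://www.regexlab.com/zh/workshop.asp?pat=c&amp;txt=abcde">Example 1: matching the expression "c" against the string "abcde"</a>: the match succeeds; the matched text is "c"; the match starts at 2 and ends at 3. (Note: positions in this article count from 0, and the end position is the index just past the last matched character; whether indices start at 0 or 1 may differ between programming languages.)<br />
<br />
&nbsp;&nbsp;&nbsp; <a href="http://www.regexlab.com/zh/workshop.asp?pat=bcd&amp;txt=abcde">Example 2: matching the expression "bcd" against the string "abcde"</a>: the match succeeds; the matched text is "bcd"; the match starts at 1 and ends at 4.</p>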
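<p>&nbsp;&nbsp;&nbsp; The two examples above can be reproduced with a short sketch in Python's standard <code>re</code> module (the use of Python is an assumption of this sketch, not something the article prescribes). <code>span()</code> reports a 0-based start and an exclusive end, which agrees with the positions quoted above.</p>

```python
import re

# Ordinary characters match themselves; search() returns the first match.
m1 = re.search(r"c", "abcde")
m2 = re.search(r"bcd", "abcde")

print(m1.group(), m1.span())  # c (2, 3)
print(m2.group(), m2.span())  # bcd (1, 4)
```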
<hr color="#fea089" size="1" />
<h5><a name="escaped"></a>1.2 Simple Escape Sequences</h5>
<p>&nbsp;&nbsp;&nbsp; Some characters that are awkward to write are expressed with a leading "\". You already know most of these.</p>
<table style="border-collapse: collapse" cellspacing="0" cellpadding="3" bgcolor="#f8f8f8" border="1">
    <tbody>
        <tr bgcolor="#f0f0f0">
            <td width="70">
            <p>Expression</p>
            </td>
            <td>
            <p>Matches</p>
            </td>
        </tr>
        <tr>
            <td>
            <p>\r, \n</p>
            </td>
            <td>
            <p>Carriage return and line feed (newline)</p>
            </td>
        </tr>
        <tr>
            <td>
            <p>\t</p>
            </td>
            <td>
            <p>Tab</p>
            </td>
        </tr>
        <tr>
            <td>
            <p>\\</p>
            </td>
            <td>
            <p>A literal "\"</p>
            </td>
        </tr>
    </tbody>
</table>
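<p>&nbsp;&nbsp;&nbsp; Punctuation characters that have a special use in later sections stand for themselves when preceded by "\". For example, ^ and $ both have special meanings, so to match the characters "^" and "$" in a string, the expression must be written as "\^" and "\$".</p>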
<table style="border-collapse: collapse" cellspacing="0" cellpadding="3" bgcolor="#f8f8f8" border="1">
    <tbody>
        <tr bgcolor="#f0f0f0">
            <td width="66">
            <p>Expression</p>
            </td>
            <td>
            <p>Matches</p>
            </td>
        </tr>
        <tr>
            <td>
            <p>\^</p>
            </td>
            <td>
            <p>Matches a literal ^</p>
            </td>
        </tr>
        <tr>
            <td>
            <p>\$</p>
            </td>
            <td>
            <p>Matches a literal $</p>
            </td>
        </tr>
        <tr>
            <td>
            <p>\.</p>
            </td>
            <td>
            <p>Matches a literal dot (.)</p>
            </td>
        </tr>
    </tbody>
</table>
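<p>&nbsp;&nbsp;&nbsp; These escape sequences match in the same way as "ordinary characters": each matches one identical character.<br />
<br />
&nbsp;&nbsp;&nbsp; <a href="http://www.regexlab.com/zh/workshop.asp?pat=%5C$d&amp;txt=abc$de">Example 1: matching the expression "\$d" against the string "abc$de"</a>: the match succeeds; the matched text is "$d"; the match starts at 3 and ends at 5.</p>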
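<p>&nbsp;&nbsp;&nbsp; A minimal Python sketch of escaping, assuming the standard <code>re</code> module. The raw-string prefix <code>r"..."</code> is a Python convention that passes the backslash through to the regex engine unchanged; the second sample string is made up for the demonstration.</p>

```python
import re

# \$ is a literal dollar sign, so "\$d" matches the text "$d".
m = re.search(r"\$d", "abc$de")
print(m.group(), m.span())  # $d (3, 5)

# \^, \. and \\ match a literal caret, dot, and backslash respectively.
literals = re.findall(r"\^|\.|\\", r"a^b.c\d")
print(literals)  # ['^', '.', '\\']
```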
<hr color="#fea089" size="1" />
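<h5><a name="multi"></a>1.3 Expressions That Match Any One of Several Characters</h5>
<p>&nbsp;&nbsp;&nbsp; Some notations in regular expressions match any one character out of a set. For example, the expression "\d" matches any single digit. It can match any character in the set, but only one of them, not several. This is like a joker in a card game: it can stand in for any card, but only for one card.</p>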
<table style="border-collapse: collapse" cellspacing="0" cellpadding="3" bgcolor="#f8f8f8" border="1">
    <tbody>
        <tr bgcolor="#f0f0f0">
            <td width="66">
            <p>Expression</p>
            </td>
            <td>
            <p>Matches</p>
            </td>
        </tr>
        <tr>
            <td>
            <p><span id="pattern" name="pattern"><font color="#000000"><font color="#900050">\d</font></font></span></p>
            </td>
            <td>
            <p>Any single digit, 0 through 9</p>
            </td>
        </tr>
        <tr>
            <td>
            <p><span id="pattern" name="pattern"><font color="#000000"><font color="#900050">\w</font></font></span></p>
            </td>
            <td>
            <p>Any single letter, digit, or underscore, that is, any one of A~Z, a~z, 0~9, _</p>
            </td>
        </tr>
        <tr>
            <td>
            <p><span id="pattern" name="pattern"><font color="#000000"><font color="#900050">\s</font></font></span></p>
            </td>
            <td>
            <p>Any single whitespace character, including space, tab, and form feed</p>
            </td>
        </tr>
        <tr>
            <td>
            <p><span id="pattern" name="pattern"><font color="#000000"><font color="#900050">.</font></font></span></p>
            </td>
            <td>
            <p>The dot matches any single character except a newline (\n)</p>
            </td>
        </tr>
    </tbody>
</table>
<p>&nbsp;&nbsp;&nbsp; <a href="http://www.regexlab.com/zh/workshop.asp?pat=%5Cd%5Cd&amp;txt=abc123">Example 1: matching the expression "<span id="pattern" name="pattern"><font color="#000000"><font color="#900050">\d</font><font color="#900050">\d</font></font></span>" against "abc123"</a>: the match succeeds; the matched text is "12"; the match starts at 3 and ends at 5.<br />
<br />
&nbsp;&nbsp;&nbsp; <a href="http://www.regexlab.com/zh/workshop.asp?pat=a.%5Cd&amp;txt=aaa100">Example 2: matching the expression "<span id="pattern" name="pattern"><font color="#000000">a<font color="#900050">.</font><font color="#900050">\d</font></font></span>" against "aaa100"</a>: the match succeeds; the matched text is "aa1"; the match starts at 1 and ends at 4.</p>
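<p>&nbsp;&nbsp;&nbsp; The table and examples above can be checked with a small sketch, assuming Python's standard <code>re</code> module. One Python-specific caveat: on text strings, "\w" and "\d" also accept non-ASCII letters and digits by default, so the sketch passes <code>re.ASCII</code> to get the A~Z, a~z, 0~9, _ behaviour described in the table.</p>

```python
import re

m1 = re.search(r"\d\d", "abc123")
m2 = re.search(r"a.\d", "aaa100")
print(m1.group(), m1.span())  # 12 (3, 5)
print(m2.group(), m2.span())  # aa1 (1, 4)

# Each class consumes exactly one character per occurrence.
word_chars = re.findall(r"\w", "a_1 -?", flags=re.ASCII)
print(word_chars)  # ['a', '_', '1']

# "." does not match "\n" unless the DOTALL flag is set.
no_match = re.search(r"a.b", "a\nb")
print(no_match)  # None
```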
<hr color="#fea089" size="1" />
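<h5><a name="custom"></a>1.4 Custom Expressions That Match Any One of Several Characters</h5>
<p>&nbsp;&nbsp;&nbsp; A list of characters enclosed in square brackets [ ] matches any one of those characters. A list enclosed in [^ ] matches any one character that is not in the list. As before, it can match any one of the candidates, but only one, not several.</p>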
<table style="border-collapse: collapse" cellspacing="0" cellpadding="3" bgcolor="#f8f8f8" border="1">
    <tbody>
        <tr bgcolor="#f0f0f0">
            <td width="80">
            <p>Expression</p>
            </td>
            <td>
            <p>Matches</p>
            </td>
        </tr>
        <tr>
            <td>
            <p><span id="pattern" name="pattern"><font color="#000000"><font color="#900050">[ab5@]</font></font></span></p>
            </td>
            <td>
            <p>Matches "a", "b", "5", or "@"</p>
            </td>
        </tr>
        <tr>
            <td>
            <p><span id="pattern" name="pattern"><font color="#000000"><font color="#900050">[^abc]</font></font></span></p>
            </td>
            <td>
            <p>Matches any single character other than "a", "b", "c"</p>
            </td>
        </tr>
        <tr>
            <td>
            <p><span id="pattern" name="pattern"><font color="#000000"><font color="#900050">[f-k]</font></font></span></p>
            </td>
            <td>
            <p>Matches any single letter from "f" to "k"</p>
            </td>
        </tr>
        <tr>
            <td>
            <p><span id="pattern" name="pattern"><font color="#000000"><font color="#900050">[^A-F0-3]</font></font></span></p>
            </td>
            <td>
            <p>Matches any single character outside "A"~"F" and "0"~"3"</p>
            </td>
        </tr>
    </tbody>
</table>
<p>&nbsp;&nbsp;&nbsp; <a href="http://www.regexlab.com/zh/workshop.asp?pat=[bcd][bcd]&amp;txt=abc123">Example 1: matching the expression "<span id="pattern" name="pattern"><font color="#000000"><font color="#900050">[bcd]</font><font color="#900050">[bcd]</font></font></span>" against "abc123"</a>: the match succeeds; the matched text is "bc"; the match starts at 1 and ends at 3.<br />
<br />
&nbsp;&nbsp;&nbsp; <a href="http://www.regexlab.com/zh/workshop.asp?pat=%5B%5Eabc%5D&amp;txt=abc123">Example 2: matching the expression "<span id="pattern" name="pattern"><font color="#000000"><font color="#900050">[^abc]</font></font></span>" against "abc123"</a>: the match succeeds; the matched text is "1"; the match starts at 3 and ends at 4.</p>
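<p>&nbsp;&nbsp;&nbsp; A short sketch of the bracket notation, assuming Python's standard <code>re</code> module. The inputs for the range examples are made up for the demonstration.</p>

```python
import re

m1 = re.search(r"[bcd][bcd]", "abc123")
m2 = re.search(r"[^abc]", "abc123")
print(m1.group(), m1.span())  # bc (1, 3)
print(m2.group(), m2.span())  # 1 (3, 4)

# A range such as f-k is shorthand for listing every character in it.
in_range = re.findall(r"[f-k]", "abcdefghijklmn")
print(in_range)  # ['f', 'g', 'h', 'i', 'j', 'k']

# A negated class accepts any single character outside the listed ranges.
outside = re.findall(r"[^A-F0-3]", "AG03x4")
print(outside)  # ['G', 'x', '4']
```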
<hr color="#fea089" size="1" />
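<h5><a name="times"></a>1.5 Quantifiers: Special Symbols That Control Repetition</h5>
<p>&nbsp;&nbsp;&nbsp; The expressions covered so far, whether they match one specific character or any one of several characters, match only once. Adding a quantifier to an expression lets it match repeatedly without writing the expression out several times.<br />
<br />
&nbsp;&nbsp;&nbsp; Usage: the "quantifier" goes after the "expression it modifies". For example, "[bcd][bcd]" can be written as "[bcd]{2}".</p>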
<table style="border-collapse: collapse" cellspacing="0" cellpadding="3" bgcolor="#f8f8f8" border="1">
    <tbody>
        <tr bgcolor="#f0f0f0">
            <td width="67">
            <p>Expression</p>
            </td>
            <td>
            <p>Effect</p>
            </td>
        </tr>
        <tr>
            <td>
            <p><span id="pattern" name="pattern"><font color="#000000"><font color="#e07000">{n}</font></font></span></p>
            </td>
            <td>
            <p>Repeats the expression exactly n times, e.g. <a href="http://www.regexlab.com/zh/workshop.asp?pat=\w{2}&amp;txt=ab+c6">"\w{2}" is equivalent to "\w\w"</a>; <a href="http://www.regexlab.com/zh/workshop.asp?pat=a{5}&amp;txt=bbaaaaaddee">"a{5}" is equivalent to "aaaaa"</a></p>
            </td>
        </tr>
        <tr>
            <td>
            <p><span id="pattern" name="pattern"><font color="#000000"><font color="#e07000">{m,n}</font></font></span></p>
            </td>
            <td>
            <p>Repeats the expression at least m times and at most n times, e.g. <a href="http://www.regexlab.com/zh/workshop.asp?pat=ba{1,3}&amp;txt=a,baaa,baa,b,ba">"ba{1,3}" matches "ba", "baa", or "baaa"</a></p>
            </td>
        </tr>
        <tr>
            <td>
            <p><span id="pattern" name="pattern"><font color="#000000"><font color="#e07000">{m,}</font></font></span></p>
            </td>
            <td>
            <p>Repeats the expression at least m times, e.g. <a href="http://www.regexlab.com/zh/workshop.asp?pat=\w\d{2,}&amp;txt=b1,a12,_456,_4AA,M12344,12346546547446534543543">"\w\d{2,}" matches "a12", "_456", "M12344"...</a></p>
            </td>
        </tr>
        <tr>
            <td>
            <p><span id="pattern" name="pattern"><font color="#000000"><font color="#e07000">?</font></font></span></p>
            </td>
            <td>
            <p>Matches the expression 0 times or 1 time, equivalent to {0,1}, e.g. <a href="http://www.regexlab.com/zh/workshop.asp?pat=a[cd]%3F&amp;txt=a,c,d,ac,ad">"a[cd]?" matches "a", "ac", "ad"</a></p>
            </td>
        </tr>
        <tr>
            <td>
            <p><span id="pattern" name="pattern"><font color="#000000"><font color="#e07000">+</font></font></span></p>
            </td>
            <td>
            <p>The expression occurs at least once, equivalent to {1,}, e.g. <a href="http://www.regexlab.com/zh/workshop.asp?pat=a%2Bb&amp;txt=a%2Cb%2Cab%2Caab%2Caaab">"a+b" matches "ab", "aab", "aaab"...</a></p>
            </td>
        </tr>
        <tr>
            <td>
            <p><span id="pattern" name="pattern"><font color="#000000"><font color="#e07000">*</font></font></span></p>
            </td>
            <td>
            <p>The expression occurs zero or more times, equivalent to {0,}, e.g. <a href="http://www.regexlab.com/zh/workshop.asp?pat=%5C%5E*b&amp;txt=%5E%2Cb%2C%5E%5E%5Eb%2C%5E%5E%5E%5E%5E%5E%5Eb">"\^*b" matches "b", "^^^b"...</a></p>
            </td>
        </tr>
    </tbody>
</table>
<p>&nbsp;&nbsp;&nbsp; <a href="http://www.regexlab.com/zh/workshop.asp?pat=%5Cd%2B%5C.%3F%5Cd*&amp;txt=It%20costs%20%2412.5">Example 1: matching the expression "<span id="pattern" name="pattern"><font color="#000000"><font color="#900050">\d</font><font color="#e07000">+</font>\.<font color="#e07000">?</font><font color="#900050">\d</font><font color="#e07000">*</font></font></span>" against "It costs $12.5"</a>: the match succeeds; the matched text is "12.5"; the match starts at 10 and ends at 14.<br />
<br />
&nbsp;&nbsp;&nbsp; <a href="http://www.regexlab.com/zh/workshop.asp?pat=go{2,8}gle&amp;txt=Ads%20by%20goooooogle%2C%20or%20gooogle">Example 2: matching the expression "<span id="pattern" name="pattern"><font color="#000000">go<font color="#e07000">{2,8}</font>gle</font></span>" against "Ads by goooooogle"</a>: the match succeeds; the matched text is "goooooogle"; the match starts at 7 and ends at 17.</p>
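<p>&nbsp;&nbsp;&nbsp; The quantifiers in the table can be exercised with a brief sketch, assuming Python's standard <code>re</code> module. The test strings are taken from the table's example links where possible.</p>

```python
import re

m1 = re.search(r"\d+\.?\d*", "It costs $12.5")
m2 = re.search(r"go{2,8}gle", "Ads by goooooogle")
print(m1.group(), m1.span())  # 12.5 (10, 14)
print(m2.group(), m2.span())  # goooooogle (7, 17)

# {n}: "[bcd]{2}" behaves like "[bcd][bcd]".
same_as_twice = re.fullmatch(r"[bcd]{2}", "bc") is not None
print(same_as_twice)  # True

# {m,n}: between one and three "a" after the "b".
bounded = re.findall(r"ba{1,3}", "a,baaa,baa,b,ba")
print(bounded)  # ['baaa', 'baa', 'ba']

# ?: the bracketed character is optional.
optional = re.findall(r"a[cd]?", "a,c,d,ac,ad")
print(optional)  # ['a', 'ac', 'ad']
```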
<hr color="#fea089" size="1" />
<h5><a name="special"></a>1.6 Other Special Symbols with Abstract Meanings</h5>
<p>&nbsp;&nbsp;&nbsp; Some symbols carry an abstract, special meaning in an expression:</p>
<table style="border-collapse: collapse" cellspacing="0" cellpadding="3" bgcolor="#f8f8f8" border="1">
    <tbody>
        <tr bgcolor="#f0f0f0">
            <td width="67">
            <p>Expression</p>
            </td>
            <td>
            <p>Effect</p>
            </td>
        </tr>
        <tr>
            <td>
            <p><span id="pattern" name="pattern"><font color="#000000"><font color="#ff00ff">^</font></font></span></p>
            </td>
            <td>
            <p>Matches at the start of the string; matches no characters itself</p>
            </td>
        </tr>
        <tr>
            <td>
            <p><span id="pattern" name="pattern"><font color="#000000"><font color="#ff00ff">$</font></font></span></p>
            </td>
            <td>
            <p>Matches at the end of the string; matches no characters itself</p>
            </td>
        </tr>
        <tr>
            <td>
            <p><span id="pattern" name="pattern"><font color="#000000"><font color="#ff00ff">\b</font></font></span></p>
            </td>
            <td>
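            <p>Matches a word boundary, that is, a position with a "\w" character on one side and a non-"\w" character (or the start or end of the string) on the other; matches no characters itself</p>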
            </td>
        </tr>
    </tbody>
</table>
<p>&nbsp;&nbsp;&nbsp; Further explanation in words would still be rather abstract, so here are some examples.<br />
<br />
&nbsp;&nbsp;&nbsp; <a href="http://www.regexlab.com/zh/workshop.asp?pat=^aaa&amp;txt=xxx+aaa+xxx">Example 1: matching the expression "<span id="pattern" name="pattern"><font color="#000000"><font color="#ff00ff">^</font>aaa</font></span>" against "xxx aaa xxx"</a>: the match fails. Because "^" must match at the start of the string, "^aaa" can match only when "aaa" is at the very beginning of the string, <a href="http://www.regexlab.com/zh/workshop.asp?pat=^aaa&amp;txt=aaa+xxx+xxx">for example "aaa xxx xxx"</a>.<br />
<br />
&nbsp;&nbsp;&nbsp; <a href="http://www.regexlab.com/zh/workshop.asp?pat=aaa$&amp;txt=xxx+aaa+xxx">Example 2: matching the expression "<span id="pattern" name="pattern"><font color="#000000">aaa<font color="#ff00ff">$</font></font></span>" against "xxx aaa xxx"</a>: the match fails. Because "$" must match at the end of the string, "aaa$" can match only when "aaa" is at the very end of the string, <a href="http://www.regexlab.com/zh/workshop.asp?pat=aaa$&amp;txt=xxx+xxx+aaa">for example "xxx xxx aaa"</a>.<br />
<br />
&nbsp;&nbsp;&nbsp; <a href="http://www.regexlab.com/zh/workshop.asp?pat=.%5Cb.&amp;txt=@@@abc">Example 3: matching the expression "<span id="pattern" name="pattern"><font color="#000000"><font color="#900050">.</font><font color="#ff00ff">\b</font><font color="#900050">.</font></font></span>" against "@@@abc"</a>: the match succeeds; the matched text is "@a"; the match starts at 2 and ends at 4.<br />
&nbsp;&nbsp;&nbsp; Further explanation: like "^" and "$", "\b" matches no characters itself, but it requires that, at its position in the match, the character on one side is in the "\w" range and the character on the other side is not.<br />
<br />
&nbsp;&nbsp;&nbsp; <a href="http://www.regexlab.com/zh/workshop.asp?pat=%5Cbend%5Cb&amp;txt=weekend,endfor,end">Example 4: matching the expression "<span id="pattern" name="pattern"><font color="#000000"><font color="#ff00ff">\b</font>end<font color="#ff00ff">\b</font></font></span>" against "weekend,endfor,end"</a>: the match succeeds; the matched text is "end"; the match starts at 15 and ends at 18.</p>
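<p>&nbsp;&nbsp;&nbsp; The four anchor examples above, sketched with Python's standard <code>re</code> module (an assumption of this sketch). Note that in Python the pattern must be a raw string, because a plain "\b" would be read as a backspace character before the regex engine ever sees it.</p>

```python
import re

# ^ and $ constrain where the match may sit; they consume nothing.
start_fail = re.search(r"^aaa", "xxx aaa xxx")
start_ok = re.search(r"^aaa", "aaa xxx xxx")
end_fail = re.search(r"aaa$", "xxx aaa xxx")
end_ok = re.search(r"aaa$", "xxx xxx aaa")
print(start_fail, start_ok.span())  # None (0, 3)
print(end_fail, end_ok.span())      # None (8, 11)

# \b sits between a \w character and a non-\w character.
m3 = re.search(r".\b.", "@@@abc")
m4 = re.search(r"\bend\b", "weekend,endfor,end")
print(m3.group(), m3.span())  # @a (2, 4)
print(m4.group(), m4.span())  # end (15, 18)
```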
<p>&nbsp;&nbsp;&nbsp; Some symbols affect how the sub-expressions inside an expression relate to each other:</p>
<table style="border-collapse: collapse" cellspacing="0" cellpadding="3" bgcolor="#f8f8f8" border="1">
    <tbody>
        <tr bgcolor="#f0f0f0">
            <td width="65">
            <p>Expression</p>
            </td>
            <td>
            <p>Effect</p>
            </td>
        </tr>
        <tr>
            <td>
            <p>|</p>
            </td>
            <td>
            <p>An "or" between the expressions on its left and right: matches either the left one or the right one</p>
            </td>
        </tr>
        <tr>
            <td>
            <p>( )</p>
            </td>
            <td>
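            <p>(1). When a quantifier is applied, the expression inside the parentheses is quantified as a whole<br />
            (2). When reading the match result, the text matched by the expression inside the parentheses can be retrieved separately</p>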
            </td>
        </tr>
    </tbody>
</table>
<p>&nbsp;&nbsp;&nbsp; <a href="http://www.regexlab.com/zh/workshop.asp?pat=Tom%7CJack&amp;txt=I%27m+Tom%2C+he+is+Jack">Example 5: matching the expression "<span id="pattern" name="pattern"><font color="#000000">Tom<font color="#5050ff">|</font>Jack</font></span>" against the string "I'm Tom, he is Jack"</a>: the match succeeds; the matched text is "Tom"; the match starts at 4 and ends at 7. Matching again for the next occurrence also succeeds; the matched text is "Jack"; the match starts at 15 and ends at 19.<br />
<br />
&nbsp;&nbsp;&nbsp; <a href="http://www.regexlab.com/zh/workshop.asp?pat=%28go%5Cs*%29%2B&amp;txt=Let%27s%20go%20go%20go%21">Example 6: matching the expression "<span id="pattern" name="pattern"><font color="#000000"><font color="#5050ff">(</font>go<font color="#900050">\s</font><font color="#e07000">*</font><font color="#5050ff">)</font><font color="#e07000">+</font></font></span>" against "Let's go go go!"</a>: the match succeeds; the matched text is "go go go"; the match starts at 6 and ends at 14.<br />
<br />
&nbsp;&nbsp;&nbsp; <a href="http://www.regexlab.com/zh/workshop.asp?pat=%uFFE5%28%5Cd%2B%5C.%3F%5Cd*%29&amp;txt=%uFF0410.9%2C%uFFE520.5">Example 7: matching the expression "<span id="pattern" name="pattern"><font color="#000000">￥<font color="#5050ff">(</font><font color="#900050">\d</font><font color="#e07000">+</font>\.<font color="#e07000">?</font><font color="#900050">\d</font><font color="#e07000">*</font><font color="#5050ff">)</font></font></span>" against "＄10.9,￥20.5"</a>: the match succeeds; the matched text is "￥20.5"; the match starts at 6 and ends at 11. The text matched by the parenthesized part alone is "20.5".</p>
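<p>&nbsp;&nbsp;&nbsp; Alternation and grouping from examples 5 to 7, sketched with Python's standard <code>re</code> module (an assumption of this sketch). <code>finditer</code> plays the role of "matching again for the next occurrence", and <code>group(1)</code> retrieves what the first pair of parentheses matched. Positions are counted in characters, so each full-width symbol counts as one.</p>

```python
import re

# "|" matches either side; finditer walks through successive matches.
names = [(m.group(), m.span()) for m in re.finditer(r"Tom|Jack", "I'm Tom, he is Jack")]
print(names)  # [('Tom', (4, 7)), ('Jack', (15, 19))]

# Parentheses let "+" repeat the whole group "go\s*".
m6 = re.search(r"(go\s*)+", "Let's go go go!")
print(m6.group(), m6.span())  # go go go (6, 14)

# Parentheses also capture: group(1) is the amount without the currency sign.
m7 = re.search(r"￥(\d+\.?\d*)", "＄10.9,￥20.5")
print(m7.group(), m7.span(), m7.group(1))  # ￥20.5 (6, 11) 20.5
```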
<hr color="#fea089" size="1" />
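<h4>2. Some Advanced Rules in Regular Expressions</h4>
<h5><a name="reluctant"></a>2.1 Greedy and Non-Greedy Repetition</h5>
<p>&nbsp;&nbsp;&nbsp; Several of the quantifiers let the same expression match a varying number of times, for example "{m,n}", "{m,}", "?", "*", and "+"; the actual count depends on the string being matched. An expression with such a variable repeat count always matches as many times as it can. For example, against the text "dxxxdxxxd":</p>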
<table style="border-collapse: collapse" cellspacing="0" cellpadding="3" bgcolor="#f8f8f8" border="1">
    <tbody>
        <tr bgcolor="#f0f0f0">
            <td width="93">
            <p>Expression</p>
            </td>
            <td>
            <p>Match result</p>
            </td>
        </tr>
        <tr>
            <td>
            <p><a href="http://www.regexlab.com/zh/workshop.asp?pat=(d)(%5Cw%2B)&amp;txt=dxxxdxxxd"><span id="pattern" name="pattern"><font color="#000000"><font color="#5050ff">(</font>d<font color="#5050ff">)</font><font color="#5050ff">(</font><font color="#900050">\w</font><font color="#e07000">+</font><font color="#5050ff">)</font></font></span></a></p>
            </td>
            <td>
            <p>"\w+" matches all the characters after the first "d": "xxxdxxxd"</p>
            </td>
        </tr>
        <tr>
            <td>
            <p><a href="http://www.regexlab.com/zh/workshop.asp?pat=(d)(%5Cw%2B)(d)&amp;txt=dxxxdxxxd"><span id="pattern" name="pattern"><font color="#000000"><font color="#5050ff">(</font>d<font color="#5050ff">)</font><font color="#5050ff">(</font><font color="#900050">\w</font><font color="#e07000">+</font><font color="#5050ff">)</font><font color="#5050ff">(</font>d<font color="#5050ff">)</font></font></span></a></p>
            </td>
            <td>
            <p>"\w+" matches all the characters between the first "d" and the last "d": "xxxdxxx". Although "\w+" could also match the last "d", it "gives up" that last "d" so that the whole expression can match</p>
            </td>
        </tr>
    </tbody>
</table>
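<p>&nbsp;&nbsp;&nbsp; As this shows, "\w+" always matches as many qualifying characters as it can. In the second example it did not match the last "d", but only so that the whole expression could succeed. Likewise, expressions with "*" and "{m,n}" match as much as possible, and an expression with "?" prefers "to match" whenever it could go either way. This matching principle is called "greedy" mode.</p>
<p>&nbsp;&nbsp;&nbsp; Non-greedy mode:<br />
<br />
&nbsp;&nbsp;&nbsp; Adding a "?" after a quantifier makes an expression with a variable repeat count match as few times as possible, and makes an optional expression prefer "not to match". This matching principle is called "non-greedy" mode, also known as "reluctant" mode. When matching fewer would make the whole expression fail, non-greedy mode, much like greedy mode, matches the minimum additional amount needed for the whole expression to succeed. For example, against the text "dxxxdxxxd":</p>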
<table style="border-collapse: collapse" cellspacing="0" cellpadding="3" bgcolor="#f8f8f8" border="1">
    <tbody>
        <tr bgcolor="#f0f0f0">
            <td width="93">
            <p>Expression</p>
            </td>
            <td>
            <p>Match result</p>
            </td>
        </tr>
        <tr>
            <td>
            <p><a href="http://www.regexlab.com/zh/workshop.asp?pat=(d)(%5Cw%2B%3F)&amp;txt=dxxxdxxxd"><span id="pattern" name="pattern"><font color="#000000"><font color="#5050ff">(</font>d<font color="#5050ff">)</font><font color="#5050ff">(</font><font color="#900050">\w</font><font color="#e07000">+</font><font color="#e07000">?</font><font color="#5050ff">)</font></font></span></a></p>
            </td>
            <td>
            <p>"\w+?" matches as few characters as possible after the first "d"; the result is that "\w+?" matches only one "x"</p>
            </td>
        </tr>
        <tr>
            <td>
            <p><a href="http://www.regexlab.com/zh/workshop.asp?pat=(d)(%5Cw%2B%3F)(d)&amp;txt=dxxxdxxxd"><span id="pattern" name="pattern"><font color="#000000"><font color="#5050ff">(</font>d<font color="#5050ff">)</font><font color="#5050ff">(</font><font color="#900050">\w</font><font color="#e07000">+</font><font color="#e07000">?</font><font color="#5050ff">)</font><font color="#5050ff">(</font>d<font color="#5050ff">)</font></font></span></a></p>
            </td>
            <td>
            <p>For the whole expression to succeed, "\w+?" has to match "xxx" so that the following "d" can match. The result is therefore that "\w+?" matches "xxx"</p>
            </td>
        </tr>
    </tbody>
</table>
<p>&nbsp;&nbsp;&nbsp; Some further examples:<br />
<br />
&nbsp;&nbsp;&nbsp; <a href="http://www.regexlab.com/zh/workshop.asp?pat=%3Ctd%3E%28%2E%2A%29%3C%2Ftd%3E&amp;txt=%3Ctd%3E%3Cp%3Eaa%3C%2Fp%3E%3C%2Ftd%3E%3Ctd%3E%3Cp%3Ebb%3C%2Fp%3E%3C%2Ftd%3E">Example 1: matching the expression "<span id="pattern" name="pattern"><font color="#000000">&lt;td&gt;<font color="#5050ff">(</font><font color="#900050">.</font><font color="#e07000">*</font><font color="#5050ff">)</font>&lt;/td&gt;</font></span>" against the string "&lt;td&gt;&lt;p&gt;aa&lt;/p&gt;&lt;/td&gt; &lt;td&gt;&lt;p&gt;bb&lt;/p&gt;&lt;/td&gt;"</a>: the match succeeds; the matched text is the entire string "&lt;td&gt;&lt;p&gt;aa&lt;/p&gt;&lt;/td&gt; &lt;td&gt;&lt;p&gt;bb&lt;/p&gt;&lt;/td&gt;", because the "&lt;/td&gt;" in the expression matches the last "&lt;/td&gt;" in the string. <br />
<br />
&nbsp;&nbsp;&nbsp; <a href="http://www.regexlab.com/zh/workshop.asp?pat=%3Ctd%3E%28%2E%2A%3F%29%3C%2Ftd%3E&amp;txt=%3Ctd%3E%3Cp%3Eaa%3C%2Fp%3E%3C%2Ftd%3E%3Ctd%3E%3Cp%3Ebb%3C%2Fp%3E%3C%2Ftd%3E">Example 2: by contrast, matching the expression "<span id="pattern" name="pattern"><font color="#000000">&lt;td&gt;<font color="#5050ff">(</font><font color="#900050">.</font><font color="#e07000">*</font><font color="#e07000">?</font><font color="#5050ff">)</font>&lt;/td&gt;</font></span>" against the same string as in example 1</a> yields only "&lt;td&gt;&lt;p&gt;aa&lt;/p&gt;&lt;/td&gt;"; matching again for the next occurrence yields the second one, "&lt;td&gt;&lt;p&gt;bb&lt;/p&gt;&lt;/td&gt;".</p>
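<p>&nbsp;&nbsp;&nbsp; Greedy versus non-greedy matching, sketched with Python's standard <code>re</code> module (an assumption of this sketch). With a single capturing group in the pattern, <code>findall</code> returns what that group matched, which makes the difference easy to see.</p>

```python
import re

text = "dxxxdxxxd"
greedy = re.search(r"(d)(\w+)", text).group(2)
greedy_bounded = re.search(r"(d)(\w+)(d)", text).group(2)
lazy = re.search(r"(d)(\w+?)", text).group(2)
lazy_bounded = re.search(r"(d)(\w+?)(d)", text).group(2)
print(greedy, greedy_bounded, lazy, lazy_bounded)  # xxxdxxxd xxxdxxx x xxx

html = "<td><p>aa</p></td> <td><p>bb</p></td>"
# Greedy ".*" runs to the last "</td>", so there is one long match.
greedy_cells = re.findall(r"<td>(.*)</td>", html)
# Non-greedy ".*?" stops at the first "</td>", so each cell matches separately.
lazy_cells = re.findall(r"<td>(.*?)</td>", html)
print(greedy_cells)  # ['<p>aa</p></td> <td><p>bb</p>']
print(lazy_cells)    # ['<p>aa</p>', '<p>bb</p>']
```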
<hr color="#fea089" size="1" />
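<h5><a name="backref"></a>2.2 Backreferences \1, \2...</h5>
<p>&nbsp;&nbsp;&nbsp; During matching, the regex engine records the string matched by each expression enclosed in parentheses "( )". When you read the match result, the string matched by a parenthesized expression can be retrieved on its own, as earlier examples have shown several times. In practice, when you search using some delimiter but the content you want excludes the delimiter, you must use parentheses to mark the part you want, as in the earlier "<span id="pattern" name="pattern"><font color="#000000">&lt;td&gt;<font color="#5050ff">(</font><font color="#900050">.</font><font color="#e07000">*</font><font color="#e07000">?</font><font color="#5050ff">)</font>&lt;/td&gt;</font></span>".<br />
<br />
&nbsp;&nbsp;&nbsp; In fact, "the string matched by a parenthesized expression" can be used not only after matching has finished but also while matching is in progress. A later part of the expression can refer to "the string already matched by an earlier parenthesized sub-match". The syntax is "\" followed by a number: "\1" refers to the string matched inside the 1st pair of parentheses, "\2" to the string matched inside the 2nd pair, and so on. If one pair of parentheses contains another pair, the outer pair is numbered first. In other words, pairs are numbered in the order in which their opening parenthesis "(" appears.</p>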
<p>&nbsp;&nbsp;&nbsp; Examples:<br />
<br />
&nbsp;&nbsp;&nbsp; <a href="http://www.regexlab.com/zh/workshop.asp?pat=%28%27%7C%22%29%28%2E%2A%3F%29%28%5C1%29&amp;txt=%27Hello%27%2C+%22World%22">Example 1: matching the expression "<span id="pattern" name="pattern"><font color="#000000"><font color="#5050ff">(</font>'<font color="#5050ff">|</font>"<font color="#5050ff">)</font><font color="#5050ff">(</font><font color="#900050">.</font><font color="#e07000">*</font><font color="#e07000">?</font><font color="#5050ff">)</font><font color="#5050ff">(</font><font color="#ff00ff">\1</font><font color="#5050ff">)</font></font></span>" against " 'Hello', "World" "</a>: the match succeeds; the matched text is " 'Hello' ". Matching again for the next occurrence yields " "World" ".<br />
<br />
&nbsp;&nbsp;&nbsp; <a href="http://www.regexlab.com/zh/workshop.asp?pat=%28%5Cw%29%5C1%7B4%2C%7D&amp;txt=aa%20bbbb%20abcdefg%20ccccc%20111121111%20999999999">Example 2: matching the expression "<span id="pattern" name="pattern"><font color="#000000"><font color="#5050ff">(</font><font color="#900050">\w</font><font color="#5050ff">)</font><font color="#ff00ff">\1</font><font color="#e07000">{4,}</font></font></span>" against "aa bbbb abcdefg ccccc 111121111 999999999"</a>: the match succeeds; the matched text is "ccccc". Matching again for the next occurrence yields 999999999. This expression requires the same "\w" character to repeat at least 5 times; <a href="http://www.regexlab.com/zh/workshop.asp?pat=%5Cw%7B5%2C%7D&amp;txt=aa%20bbbb%20abcdefg%20ccccc%20111121111%20999999999">note the difference from "\w{5,}"</a>, which accepts any 5 or more "\w" characters.<br />
<br />
&nbsp;&nbsp;&nbsp; <a href="http://www.regexlab.com/zh/workshop.asp?pat=%3C%28%5Cw%2B%29%5Cs%2A%28%5Cw%2B%28%3D%28%27%7C%22%29%2E%2A%3F%5C4%29%3F%5Cs%2A%29%2A%3E%2E%2A%3F%3C%2F%5C1%3E&amp;txt=%3Ctd+id%3D%27td1%27+style%3D%22bgcolor%3Awhite%22%3E%3C%2Ftd%3E%0D%0A%3Cbody+onload%3D%22doit%28%29%22%3E%3C%2Fbody%3E">Example 3: matching the expression "<span id="pattern" name="pattern"><font color="#000000">&lt;<font color="#5050ff">(</font><font color="#900050">\w</font><font color="#e07000">+</font><font color="#5050ff">)</font><font color="#900050">\s</font><font color="#e07000">*</font><font color="#5050ff">(</font><font color="#900050">\w</font><font color="#e07000">+</font><font color="#5050ff">(</font>=<font color="#5050ff">(</font>'<font color="#5050ff">|</font>"<font color="#5050ff">)</font><font color="#900050">.</font><font color="#e07000">*</font><font color="#e07000">?</font><font color="#ff00ff">\4</font><font color="#5050ff">)</font><font color="#e07000">?</font><font color="#900050">\s</font><font color="#e07000">*</font><font color="#5050ff">)</font><font color="#e07000">*</font>&gt;<font color="#900050">.</font><font color="#e07000">*</font><font color="#e07000">?</font>&lt;/<font color="#ff00ff">\1</font>&gt;</font></span>" against "&lt;td id='td1' style="bgcolor:white"&gt;&lt;/td&gt;"</a>: the match succeeds. If "&lt;td&gt;" and "&lt;/td&gt;" did not pair up, the match would fail; with any other properly paired tag name, the match would also succeed.</p>
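<p>&nbsp;&nbsp;&nbsp; The three backreference examples, sketched with Python's standard <code>re</code> module (an assumption of this sketch). The mismatched-tag input in the last check is made up for the demonstration. Triple-quoted raw strings are used only so that the patterns can contain both quote characters unescaped.</p>

```python
import re

# \1 must repeat whichever quote character group 1 matched.
quote_pat = r"""('|")(.*?)(\1)"""
quoted = [m.group() for m in re.finditer(quote_pat, "'Hello', \"World\"")]
print(quoted)  # ["'Hello'", '"World"']

text = "aa bbbb abcdefg ccccc 111121111 999999999"
# (\w)\1{4,}: one character followed by at least four copies of itself.
same_char_runs = [m.group() for m in re.finditer(r"(\w)\1{4,}", text)]
# \w{5,}: any five or more word characters, not necessarily identical.
any_word_runs = re.findall(r"\w{5,}", text)
print(same_char_runs)  # ['ccccc', '999999999']
print(any_word_runs)   # ['abcdefg', 'ccccc', '111121111', '999999999']

# \4 closes an attribute with the quote that opened it; \1 closes the tag.
tag_pat = r"""<(\w+)\s*(\w+(=('|").*?\4)?\s*)*>.*?</\1>"""
paired = re.search(tag_pat, "<td id='td1' style=\"bgcolor:white\"></td>")
unpaired = re.search(tag_pat, "<td id='td1'></tr>")
print(paired is not None, unpaired is None)  # True True
```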
<hr color="#fea089" size="1" />
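<h5><a name="forward"></a>2.3 Lookahead and Lookbehind: Assertions That Match No Characters</h5>
<p>&nbsp;&nbsp;&nbsp; Earlier sections covered several special symbols with abstract meanings: "^", "$", and "\b". They have one thing in common: they match no characters themselves and only attach a condition to "the two ends of the string" or to "a gap between characters". With that idea in mind, this section introduces another, more flexible way of attaching a condition to an "end" or a "gap".</p>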
<p>&nbsp;&nbsp;&nbsp; Lookahead: "(?=xxxxx)" and "(?!xxxxx)"<br />
<br />
&nbsp;&nbsp;&nbsp; Format "(?=xxxxx)": the condition it attaches to the "gap" or "end" where it sits is that the text to the right of the gap must match the expression xxxxx. Because it is only a condition attached to the gap, it does not stop the rest of the expression from actually matching the characters after the gap. In this it resembles "\b", which matches no characters itself: "\b" merely inspects the characters before and after the gap and does not affect what the rest of the expression goes on to match.<br />
<br />
&nbsp;&nbsp;&nbsp; <a href="http://www.regexlab.com/zh/workshop.asp?pat=Windows+%28%3F%3DNT%7CXP%29&amp;txt=Windows+98%2C+Windows+NT%2C+Windows+2000">Example 1: matching the expression "<span id="pattern" name="pattern"><font color="#000000">Windows <font color="#999999">(?=</font>NT<font color="#5050ff">|</font>XP<font color="#999999">)</font></font></span>" against "Windows 98, Windows NT, Windows 2000"</a> matches only the "Windows " in "Windows NT"; the other occurrences of "Windows " are not matched.<br />
<br />
&nbsp;&nbsp;&nbsp; <a href="http://www.regexlab.com/zh/workshop.asp?pat=%28%5Cw%29%28%28%3F%3D%5C1%5C1%5C1%29%28%5C1%29%29%2B&amp;txt=aaa+ffffff+999999999">Example 2: matching the expression "<span id="pattern" name="pattern"><font color="#000000"><font color="#5050ff">(</font><font color="#900050">\w</font><font color="#5050ff">)</font><font color="#5050ff">(</font><font color="#999999">(?=</font><font color="#ff00ff">\1</font><font color="#ff00ff">\1</font><font color="#ff00ff">\1</font><font color="#999999">)</font><font color="#5050ff">(</font><font color="#ff00ff">\1</font><font color="#5050ff">)</font><font color="#5050ff">)</font><font color="#e07000">+</font></font></span>" against the string "aaa ffffff 999999999"</a> matches the first 4 of the 6 "f"s and the first 7 of the 9 "9"s. The expression can be read as: for a letter or digit repeated 4 or more times, match everything before its last 2 occurrences. It could of course be written differently; it is used here purely as a demonstration.</p>
<p>&nbsp;&nbsp;&nbsp; 格式："(?!xxxxx)"，所在缝隙的右侧，必须不能匹配 xxxxx 这部分表达式。<br />
<br />
&nbsp;&nbsp;&nbsp; <a href="http://www.regexlab.com/zh/workshop.asp?pat=%28%28%3F%21%5Cbstop%5Cb%29%2E%29%2B&amp;txt=fdjka+ljfdl+stop+fjdsla+fdj">Example 3: when the expression "<span id="pattern" name="pattern"><font color="#000000"><font color="#5050ff">(</font><font color="#999999">(?!</font><font color="#ff00ff">\b</font>stop<font color="#ff00ff">\b</font><font color="#999999">)</font><font color="#900050">.</font><font color="#5050ff">)</font><font color="#e07000">+</font></font></span>" is matched against "fdjka ljfdl stop fjdsla fdj"</a>, it matches from the beginning up to the position just before "stop". If the string does not contain "stop", the whole string is matched.<br />
<br />
&nbsp;&nbsp;&nbsp; <a href="http://www.regexlab.com/zh/workshop.asp?pat=do%28%3F%21%5Cw%29&amp;txt=done%2C+do%2C+dog">Example 4: when the expression "<span id="pattern" name="pattern"><font color="#000000">do<font color="#999999">(?!</font><font color="#900050">\w</font><font color="#999999">)</font></font></span>" is matched against the string "done, do, dog"</a>, it matches only the standalone "do". In this example, putting "(?!\w)" after "do" has the same effect as putting "\b" there.</p>
<p>&nbsp;&nbsp;&nbsp; Lookbehind: "(?&lt;=xxxxx)" and "(?&lt;!xxxxx)"<br />
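<p>&nbsp;&nbsp;&nbsp; The four lookahead examples above can be reproduced with a minimal sketch in Python's <code>re</code> module. The choice of Python and the variable names are assumptions made for illustration; the original article demonstrates the patterns in the regexlab online tool.</p>

```python
import re

# Example 1: "Windows " is matched only where "NT" or "XP" follows it.
text = "Windows 98, Windows NT, Windows 2000"
windows_hits = [(m.start(), m.group()) for m in re.finditer(r"Windows (?=NT|XP)", text)]

# Example 2: after the first character, another copy is consumed only
# while three more copies of it still lie ahead.
repeat_hits = [m.group() for m in re.finditer(r"(\w)((?=\1\1\1)(\1))+", "aaa ffffff 999999999")]

# Example 3: consume characters until the whole word "stop" comes next.
before_stop = re.search(r"((?!\bstop\b).)+", "fdjka ljfdl stop fjdsla fdj").group()

# Example 4: "do" that is not followed by a word character.
do_positions = [m.start() for m in re.finditer(r"do(?!\w)", "done, do, dog")]

print(windows_hits)   # the only hit starts at the "Windows " before "NT"
print(repeat_hits)
print(repr(before_stop))
print(do_positions)
```

<p>&nbsp;&nbsp;&nbsp; In every case the lookahead text is absent from the match itself, which is what "non-consuming" means here.</p>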
<br />
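&nbsp;&nbsp;&nbsp; These two forms are conceptually similar to lookahead. Lookbehind places its condition on the "left" side of the gap instead of the right: the two forms require, respectively, that the text to the left must match and must not match the given expression. As with lookahead, both are conditions attached to the gap and match no characters themselves.<br />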
<br />
&nbsp;&nbsp;&nbsp; Example 5: when the expression "<span id="pattern" name="pattern"><font color="#000000"><font color="#999999">(?&lt;=</font><font color="#900050">\d</font><font color="#e07000">{4}</font><font color="#999999">)</font><font color="#900050">\d</font><font color="#e07000">+</font><font color="#999999">(?=</font><font color="#900050">\d</font><font color="#e07000">{4}</font><font color="#999999">)</font></font></span>" is matched against "1234567890123456", it matches the middle 8 digits, excluding the first 4 and the last 4. Because JScript.RegExp does not support lookbehind, this example cannot be demonstrated live. Many other engines do support lookbehind, for example the java.util.regex package in Java 1.4 and later, the System.Text.RegularExpressions namespace in .NET, and the <a href="http://www.regexlab.com/zh/deelx/">DEELX regex engine, recommended by this site as the simplest and easiest to use</a>.</p>
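<p>&nbsp;&nbsp;&nbsp; Since Example 5 cannot be run in the online tool, here is a minimal sketch of it in Python (an assumption; the article itself names Java, .NET, and DEELX). Python's <code>re</code> module accepts lookbehind as long as the lookbehind pattern has a fixed width, which "\d{4}" does.</p>

```python
import re

digits = "1234567890123456"

# The match must be preceded by four digits and followed by four digits,
# but neither group of four becomes part of the match.
middle = re.search(r"(?<=\d{4})\d+(?=\d{4})", digits)

print(middle.group(), middle.span())
```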
<hr color="#fea089" size="1" />
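<h4><a name="othercommon"></a>3. Other common rules</h4>
<p>&nbsp;&nbsp;&nbsp; There are some further rules, fairly common across regular expression engines, that the earlier sections did not mention.</p>
<p>3.1 In an expression, "\xXX" and "\uXXXX" can be used to represent a character ("X" stands for one hexadecimal digit)</p>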
<table style="border-collapse: collapse" cellspacing="0" cellpadding="3" bgcolor="#f8f8f8" border="1">
    <tbody>
        <tr bgcolor="#f0f0f0">
            <td width="63">
            <p>Form</p>
            </td>
            <td>
            <p>Character range</p>
            </td>
        </tr>
        <tr>
            <td>
            <p>\xXX</p>
            </td>
            <td>
            <p>Characters whose code is in the range 0 to 255; for example, <a href="http://www.regexlab.com/zh/workshop.asp?pat=%5Cx20&amp;txt=It+is%2E">a space can be written as "\x20"</a></p>
            </td>
        </tr>
        <tr>
            <td>
            <p>\uXXXX</p>
            </td>
            <td>
            <p>A character can be written as "\u" followed by the 4 hexadecimal digits of its code; for example: <a href="http://www.regexlab.com/zh/workshop.asp?pat=%5Cu4E2D&amp;txt=%D6%D0%B9%FA">"\u4E2D"</a></p>
            </td>
        </tr>
    </tbody>
</table>
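<p>A short sketch of both escape forms, assuming Python's <code>re</code> module, which accepts "\xXX" and "\uXXXX" inside string patterns. The sample texts are the ones used by the links in the table.</p>

```python
import re

# "\x20" is the space character (code 0x20).
space_index = re.search(r"\x20", "It is.").start()

# "\u4E2D" is the character with code 0x4E2D, the first character of "中国".
first_char = re.search(r"\u4E2D", "中国").group()

print(space_index, first_char, hex(ord(first_char)))
```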
<p>3.2 The expressions "\s", "\d", "\w", and "\b" have special meanings; the corresponding uppercase letters have the opposite meanings</p>
<table style="border-collapse: collapse" cellspacing="0" cellpadding="3" bgcolor="#f8f8f8" border="1">
    <tbody>
        <tr bgcolor="#f0f0f0">
            <td width="55">
            <p>Expression</p>
            </td>
            <td>
            <p>Matches</p>
            </td>
        </tr>
        <tr>
            <td>
            <p>\S</p>
            </td>
            <td>
            <p><a href="http://www.regexlab.com/zh/workshop.asp?pat=%5CS%2B&amp;txt=abc+123+%40%23%24%25">Matches any non-whitespace character ("\s" matches any whitespace character)</a></p>
            </td>
        </tr>
        <tr>
            <td>
            <p>\D</p>
            </td>
            <td>
            <p><a href="http://www.regexlab.com/zh/workshop.asp?pat=%5CD%2B&amp;txt=abc+123+%40%23%24%25">Matches any non-digit character</a></p>
            </td>
        </tr>
        <tr>
            <td>
            <p>\W</p>
            </td>
            <td>
            <p><a href="http://www.regexlab.com/zh/workshop.asp?pat=%5CW%2B&amp;txt=abc+123+%40%23%24%25">Matches any character other than a letter, digit, or underscore</a></p>
            </td>
        </tr>
        <tr>
            <td>
            <p>\B</p>
            </td>
            <td>
            <p><a href="http://www.regexlab.com/zh/workshop.asp?pat=%5CB%2E%5CB&amp;txt=abc+123+%40%23%24%25">Matches a non-word boundary, that is, a gap between characters where both sides are within the "\w" range or neither side is</a></p>
            </td>
        </tr>
    </tbody>
</table>
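<p>The four uppercase classes can be tried on the sample text "abc 123 @#$%" used by the links in the table. This is a sketch in Python's <code>re</code> module (an assumption; other engines may treat non-ASCII characters differently).</p>

```python
import re

sample = "abc 123 @#$%"

non_space = re.findall(r"\S+", sample)      # runs of non-whitespace
non_digit = re.findall(r"\D+", sample)      # runs of non-digits, spaces included
non_word = re.findall(r"\W+", sample)       # runs outside letters, digits, underscore
inner_chars = re.findall(r"\B.\B", sample)  # characters with no word boundary on either side

print(non_space)
print(non_digit)
print(non_word)
print(inner_chars)
```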
<p>3.3 Summary of characters that have a special meaning in expressions and need a preceding "\" to match the character itself</p>
<table style="border-collapse: collapse" cellspacing="0" cellpadding="3" bgcolor="#f8f8f8" border="1">
    <tbody>
        <tr bgcolor="#f0f0f0">
            <td width="55">
            <p>Character</p>
            </td>
            <td>
            <p>Description</p>
            </td>
        </tr>
        <tr>
            <td>
            <p>^</p>
            </td>
            <td>
            <p>Matches the start of the input string. To match the "^" character itself, use "\^"</p>
            </td>
        </tr>
        <tr>
            <td>
            <p>$</p>
            </td>
            <td>
            <p>Matches the end of the input string. To match the "$" character itself, use "\$"</p>
            </td>
        </tr>
        <tr>
            <td>
            <p>( )</p>
            </td>
            <td>
            <p>Mark the start and end of a sub-expression. To match parentheses, use "\(" and "\)"</p>
            </td>
        </tr>
        <tr>
            <td>
            <p>[ ]</p>
            </td>
            <td>
            <p>Define an expression that can match 'any of several characters'. To match square brackets, use "\[" and "\]"</p>
            </td>
        </tr>
        <tr>
            <td>
            <p>{ }</p>
            </td>
            <td>
            <p>Symbols that specify a repetition count. To match braces, use "\{" and "\}"</p>
            </td>
        </tr>
        <tr>
            <td>
            <p>.</p>
            </td>
            <td>
            <p>Matches any single character except the newline (\n). To match the dot itself, use "\."</p>
            </td>
        </tr>
        <tr>
            <td>
            <p>?</p>
            </td>
            <td>
            <p>Makes the preceding item match 0 or 1 times. To match the "?" character itself, use "\?"</p>
            </td>
        </tr>
        <tr>
            <td>
            <p>+</p>
            </td>
            <td>
            <p>Makes the preceding item match at least 1 time. To match the "+" character itself, use "\+"</p>
            </td>
        </tr>
        <tr>
            <td>
            <p>*</p>
            </td>
            <td>
            <p>Makes the preceding item match 0 or any number of times. To match the "*" character itself, use "\*"</p>
            </td>
        </tr>
        <tr>
            <td>
            <p>|</p>
            </td>
            <td>
            <p>An "or" relationship between the expressions on its left and right. To match "|" itself, use "\|"</p>
            </td>
        </tr>
    </tbody>
</table>
<p>3.4 If the match of a sub-expression inside parentheses "( )" should not be recorded for later use, the "(?:xxxxx)" form can be used</p>
<p>&nbsp;&nbsp;&nbsp; <a href="http://www.regexlab.com/zh/workshop.asp?pat=%28%3F%3A%28%5Cw%29%5C1%29%2B&amp;txt=a+bbccdd+efg">Example 1: when the expression "<span id="pattern" name="pattern"><font color="#000000"><font color="#999999">(?:</font><font color="#5050ff">(</font><font color="#900050">\w</font><font color="#5050ff">)</font><font color="#ff00ff">\1</font><font color="#999999">)</font><font color="#e07000">+</font></font></span>" is matched against "a bbccdd efg"</a>, the result is "bbccdd". The match of the "(?:)" group is not recorded, so "(\w)" is the group that "\1" refers to.</p>
<p>3.5 A brief introduction to common expression options: Ignorecase, Singleline, Multiline, Global</p>
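<p>&nbsp;&nbsp;&nbsp; A sketch of the same example in Python (an assumption; the group numbering rule is the same one the article describes). The compiled pattern reports a single capturing group, which confirms that "(?:" does not take a number.</p>

```python
import re

pattern = re.compile(r"(?:(\w)\1)+")
match = pattern.search("a bbccdd efg")

# Only "(\w)" is numbered; it keeps the text from the last repetition.
print(match.group(), pattern.groups, match.groups())
```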
<table style="border-collapse: collapse" cellspacing="0" cellpadding="3" bgcolor="#f8f8f8" border="1">
    <tbody>
        <tr bgcolor="#f0f0f0">
            <td width="80">
            <p>Option</p>
            </td>
            <td>
            <p>Description</p>
            </td>
        </tr>
        <tr>
            <td>
            <p>Ignorecase</p>
            </td>
            <td>
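            <p>By default, letters in an expression are case-sensitive. Setting Ignorecase makes matching case-insensitive. Some expression engines extend the notion of "case" to the whole UNICODE range.</p>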
            </td>
        </tr>
        <tr>
            <td>
            <p>Singleline</p>
            </td>
            <td>
            <p>By default, the dot "." matches any character except the newline (\n). Setting Singleline makes the dot match every character, including the newline.</p>
            </td>
        </tr>
        <tr>
            <td>
            <p>Multiline</p>
            </td>
            <td>
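            <p>By default, "^" and "$" match only the start ① and the end ④ of the string. For example:<br />
            <br />
            ①xxxxxxxxx②\n<br />
            ③xxxxxxxxx④<br />
            <br />
            With Multiline set, "^" matches ① and also ③, the position after a newline where the next line begins; "$" matches ④ and also ②, the position before a newline where a line ends.</p>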
            </td>
        </tr>
        <tr>
            <td>
            <p>Global</p>
            </td>
            <td>
            <p>Mainly takes effect when the expression is used for replacement; setting Global means that every match is replaced.</p>
            </td>
        </tr>
    </tbody>
</table>
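<p>The four options can be sketched in Python, where (as an assumption about naming, since every engine spells them differently) Ignorecase corresponds to <code>re.IGNORECASE</code>, Singleline to <code>re.DOTALL</code>, and Multiline to <code>re.MULTILINE</code>. Python has no Global flag: <code>re.sub</code> replaces every match unless <code>count</code> limits it.</p>

```python
import re

text = "first line\nSecond LINE"

# Ignorecase
case_sensitive = re.findall(r"line", text)
case_insensitive = re.findall(r"line", text, re.IGNORECASE)

# Singleline: whether "." may cross the newline
dot_default = re.search(r"first.*", text).group()
dot_singleline = re.search(r"first.*", text, re.DOTALL).group()

# Multiline: whether "^" also matches after a newline
starts_default = re.findall(r"^\w+", text)
starts_multiline = re.findall(r"^\w+", text, re.MULTILINE)

# Global: replace only the first match, or all of them
replace_first = re.sub(r"line", "row", text, count=1, flags=re.IGNORECASE)
replace_all = re.sub(r"line", "row", text, flags=re.IGNORECASE)

print(case_sensitive, case_insensitive)
print(repr(dot_default), repr(dot_singleline))
print(starts_default, starts_multiline)
print(repr(replace_first), repr(replace_all))
```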
<p>
<hr color="#fea089" size="1" />
<p>&nbsp;</p>
<h4><a name="prompt"></a>4. Other tips</h4>
<p>4.1 To find out which more complex regex syntax the advanced engines also support, see <a href="http://www.regexlab.com/zh/deelx/syntax.htm">the documentation of this site's DEELX regex engine</a>.</p>
<p>4.2 If the expression must match the entire string rather than find a part of it, use "^" and "$" at the start and end of the expression. For example, "<span id="pattern" name="pattern"><font color="#000000"><font color="#ff00ff">^</font><font color="#900050">\d</font><font color="#e07000">+</font><font color="#ff00ff">$</font></font></span>" requires the whole string to consist only of digits.</p>
<p>4.3 If the match must be a complete word rather than part of a word, use "\b" at the start and end of the expression. For example, <a href="http://www.regexlab.com/zh/workshop.asp?pat=%5Cb%28if%7Cwhile%7Celse%7Cvoid%7Cint%29%5Cb&amp;txt=if%28ifdo%29%0D%0A++++dosome%28%29%3B%0D%0Aelse%0D%0A++++doelse%28%29%3B">use "<span id="pattern" name="pattern"><font color="#000000"><font color="#ff00ff">\b</font><font color="#5050ff">(</font>if<font color="#5050ff">|</font>while<font color="#5050ff">|</font>else<font color="#5050ff">|</font>void<font color="#5050ff">|</font>int&#8230;&#8230;<font color="#5050ff">)</font><font color="#ff00ff">\b</font></font></span>" to match the keywords in a program</a>.</p>
<p>4.4 Do not let an expression match the empty string; otherwise it always reports success while matching nothing. For example, when writing an expression for the forms "123", "123.", "123.5", and ".5", the integer part, the decimal point, and the fractional digits may each be omitted, but do not write the expression as "<span id="pattern" name="pattern"><font color="#000000"><font color="#900050">\d</font><font color="#e07000">*</font>\.<font color="#e07000">?</font><font color="#900050">\d</font><font color="#e07000">*</font></font></span>", because it also matches successfully when nothing is there at all. <a href="http://www.regexlab.com/zh/workshop.asp?pat=%5Cd%2B%5C%2E%3F%5Cd%2A%7C%5C%2E%5Cd%2B&amp;txt=123%2C+123%2E%2C+123%2E5%2C+%2E5%2C+%2E">A better way to write it is "<span id="pattern" name="pattern"><font color="#000000"><font color="#900050">\d</font><font color="#e07000">+</font>\.<font color="#e07000">?</font><font color="#900050">\d</font><font color="#e07000">*</font><font color="#5050ff">|</font>\.<font color="#900050">\d</font><font color="#e07000">+</font></font></span>"</a>.</p>
<p>4.5 Do not repeat a sub-match that can match the empty string an unlimited number of times. If every part of a parenthesized sub-expression can match 0 times and the group as a whole can repeat without limit, the situation may be worse than in the previous item: matching may loop forever. Some regex engines, such as .NET's, now take steps to avoid the infinite loop, but we should still avoid writing such expressions. If you run into an infinite loop while writing an expression, this is a good place to start looking for the cause.</p>
<p>4.6 Choose sensibly between greedy and non-greedy mode; see the <a href="http://www.regexlab.com/zh/regtopic.htm#reluctant">topic discussion</a>.</p>
<p>4.7 For the two sides of an alternation "|", it is best if only one side can match a given character; then the result does not change when the expressions on either side of "|" swap places.</p>
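<p>A sketch of tip 4.3 in Python (an assumption), run on the same program text that the link above uses. The keyword list is cut to the five keywords the article spells out.</p>

```python
import re

code = "if(ifdo)\n    dosome();\nelse\n    doelse();"

# With "\b" on both sides only standalone keywords are found.
whole_words = re.findall(r"\b(if|while|else|void|int)\b", code)

# Without "\b" the keywords are also found inside "ifdo" and "doelse".
anywhere = re.findall(r"(if|while|else|void|int)", code)

print(whole_words)
print(anywhere)
```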
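<p>A sketch of tip 4.4 in Python (an assumption), using the sample text from the link above. The names <code>loose</code> and <code>strict</code> are only labels for the two expressions.</p>

```python
import re

loose = re.compile(r"\d*\.?\d*")          # every part is optional
strict = re.compile(r"\d+\.?\d*|\.\d+")   # at least one digit is required

# The loose form "succeeds" on text that contains no number at all.
empty_hit = loose.search("abc")

# The strict form finds the four intended forms and skips the lone ".".
numbers = strict.findall("123, 123., 123.5, .5, .")

print(repr(empty_hit.group()))
print(numbers)
```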
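<p>A sketch of tip 4.7 in Python (an assumption). Like most backtracking engines, Python's <code>re</code> tries the alternatives from left to right, so when both sides can match at the same position their order changes the result. The words used are hypothetical examples.</p>

```python
import re

# Both alternatives can start at the same "d", so the order matters.
short_first = re.search(r"do|done", "done").group()
long_first = re.search(r"done|do", "done").group()

# Alternatives that begin with different characters are order-independent.
cat_dog = re.findall(r"cat|dog", "dog cat")
dog_cat = re.findall(r"dog|cat", "dog cat")

print(short_first, long_first)
print(cat_dog, dog_cat)
```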
]]></description></item><item><title>A Cobbled-Together DBUtility for .NET 2.0</title><link>http://www.cnblogs.com/Alacky/archive/2007/12/16/997036.html</link><dc:creator>*Alacky</dc:creator><author>*Alacky</author><pubDate>Sun, 16 Dec 2007 14:38:00 GMT</pubDate><guid>http://www.cnblogs.com/Alacky/archive/2007/12/16/997036.html</guid><wfw:comment>http://www.cnblogs.com/Alacky/comments/997036.html</wfw:comment><comments>http://www.cnblogs.com/Alacky/archive/2007/12/16/997036.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cnblogs.com/Alacky/comments/commentRss/997036.html</wfw:commentRss><trackback:ping>http://www.cnblogs.com/Alacky/services/trackbacks/997036.html</trackback:ping><description><![CDATA[Summary: For now there are only three helpers: MySqlHelper, OracleHelper, and SQLHelper (MS SQL Server). MySQLHelper is taken from mysql-connector-net-5.0.8.1 (available for download at dev.mysql.com); the other two are taken from .NET Pet Shop 4.0. I simply put them into a single namespace. Download here: SqlHelper&nbsp;&nbsp;<a href='http://www.cnblogs.com/Alacky/archive/2007/12/16/997036.html'>Read more</a>]]></description></item><item><title>ASP.NET AJAX Extensions Study Notes: Hello World</title><link>http://www.cnblogs.com/Alacky/archive/2007/12/14/995101.html</link><dc:creator>*Alacky</dc:creator><author>*Alacky</author><pubDate>Fri, 14 Dec 2007 07:50:00 GMT</pubDate><guid>http://www.cnblogs.com/Alacky/archive/2007/12/14/995101.html</guid><wfw:comment>http://www.cnblogs.com/Alacky/comments/995101.html</wfw:comment><comments>http://www.cnblogs.com/Alacky/archive/2007/12/14/995101.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cnblogs.com/Alacky/comments/commentRss/995101.html</wfw:commentRss><trackback:ping>http://www.cnblogs.com/Alacky/services/trackbacks/995101.html</trackback:ping><description><![CDATA[&nbsp;&nbsp;&nbsp;&nbsp;When creating a new project, choose the ASP.NET AJAX Enabled Web Site template. Open Default.aspx: a ScriptManager control has already been added to the page. This control is indispensable in an AJAX project; it writes the JS code that implements AJAX into the client page.<br />
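&nbsp;&nbsp;&nbsp;&nbsp;From the AJAX Extensions tab of the toolbox, drag an UpdatePanel onto the page. UpdatePanel is a container control used for partial page refresh: once an ordinary Web control is placed inside an UpdatePanel, triggering a submit no longer refreshes the whole page, only the UpdatePanel region that contains the control. Next, let's build a GridView version of an AJAX Hello World.<br />
&nbsp;&nbsp;&nbsp;&nbsp;Drag a GridView control into the UpdatePanel, bind it to a data source, and enable paging and sorting. Press F5, or right-click the page and choose 'View in Browser' (I prefer the latter), to view the page. Now click a paging link. Hmm? It takes a long time to respond? That is because there is a lot of data and it needs some time to transfer. Can a prompt be shown while the data is being transferred? Yes. Drag an UpdateProgress onto the page and set its <font face="Courier New">AssociatedUpdatePanelID to the ID of the UpdatePanel created earlier. UpdateProgress is also a container control; its content is displayed while the specified UpdatePanel performs a postback, so enter any prompt text you like. The DisplayAfter property of UpdateProgress specifies how many milliseconds a postback must last before the content is shown. The default is 500, that is, 0.5 seconds, so when a postback takes less than 0.5 seconds the content of the UpdateProgress is not displayed.<br />
</font>&nbsp;&nbsp;&nbsp;&nbsp;Simple, isn't it? Other Web controls work the same way: just drag them into the UpdatePanel.<br />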
]]></description></item><item><title>ASP.NET AJAX Extensions Study Notes: Download and Installation</title><link>http://www.cnblogs.com/Alacky/archive/2007/12/06/985679.html</link><dc:creator>*Alacky</dc:creator><author>*Alacky</author><pubDate>Thu, 06 Dec 2007 14:45:00 GMT</pubDate><guid>http://www.cnblogs.com/Alacky/archive/2007/12/06/985679.html</guid><wfw:comment>http://www.cnblogs.com/Alacky/comments/985679.html</wfw:comment><comments>http://www.cnblogs.com/Alacky/archive/2007/12/06/985679.html#Feedback</comments><slash:comments>1</slash:comments><wfw:commentRss>http://www.cnblogs.com/Alacky/comments/commentRss/985679.html</wfw:commentRss><trackback:ping>http://www.cnblogs.com/Alacky/services/trackbacks/985679.html</trackback:ping><description><![CDATA[These notes record my process of learning ASP.NET AJAX and are essentially a translation of the <a title="ASP.NET AJAX" href="http://asp.net/ajax/documentation/live/" target="_blank">ASP.Net AJAX</a> documentation. My English is poor (I have not even taken CET-4 ^_^) and my skills are still shallow, so please bear with any errors of understanding or translation and point them out; I will correct them promptly.
<hr />
<blockquote>
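<h4>Installing ASP.NET AJAX</h4>
ASP.NET AJAX supports only .NET 2.0 and later. It is already integrated into .NET 3.5 and needs no installation there; .NET 2.0 requires the extensions package.<br />
&nbsp;&nbsp;&nbsp; I am using VS.NET 2005, that is, .NET 2.0. First download ASPAJAXExtSetup.msi from the ASP.NET site (the link given above). It is already packaged, so installing is just a matter of clicking Next all the way through, but be sure to close VS.NET before installing, because the installer needs to configure it. After installation, open VS.NET and create a new web site. When choosing a template you will find that 'Visual Studio installed templates' now contains an 'Asp.net AJAX-Enabled Web Site' entry. Select it, set 'Location' and 'Language', and click 'OK'.<br />
&nbsp;&nbsp;&nbsp; Next, let's look at what the installation has given us.<br />
&nbsp;&nbsp;&nbsp; The default home page Default.aspx now contains a control named ScriptManager. This control must be added to a page that uses AJAX; its exact role is covered later.<br />
&nbsp;&nbsp;&nbsp; Now look at the toolbox. Switch the page to 'Design' view, open the 'Toolbox', and scroll to the very bottom: there is an 'AJAX Extensions' tab containing five controls, which make up the basic ASP.NET AJAX framework.<br />
&nbsp;&nbsp;&nbsp; You can now start building pages with a Web 2.0 experience, but doing so with only these five controls is laborious, so MS provides an 'AJAX Control Toolkit' package that contains many commonly used AJAX controls. Next we install it.<br />
<h4>Installing the AJAX Control Toolkit</h4>
&nbsp;&nbsp;&nbsp; It can be downloaded from the ASP.NET site in two editions, with source and without source (binary). I downloaded the edition with source (so I will not have to download it again when studying it in depth later).<br />
&nbsp;&nbsp;&nbsp; Extract the archive into <font face="Courier New">C:\Program Files\Microsoft ASP.NET\ASP.NET 2.0 AJAX Extensions\AjaxControlToolkit\</font>, which is where ASP.NET AJAX was installed earlier. Open AjaxControlToolkit.sln and build the '<font face="Courier New">AjaxControlToolkit</font>' and 'TemplateVSI' projects. When the build finishes, close VS.NET 2005, go to the TemplateVSI\bin folder under the extraction directory, double-click AjaxControlExtender.vsi, choose the versions to install in the dialog that appears (VB, C#, or both), and click 'Next' until it completes.<br />
&nbsp;&nbsp;&nbsp; That completes the installation. Now open VS.NET 2005: when you create a new web site, 'My Templates' contains a new 'AJAX Control Toolkit Web Site' entry. Select it, set 'Location' and 'Language', and click 'OK'.<br />
&nbsp;&nbsp;&nbsp; Open Default.aspx and you will find that the earlier ScriptManager control has become ToolkitScriptManager, and that the site's bin directory now holds a dll file and many folders; these folders are resource files for different languages. So which new controls have been added? Quite a lot, in fact, yet opening the 'Toolbox' shows nothing new. The controls have to be added to the 'Toolbox' by hand. Right-click in the 'Toolbox', choose 'Add Tab', type AJAX Control Toolkit, and press Enter. Then right-click inside the new tab and choose 'Choose Items', click 'Browse' in the dialog that appears, navigate to <font face="Courier New">C:\Program Files\Microsoft ASP.NET\ASP.NET 2.0 AJAX Extensions\AjaxControlToolkit\AjaxControlToolkit\bin\Release</font>, select <font face="Courier New">AjaxControlToolkit.dll</font>, and click 'Open' and then 'OK'. Now we have them: a large set of controls that future development will rely on.<br />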
<br />
</blockquote>
]]></description></item><item><title>Is an if Needed Before delete?</title><link>http://www.cnblogs.com/Alacky/archive/2007/11/12/957290.html</link><dc:creator>*Alacky</dc:creator><author>*Alacky</author><pubDate>Mon, 12 Nov 2007 14:05:00 GMT</pubDate><guid>http://www.cnblogs.com/Alacky/archive/2007/11/12/957290.html</guid><wfw:comment>http://www.cnblogs.com/Alacky/comments/957290.html</wfw:comment><comments>http://www.cnblogs.com/Alacky/archive/2007/11/12/957290.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cnblogs.com/Alacky/comments/commentRss/957290.html</wfw:commentRss><trackback:ping>http://www.cnblogs.com/Alacky/services/trackbacks/957290.html</trackback:ping><description><![CDATA[<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <font face="Courier New"><a href="http://www.parashift.com/c++-faq-lite/freestore-mgmt.html#faq-16.8">http://www.parashift.com/c++-faq-lite/freestore-mgmt.html#faq-16.8</a> says that no if check is needed, but is that really fine?</p>
<div style="border-right: #cccccc 1px solid; padding-right: 5px; border-top: #cccccc 1px solid; padding-left: 4px; font-size: 13px; padding-bottom: 4px; border-left: #cccccc 1px solid; width: 98%; word-break: break-all; padding-top: 4px; border-bottom: #cccccc 1px solid; background-color: #eeeeee"><span style="color: #000000">&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000ff">char</span><span style="color: #000000">*</span><span style="color: #000000">&nbsp;ptr&nbsp;</span><span style="color: #000000">=</span><span style="color: #000000">&nbsp;</span><span style="color: #0000ff">new</span><span style="color: #000000">&nbsp;</span><span style="color: #0000ff">char</span><span style="color: #000000">();<br />
</span><span style="color: #000000">&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #000000">*</span><span style="color: #000000">ptr&nbsp;</span><span style="color: #000000">=</span><span style="color: #000000">&nbsp;</span><span style="color: #000000">'</span><span style="color: #000000">a</span><span style="color: #000000">'</span><span style="color: #000000">;<br />
</span><span style="color: #000000">&nbsp;&nbsp;&nbsp;&nbsp;delete&nbsp;ptr;<br />
</span><span style="color: #000000">&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #008000">//</span><span style="color: #008000">ptr&nbsp;=&nbsp;NULL;</span><span style="color: #008000"><br />
</span><span style="color: #000000">&nbsp;&nbsp;&nbsp;&nbsp;delete&nbsp;ptr;</span></div>
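<p></font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; The code above compiles and runs without complaint in Dev-C++, but in VC++ it reports a memory error. Deleting the same pointer twice is undefined behavior, so either outcome is possible. This may be why we add an if check before delete. But is the code safe once the check is added? See below.</p>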
<div style="border-right: #cccccc 1px solid; padding-right: 5px; border-top: #cccccc 1px solid; padding-left: 4px; font-size: 13px; padding-bottom: 4px; border-left: #cccccc 1px solid; width: 98%; word-break: break-all; padding-top: 4px; border-bottom: #cccccc 1px solid; background-color: #eeeeee"><span style="color: #000000">&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000ff">char</span><span style="color: #000000">*</span><span style="color: #000000">&nbsp;ptr&nbsp;</span><span style="color: #000000">=</span><span style="color: #000000">&nbsp;</span><span style="color: #0000ff">new</span><span style="color: #000000">&nbsp;</span><span style="color: #0000ff">char</span><span style="color: #000000">();<br />
</span><span style="color: #000000">&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #000000">*</span><span style="color: #000000">ptr&nbsp;</span><span style="color: #000000">=</span><span style="color: #000000">&nbsp;</span><span style="color: #000000">'</span><span style="color: #000000">a</span><span style="color: #000000">'</span><span style="color: #000000">;<br />
</span><span style="color: #000000">&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000ff">if</span><span style="color: #000000">(ptr)<br />
</span><span style="color: #000000">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;delete&nbsp;ptr;<br />
</span><span style="color: #000000">&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #008000">//</span><span style="color: #008000">ptr&nbsp;=&nbsp;NULL;</span><span style="color: #008000"><br />
</span><span style="color: #000000">&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000ff">if</span><span style="color: #000000">(ptr)<br />
</span><span style="color: #000000">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;delete&nbsp;ptr;</span></div>
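&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Dev-C++ reports no error and VC++ still reports one. Why? Because delete does not set the pointer to NULL (address 0). It destroys the object and releases its memory, but the pointer variable keeps its old address, which is now dangling. Both if tests are therefore true and delete runs twice, exactly as in the first snippet, and deleting twice causes a memory error in VC++.<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; The page linked at the top contains this sentence: The C++ language guarantees that <nobr><tt>delete p</tt></nobr> will do nothing if <tt>p</tt> is equal to <tt>NULL</tt>. That settles it: the pointer is not set to NULL after delete, so the second delete acts on an already released object, and VC++ reports it. Uncomment the <tt>ptr = NULL;</tt> line in the code above and it compiles and runs correctly with both compilers.<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; So should the if be added? That is a matter of personal taste, since delete on a NULL pointer does nothing, but never forget to assign NULL to a pointer after deleting it.<br />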
]]></description></item><item><title>[Repost] MySQL Column Type Reference</title><link>http://www.cnblogs.com/Alacky/articles/937826.html</link><dc:creator>*Alacky</dc:creator><author>*Alacky</author><pubDate>Thu, 25 Oct 2007 13:23:00 GMT</pubDate><guid>http://www.cnblogs.com/Alacky/articles/937826.html</guid><wfw:comment>http://www.cnblogs.com/Alacky/comments/937826.html</wfw:comment><comments>http://www.cnblogs.com/Alacky/articles/937826.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cnblogs.com/Alacky/comments/commentRss/937826.html</wfw:commentRss><trackback:ping>http://www.cnblogs.com/Alacky/services/trackbacks/937826.html</trackback:ping><description><![CDATA[<font face="Courier New">
<p><br />
Column types:<br />
TINYINT[(M)] [UNSIGNED] [ZEROFILL] <br />
A very small integer. The signed range is -128 to 127; the unsigned range is 0 to 255. </p>
<p>SMALLINT[(M)] [UNSIGNED] [ZEROFILL] <br />
A small integer. The signed range is -32768 to 32767; the unsigned range is 0 to 65535. </p>
<p>MEDIUMINT[(M)] [UNSIGNED] [ZEROFILL] <br />
A medium-sized integer. The signed range is -8388608 to 8388607; the unsigned range is 0 to 16777215. </p>
<p>INT[(M)] [UNSIGNED] [ZEROFILL] <br />
A normal-sized integer. The signed range is -2147483648 to 2147483647; the unsigned range is 0 to 4294967295. </p>
<p>INTEGER[(M)] [UNSIGNED] [ZEROFILL] <br />
This is a synonym for INT. </p>
<p>BIGINT[(M)] [UNSIGNED] [ZEROFILL] <br />
A large integer. The signed range is -9223372036854775808 to 9223372036854775807; the unsigned range is 0 to 18446744073709551615.</p>
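<p>The integer ranges above follow directly from the storage size of each type (1, 2, 3, 4, and 8 bytes, as listed in the numeric types table of this reference). A minimal sketch in Python; the function name <code>int_range</code> and the <code>SIZES</code> table are illustrative, not part of MySQL.</p>

```python
# Storage sizes in bytes, taken from the numeric types table of this reference.
SIZES = {"TINYINT": 1, "SMALLINT": 2, "MEDIUMINT": 3, "INT": 4, "BIGINT": 8}

def int_range(num_bytes, unsigned=False):
    """Return (min, max) for a two's-complement integer of num_bytes bytes."""
    bits = 8 * num_bytes
    if unsigned:
        return 0, 2 ** bits - 1
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

for name, size in SIZES.items():
    print(name, int_range(size), int_range(size, unsigned=True))
```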
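<p>FLOAT[(M,D)] [ZEROFILL] <br />
A small (single-precision) floating-point number. Cannot be unsigned. Allowed values are -3.402823466E+38 to -1.175494351E-38, 0, and 1.175494351E-38 to 3.402823466E+38. M is the display width and D is the number of decimal places. FLOAT without an argument, or with a single argument of &lt;= 24, denotes a single-precision floating-point number. </p>
<p>DOUBLE[(M,D)] [ZEROFILL] <br />
A normal-sized (double-precision) floating-point number. Cannot be unsigned. Allowed values are -1.7976931348623157E+308 to -2.2250738585072014E-308, <br />
0, and 2.2250738585072014E-308 to 1.7976931348623157E+308.</p>
<p>DOUBLE PRECISION[(M,D)] [ZEROFILL] <br />
　 <br />
REAL[(M,D)] [ZEROFILL] <br />
These are synonyms for DOUBLE. </p>
<p>DECIMAL[(M[,D])] [ZEROFILL] <br />
An unpacked floating-point number. Cannot be unsigned. Behaves like a CHAR column: "unpacked" means the number is stored as a string, using one character for each digit of the value.</p>
<p>NUMERIC(M,D) [ZEROFILL] <br />
This is a synonym for DECIMAL. </p>
<p>DATE <br />
A date. The supported range is '1000-01-01' to '9999-12-31'. MySQL displays DATE values in 'YYYY-MM-DD' format but allows you to assign values to DATE columns using either strings or numbers. </p>
<p>DATETIME <br />
A date and time combination. The supported range is '1000-01-01 00:00:00' to '9999-12-31 23:59:59'. MySQL displays DATETIME values in 'YYYY-MM-DD HH:MM:SS' format but allows you to assign values to DATETIME columns using either strings or numbers. </p>
<p>TIMESTAMP[(M)] <br />
A timestamp. The range is '1970-01-01 00:00:00' to sometime in the year 2037. MySQL displays TIMESTAMP values in YYYYMMDDHHMMSS, YYMMDDHHMMSS, YYYYMMDD, or YYMMDD format, depending on whether M is 14 (or omitted), 12, 8, or 6, but allows you to assign values to TIMESTAMP columns using either strings or numbers. A TIMESTAMP column is useful for recording the date and time of an INSERT or UPDATE operation, because it is automatically set to the date and time of the most recent operation if you do not assign it a value yourself. You can also set it to the current date and time by assigning it a NULL value.</p>
<p>TIME <br />
A time. The range is '-838:59:59' to '838:59:59'. MySQL displays TIME values in 'HH:MM:SS' format but allows you to assign values to TIME columns using either strings or numbers. </p>
<p>YEAR[(2|4)] <br />
A year in 2- or 4-digit format (the default is 4 digits). The allowed values are 1901 to 2155 and 0000 in the 4-digit format, and 1970-2069 (70-69) if you use 2 digits. MySQL displays YEAR values in YYYY format but allows you to assign values to YEAR columns using either strings or numbers. (The YEAR type is new in MySQL 3.22.) </p>
<p>CHAR(M) [BINARY] <br />
A fixed-length string that is always right-padded with spaces to the specified length when stored. The range of M is 1 to 255 characters. Trailing spaces are removed when the value is retrieved. CHAR values are sorted and compared in case-insensitive fashion according to the default character set unless the BINARY keyword is given. NATIONAL CHAR (short form NCHAR) is the ANSI SQL way to define that a CHAR column should use the default character set. This is the default in MySQL. CHAR is a shorthand for CHARACTER. </p>
<p>[NATIONAL] VARCHAR(M) [BINARY] <br />
A variable-length string. Note: trailing spaces are removed when the value is stored (this differs from the ANSI SQL specification). The range of M is 1 to 255 characters. VARCHAR values are sorted and compared in case-insensitive fashion according to the default character set unless the BINARY keyword is given. See section 7.7.1, Silent column specification changes. VARCHAR is a shorthand for CHARACTER VARYING. </p>
<p>TINYBLOB <br />
　 <br />
TINYTEXT <br />
A BLOB or TEXT column with a maximum length of 255 (2^8-1) characters</p>
<p>BLOB <br />
　 <br />
TEXT <br />
A BLOB or TEXT column with a maximum length of 65535 (2^16-1) characters</p>
<p>MEDIUMBLOB <br />
　 <br />
MEDIUMTEXT <br />
A BLOB or TEXT column with a maximum length of 16777215 (2^24-1) characters</p>
<p>LONGBLOB <br />
　 <br />
LONGTEXT <br />
A BLOB or TEXT column with a maximum length of 4294967295 (2^32-1) characters</p>
<p>ENUM('value1','value2',...) <br />
An enumeration. A string object that can have only one value, chosen from the value list 'value1', 'value2', ..., or NULL. An ENUM can have a maximum of 65535 distinct values. </p>
<p>SET('value1','value2',...) <br />
A set. A string object that can have zero or more values, each of which must be chosen from the value list 'value1', 'value2', .... A SET can have a maximum of 64 members. <br />
</font></p>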
<br />
<hr />
<h3><span class="sfont">&nbsp;Numeric types</span></h3>
<table class="p4" width="80%" border="1" nosave="#101090">
    <tbody>
        <tr>
            <td><strong>Column type</strong> </td>
            <td><strong>Storage required</strong> </td>
        </tr>
        <tr>
            <td><code>TINYINT</code> </td>
            <td>1 byte</td>
        </tr>
        <tr>
            <td><code>SMALLINT</code> </td>
            <td>2 bytes</td>
        </tr>
        <tr>
            <td><code>MEDIUMINT</code> </td>
            <td>3 bytes</td>
        </tr>
        <tr>
            <td><code>INT</code> </td>
            <td>4 bytes</td>
        </tr>
        <tr>
            <td><code>INTEGER</code> </td>
            <td>4 bytes</td>
        </tr>
        <tr>
            <td><code>BIGINT</code> </td>
            <td>8 bytes</td>
        </tr>
        <tr>
            <td><code>FLOAT(X)</code> </td>
            <td>4 bytes if X &lt;= 24, or 8 bytes if 25 &lt;= X &lt;= 53</td>
        </tr>
        <tr>
            <td><code>FLOAT</code> </td>
            <td>4 bytes</td>
        </tr>
        <tr>
            <td><code>DOUBLE</code> </td>
            <td>8 bytes</td>
        </tr>
        <tr>
            <td><code>DOUBLE PRECISION</code> </td>
            <td>8 bytes</td>
        </tr>
        <tr>
            <td><code>REAL</code> </td>
            <td>8 bytes</td>
        </tr>
        <tr>
            <td><code>DECIMAL(M,D)</code> </td>
            <td><code>M</code> bytes (<code>D</code>+2, if <code>M &lt; D</code>) </td>
        </tr>
        <tr>
            <td><code>NUMERIC(M,D)</code> </td>
            <td><code>M</code> bytes (<code>D</code>+2, if <code>M &lt; D</code>) </td>
        </tr>
    </tbody>
</table>
<h3><span class="sfont">&nbsp;Date and time types</span></h3>
<table class="p4" width="80%" border="1" nosave="#101090">
    <tbody>
        <tr>
            <td><strong>Column type</strong> </td>
            <td><strong>Storage required</strong> </td>
        </tr>
        <tr>
            <td><code>DATE</code> </td>
            <td>3 bytes</td>
        </tr>
        <tr>
            <td><code>DATETIME</code> </td>
            <td>8 bytes</td>
        </tr>
        <tr>
            <td><code>TIMESTAMP</code> </td>
            <td>4 bytes</td>
        </tr>
        <tr>
            <td><code>TIME</code> </td>
            <td>3 bytes</td>
        </tr>
        <tr>
            <td><code>YEAR</code> </td>
            <td>1 byte</td>
        </tr>
    </tbody>
</table>
<h3><span class="sfont">&nbsp;串类型</span></h3>
<table class="p4" width="80%" border="1" nosave="#101090">
    <tbody>
        <tr>
            <td><strong>Column type</strong> </td>
            <td><strong>Storage required</strong> </td>
        </tr>
        <tr>
            <td><code>CHAR(M)</code> </td>
            <td><code>M</code> bytes, <code>1 &lt;= M &lt;= 255</code> </td>
        </tr>
        <tr>
            <td><code>VARCHAR(M)</code> </td>
            <td><code>L</code>+1 bytes, where <code>L &lt;= M</code> and <code>1 &lt;= M &lt;= 255</code> </td>
        </tr>
        <tr>
            <td><code>TINYBLOB</code>, <code>TINYTEXT</code> </td>
            <td><code>L</code>+1 bytes, where <code>L</code> &lt; 2^8</td>
        </tr>
        <tr>
            <td><code>BLOB</code>, <code>TEXT</code> </td>
            <td><code>L</code>+2 bytes, where <code>L</code> &lt; 2^16</td>
        </tr>
        <tr>
            <td><code>MEDIUMBLOB</code>, <code>MEDIUMTEXT</code> </td>
            <td><code>L</code>+3 bytes, where <code>L</code> &lt; 2^24</td>
        </tr>
        <tr>
            <td><code>LONGBLOB</code>, <code>LONGTEXT</code> </td>
            <td><code>L</code>+4 bytes, where <code>L</code> &lt; 2^32</td>
        </tr>
        <tr>
            <td><code>ENUM('value1','value2',...)</code> </td>
            <td>1 or 2 bytes, depending on the number of enumeration values (65535 at most)</td>
        </tr>
        <tr>
            <td><code>SET('value1','value2',...)</code> </td>
            <td>1, 2, 3, 4 or 8 bytes, depending on the number of set members (64 at most)</td>
        </tr>
    </tbody>
</table>
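<p>The variable-size rows above can be sketched the same way. The function names here are hypothetical. The table only says that <code>ENUM</code> and <code>SET</code> sizes depend on the number of values; the exact thresholds used below (1 byte for up to 255 enumeration values, one bit per set member rounded up to 1, 2, 3, 4 or 8 bytes) come from the MySQL manual's description and are assumptions as far as this table is concerned.</p>

```javascript
// Length-prefix size in bytes for each variable-length type in the table.
var PREFIX_BYTES = {
    "VARCHAR": 1, "TINYBLOB": 1, "TINYTEXT": 1,
    "BLOB": 2, "TEXT": 2,
    "MEDIUMBLOB": 3, "MEDIUMTEXT": 3,
    "LONGBLOB": 4, "LONGTEXT": 4
};

// L + prefix bytes, where L is the actual length in bytes of the stored value.
function varStorageBytes(type, length) {
    if (!Object.prototype.hasOwnProperty.call(PREFIX_BYTES, type)) {
        throw new Error("Not a variable-length type: " + type);
    }
    var prefix = PREFIX_BYTES[type];
    // A prefix of n bytes can only describe lengths below 2^(8n).
    if (0 > length || length >= Math.pow(2, 8 * prefix)) {
        throw new Error("Length out of range for " + type + ": " + length);
    }
    return length + prefix;
}

// ENUM: 1 byte for up to 255 values, 2 bytes for up to 65535.
function enumStorageBytes(count) {
    if (1 > count || count > 65535) {
        throw new Error("ENUM value count out of range: " + count);
    }
    return count > 255 ? 2 : 1;
}

// SET: one bit per member, rounded up to 1, 2, 3, 4 or 8 bytes.
function setStorageBytes(members) {
    if (1 > members || members > 64) {
        throw new Error("SET member count out of range: " + members);
    }
    var bytes = Math.ceil(members / 8);
    return bytes > 4 ? 8 : bytes;
}
```

<p>So a 100-byte value in a <code>BLOB</code> column takes 100 + 2 bytes, and a <code>SET</code> with 33 members already needs the full 8 bytes.</p>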
]]></description></item><item><title>Creating a UTF-8 encoded text file in JS</title><link>http://www.cnblogs.com/Alacky/archive/2007/10/19/929743.html</link><dc:creator>*Alacky</dc:creator><author>*Alacky</author><pubDate>Thu, 18 Oct 2007 16:12:00 GMT</pubDate><guid>http://www.cnblogs.com/Alacky/archive/2007/10/19/929743.html</guid><wfw:comment>http://www.cnblogs.com/Alacky/comments/929743.html</wfw:comment><comments>http://www.cnblogs.com/Alacky/archive/2007/10/19/929743.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cnblogs.com/Alacky/comments/commentRss/929743.html</wfw:commentRss><trackback:ping>http://www.cnblogs.com/Alacky/services/trackbacks/929743.html</trackback:ping><description><![CDATA[<font face="Verdana">&nbsp;&nbsp;&nbsp;&nbsp;I originally used FSO to work with files, but it does not support creating UTF-8 encoded files, only reading them, so creating a UTF-8 encoded file has to rely on ADODB.Stream.<br />
<br />
<div style="border-right: #cccccc 1px solid; padding-right: 5px; border-top: #cccccc 1px solid; padding-left: 4px; font-size: 13px; padding-bottom: 4px; border-left: #cccccc 1px solid; width: 98%; word-break: break-all; padding-top: 4px; border-bottom: #cccccc 1px solid; background-color: #eeeeee">var&nbsp;ret&nbsp;=&nbsp;"";<br />
try<br />
{<br />
&nbsp;&nbsp;&nbsp;&nbsp;//&nbsp;ADODB.Stream&nbsp;writes&nbsp;text&nbsp;in&nbsp;the&nbsp;charset&nbsp;chosen&nbsp;below<br />
&nbsp;&nbsp;&nbsp;&nbsp;var&nbsp;fs&nbsp;=&nbsp;new&nbsp;ActiveXObject("ADODB.Stream");<br />
&nbsp;&nbsp;&nbsp;&nbsp;fs.Type&nbsp;=&nbsp;2;&nbsp;//&nbsp;2&nbsp;=&nbsp;adTypeText&nbsp;(text&nbsp;stream)<br />
&nbsp;&nbsp;&nbsp;&nbsp;fs.Charset&nbsp;=&nbsp;"utf-8";<br />
&nbsp;&nbsp;&nbsp;&nbsp;fs.Open();<br />
&nbsp;&nbsp;&nbsp;&nbsp;fs.WriteText("文本测试");<br />
&nbsp;&nbsp;&nbsp;&nbsp;fs.SaveToFile("E:\\jsFile.html",&nbsp;2);&nbsp;//&nbsp;2&nbsp;=&nbsp;adSaveCreateOverWrite&nbsp;(overwrite&nbsp;an&nbsp;existing&nbsp;file)<br />
&nbsp;&nbsp;&nbsp;&nbsp;fs.Close();<br />
}<br />
catch(e)<br />
{<br />
&nbsp;&nbsp;&nbsp;&nbsp;//&nbsp;Collect&nbsp;the&nbsp;error&nbsp;number&nbsp;and&nbsp;description&nbsp;instead&nbsp;of&nbsp;rethrowing<br />
&nbsp;&nbsp;&nbsp;&nbsp;ret&nbsp;+=&nbsp;"Err:\n"&nbsp;+&nbsp;e.number&nbsp;+&nbsp;"&nbsp;&nbsp;Des:"&nbsp;+&nbsp;e.description;<br />
}<br />
ret;</div>
</font>
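<p>To check what ends up in the saved file, it can help to know the expected bytes. The sketch below is plain JavaScript with a hypothetical helper name, <code>utf8BytesWithBom</code>; it assumes, as is commonly observed, that ADODB.Stream writes a UTF-8 byte order mark (EF BB BF) in front of the text when <code>Charset</code> is <code>"utf-8"</code>.</p>

```javascript
// Returns the bytes of `text` as they would appear in a UTF-8 file
// that starts with a byte order mark.
function utf8BytesWithBom(text) {
    // encodeURIComponent emits %XX escapes of the UTF-8 bytes;
    // unescape turns each escape into one character with code 0-255.
    var binary = unescape(encodeURIComponent(text));
    var bytes = [0xEF, 0xBB, 0xBF]; // UTF-8 BOM
    for (var i = 0; binary.length > i; i++) {
        bytes.push(binary.charCodeAt(i));
    }
    return bytes;
}
```

<p>Each of the four characters in the test string above takes 3 bytes in UTF-8, so together with the BOM the file should be 15 bytes long if this assumption holds.</p>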
]]></description></item></channel></rss>