Chapter 8: Document Management (Part 2)

 

Text Representation

Any text we work with has to be represented in memory in some manner before it can be edited. Editors use only a handful of alternative representations, which fall into two categories:

Basic sequence data structures

These are the simplest possible data structures for representing text.

Composite sequence data structures

These are built from nested or, more commonly, linked basic data structures.

The terminology used here was introduced in C.S. Crowley's paper on data structures for text representation (see http://www.cs.unm.edu/~crowley/papers/sds/sds.html), although Crowley thinks that the composite data structures should be termed recursive. Composite structures are, of course, the more flexible approach. Yet for the time being, we have decided to go with one of the basic sequence data structures, the gap buffer.

Let's look at basic sequence data structures in detail.

Basic Sequence Data Structures

To justify our choice of the gap buffer for the editor, we will first go through the basic data structures used in editors and then discuss their pros and cons.

As we discussed in Chapter 2, choosing the appropriate data structure for text representation in the editor was a major issue in the history of SharpDevelop. First, we will look at the basic data structures that we use now.

Arrays

The simplest data structure capable of representing editable text is an array or a string, which treats the text to edit as a single contiguous block of memory.

Obviously, this approach is not good from a performance point of view, because inserting text increases the size of the buffer. Consider a text buffer that holds a typical source code file of, say, 500 to 1,000 lines with a maximum line length of 80 characters in one huge string. Inserting or deleting even a single character requires moving every character to the right of the insertion point by one. If this happens close to the beginning of the string, we have to move almost the entire string. If we type a completely new sequence at this insertion point, the move has to be repeated for every character typed.
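To make the cost concrete, here is a minimal sketch that counts how many characters have to be shifted for a single insertion. It is written in Python rather than SharpDevelop's C#, and the function name insert_char and the buffer size are assumptions chosen for illustration, not code from SharpDevelop.

```python
def insert_char(buffer, offset, ch):
    """Insert ch at offset by shifting the tail right; return the number of moves."""
    buffer.append("")                      # grow the buffer by one slot
    moved = 0
    for i in range(len(buffer) - 1, offset, -1):
        buffer[i] = buffer[i - 1]          # shift one character to the right
        moved += 1
    buffer[offset] = ch
    return moved


# Roughly 1,000 lines of 80 characters held in one contiguous buffer.
text = list("x" * 80 * 1000)
moves_near_start = insert_char(text, 10, "a")
moves_at_end = insert_char(text, len(text), "b")
```

Inserting near the start shifts almost the whole buffer, while appending at the end shifts nothing, which is why typing at the top of a large file is the worst case for this representation.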

Replacement might be a different matter, though, as the rest of the string does not need to move. However, this would require a clever bit of code that figures out whether we are first deleting and then inserting, or first inserting and then deleting; both are equivalent to a replace operation. Replacing a highlighted character, by contrast, would be a simple replace operation without any shifting.

You can experiment with this approach yourself using the following implementation, which has been included for debugging and demonstration purposes:

src\SharpDevelop\ICSharpCode.TextEditor\Document\TextBufferStrategy\StringTextBufferStrategy.cs

[Code block]

The text in the buffer consists of a single string manipulated using the .NET string functions. Replacement is done by removing and inserting at the corresponding position. It is nevertheless implemented as an independent method rather than left as two separate calls, because different buffer strategies can implement replacement in different ways. The position at which the manipulation takes place is given as an offset from the start of the string.

Although the code is simple, it hides a performance penalty. One might say that the .NET Framework handles moving characters around and that we are not directly concerned with it; however, this isn't exactly true. .NET does not move characters around, as string objects are immutable once they have been created. Each editing action requires a new string object containing the new text to be created. The performance penalty due to this is extremely high, as object creation uses precious resources. If you dare, you can try this out for yourself.
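The same effect can be observed in any language with immutable strings. The following is a minimal sketch in Python (not the C# of StringTextBufferStrategy.cs); the function names insert, remove, and replace are assumptions for illustration. Every edit builds a completely new string and leaves the old one untouched.

```python
def insert(text, offset, fragment):
    # Builds a new string; the original object is not modified.
    return text[:offset] + fragment + text[offset:]


def remove(text, offset, length):
    return text[:offset] + text[offset + length:]


def replace(text, offset, length, fragment):
    # One step instead of remove followed by insert: a single new string is created.
    return text[:offset] + fragment + text[offset + length:]


original = "Formatting"
edited = insert(original, 6, "s")
replaced = replace(original, 6, 4, "string")
```

Note that replace creates one new string where a remove followed by an insert would create two, which hints at why replacement is worth implementing as its own operation.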


 

Linked Lists

Another basic sequence data structure is the linked list. This approach assigns an individual block to each character in the edit buffer and then links these blocks. Insert and delete operations are, of course, easy to implement with this approach, as we just need to adjust links and, in the case of an insert, assign a value to the new block.

Replacing is simpler still: we just assign a new value to the block. If this approach makes handling an edit buffer so easy, then why did we decide against using it in SharpDevelop? The key here is memory requirements. On the one hand, we need to store a lot of extra information about the sequence of characters in the buffer, and on the other, the buffer memory becomes increasingly fragmented. This degrades performance, as iterating to a given point takes longer and longer, and the 'subjective speed' a user experiences is what makes an editor pleasant to work with.
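A minimal sketch of the one-block-per-character idea, in Python, may help to show both sides of the tradeoff. The class Node and the helper functions are hypothetical names for illustration only. Editing is a matter of relinking, but reaching an offset means walking the list, and every character carries an extra link.

```python
class Node:
    """One block per character, plus the link to the next block."""

    def __init__(self, ch, next=None):
        self.ch = ch
        self.next = next


def from_text(text):
    head = None
    for ch in reversed(text):
        head = Node(ch, head)
    return head


def to_text(head):
    chars = []
    node = head
    while node is not None:
        chars.append(node.ch)
        node = node.next
    return "".join(chars)


def node_at(head, offset):
    # Reaching an offset requires walking the links one by one.
    node = head
    for _ in range(offset):
        node = node.next
    return node


def insert_after(node, ch):
    node.next = Node(ch, node.next)     # adjust one link, assign one value


def delete_after(node):
    node.next = node.next.next          # unlink the following block


head = from_text("Formating")
insert_after(node_at(head, 5), "t")     # insert: relink only
after_insert = to_text(head)
node_at(head, 0).ch = "f"               # replace: assign a new value
after_replace = to_text(head)
delete_after(node_at(head, 4))          # delete: relink only
after_delete = to_text(head)
```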

The Gap Buffer Approach

For SharpDevelop, we decided to settle on the gap-buffer approach, as it is an efficient tradeoff between an array and a linked list. A gap buffer is a data structure in which two stretches of contiguous text in an array are separated by an empty stretch of the array (the gap) that is invisible to the user. This representation is also known as the buffer-gap, split-block, or two-span approach.

Theory of the Gap Buffer

The following figure illustrates the layout of a gap buffer containing the text Formatting, with the caret sitting between the second t and the i:

The numbers above the boxes are the user co-ordinates, as they represent the user's view of the edit buffer, and the ones below are the gap co-ordinates used internally in handling the buffer. Both co-ordinate systems refer to the locations between the characters in the buffer. The user and the caret see the user co-ordinates only, which are used in our implementation independent of the internal state of the gap.
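The relationship between the two co-ordinate systems can be sketched in a few lines of Python. The function name gap_index and the gap size of five slots are assumptions made for illustration; the layout mirrors the Formatting example, with the gap sitting between the second t and the i.

```python
def gap_index(user_index, gap_start, gap_end):
    """Translate the user-visible index of a character into its index in the raw array."""
    if user_index < gap_start:
        return user_index                          # before the gap: same index
    return user_index + (gap_end - gap_start)      # after the gap: skip over it


# "Formatt", then a gap of five unused slots, then "ing".
raw = list("Formatt") + [None] * 5 + list("ing")
gap_start, gap_end = 7, 12

visible = "".join(raw[gap_index(i, gap_start, gap_end)] for i in range(10))
```

The user only ever works with the indices 0 to 9; where the gap currently sits is an internal detail.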

Note that when an editing action occurs, the gap is always to the left of the caret, but editing is done at the left end of the gap. Let's assume that we want to delete the t. The result is shown in the figure:

The gap has grown as a result of our deletion, while the total buffer size in memory stays constant. Both the minimum and the maximum size of the gap are usually set by the programmer. Only when the gap needs to be 'regrown' from its minimum do we need to allocate additional memory and thus increase the buffer size.

Next, we will try inserting characters, resulting in the string Formatstring being displayed:

Now the gap has shrunk. A gap-buffer implementation always checks the gap against its minimum and maximum sizes and adjusts it if the size constraint is violated, for example when inserting more text than fits into the gap or when deleting spans of text longer than the maximum gap size. The total buffer size changes in this case. As the minimum and maximum sizes of the gap depend on the programmer's decisions about the buffer's purpose and its expected size, fixing them is a matter of experience, so we will not give recommendations.

One of the advantages of the gap-buffer representation is that the gap needs to be moved only when editing takes place at a position different from the current gap position. This means that caret movements from line to line, or paging up and down, can be performed without updating the internal representation of the edit buffer; moving the gap every time the caret position changes would be too expensive. This is why we use different co-ordinate systems internally and externally.
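The behaviour described above can be sketched as a small Python class. This is a minimal illustration under stated assumptions, not the SharpDevelop implementation: the names GapBuffer, place_gap, and grow_gap are hypothetical, only a minimum gap size is modelled, and offsets are user co-ordinates.

```python
class GapBuffer:
    """Text held in one list, with a gap of unused slots at the edit position."""

    def __init__(self, text="", gap_size=4):
        self.min_gap = gap_size
        self.buf = list(text) + [None] * gap_size   # text followed by the gap
        self.gap_start = len(text)                  # first slot of the gap
        self.gap_end = len(self.buf)                # first used slot after the gap

    def text(self):
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])

    def place_gap(self, offset):
        """Move the gap so that it starts at user offset `offset`."""
        if offset < self.gap_start:
            # Gap moves left: copy the text between offset and the gap to the gap's far end.
            count = self.gap_start - offset
            self.buf[self.gap_end - count:self.gap_end] = self.buf[offset:self.gap_start]
        else:
            # Gap moves right: copy the text just after the gap to the gap's near end.
            count = offset - self.gap_start
            self.buf[self.gap_start:offset] = self.buf[self.gap_end:self.gap_end + count]
        self.gap_end += offset - self.gap_start
        self.gap_start = offset

    def grow_gap(self, needed):
        """Allocate extra slots; only here does the total buffer size change."""
        extra = needed + self.min_gap
        self.buf[self.gap_end:self.gap_end] = [None] * extra
        self.gap_end += extra

    def insert(self, offset, fragment):
        self.place_gap(offset)
        if len(fragment) > self.gap_end - self.gap_start:
            self.grow_gap(len(fragment))
        self.buf[self.gap_start:self.gap_start + len(fragment)] = list(fragment)
        self.gap_start += len(fragment)             # the gap shrinks

    def remove(self, offset, length):
        self.place_gap(offset)
        self.gap_end += length                      # the gap swallows the deleted text


gb = GapBuffer("Formatting", gap_size=4)
size_before = len(gb.buf)
gb.remove(6, 1)                 # delete the second t
after_delete = gb.text()
gb.insert(6, "str")             # type "str" at the same position
after_insert = gb.text()
size_after = len(gb.buf)
```

Deleting the t and typing str leaves the total buffer size unchanged; only the gap boundaries move. Caret movement never calls place_gap, so it costs nothing until the next edit.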

The Gap Buffer in Practice

The implementation of the gap buffer in SharpDevelop can be found in the following file:

src\SharpDevelop\ICSharpCode.TextEditor\Document\TextBufferStrategy\GapTextBufferStrategy.cs

[Code block]

Initialization of the buffer is simple. We allocate space for the text to be buffered plus the size of the gap. In this case, the size decisions are based on experience. The Insert and Remove routines are treated as special cases of the replace operation: inserting replaces 'nothing' with text, and deleting replaces text with 'nothing'.

Things get interesting as soon as we get to the Replace part, which is not as simple as first deleting and then inserting, as it is in the StringTextBuffer code:

[Code block]

After checking for an empty buffer, we place the gap where the edit action is to take place. Then we work with the gap and text.copy, first copying the text into the gap and then resizing the gap accordingly. We need to take the limits on gap size into account and readjust the gap when necessary. This is done by the if statement at the end of the routine.

The interesting part is how the gap is moved, depending on where it is supposed to go (before or after the current position), and whether it has to be resized. All of this happens in the PlaceGap routine, which we called in the above code:

[Code block]

Dynamic resizing of the arrays is handled by our code for the gap-buffer text representation. Using the Array.Copy method improves performance compared with moving buffer elements ourselves, as hand-written movement of the buffer elements would incur a much higher number of low-level memory operations, with the corresponding overhead. This follows from a simple rule that every programmer should keep in mind: don't reinvent the wheel when it already exists and works well.

The interface for the text representation is defined in the following file:

src\SharpDevelop\ICSharpCode.TextEditor\Document\TextBufferStrategy\ITextBufferStrategy.cs

Here is a short excerpt to give you an idea of this:

[Code block]

It is a good idea to name the pattern used in an interface so that developers using this interface know what to expect. The implementations of the methods in the text buffer follow the Strategy pattern discussed in detail in Chapter 2.
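The idea of interchangeable buffer strategies, with insert and remove expressed as special cases of replace, can be sketched as follows. This is a Python illustration under assumptions: the names TextBufferStrategy, StringStrategy, ListStrategy, and get_text are hypothetical and do not reproduce the members of ITextBufferStrategy.cs.

```python
from abc import ABC, abstractmethod


class TextBufferStrategy(ABC):
    """Common interface; callers do not know which representation is behind it."""

    @abstractmethod
    def replace(self, offset, length, text):
        ...

    @abstractmethod
    def get_text(self):
        ...

    def insert(self, offset, text):
        self.replace(offset, 0, text)        # replace 'nothing' with text

    def remove(self, offset, length):
        self.replace(offset, length, "")     # replace text with 'nothing'


class StringStrategy(TextBufferStrategy):
    """Keeps the text in one immutable string (simple, but copies on every edit)."""

    def __init__(self, text=""):
        self._text = text

    def replace(self, offset, length, text):
        self._text = self._text[:offset] + text + self._text[offset + length:]

    def get_text(self):
        return self._text


class ListStrategy(TextBufferStrategy):
    """Keeps the text in a mutable list of characters."""

    def __init__(self, text=""):
        self._chars = list(text)

    def replace(self, offset, length, text):
        self._chars[offset:offset + length] = text

    def get_text(self):
        return "".join(self._chars)


def edit(buffer):
    # Client code is written against the interface only.
    buffer.remove(6, 1)
    buffer.insert(6, "str")
    return buffer.get_text()


results = [edit(StringStrategy("Formatting")), edit(ListStrategy("Formatting"))]
```

Because the client code depends only on the interface, swapping one representation for another requires no change to the code that edits the document.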

The Future: The Piece Table

For the time being, this representation works well; however, it might become necessary to switch to a different text representation in the future. Having discussed the various basic sequence data structures, we have the composite sequence data structures left to evaluate.

Composite Sequence Data Structures

There are three types of composite sequence data structures:

The line span data structure

Fixed size buffers

The piece table

Now let's look at each of them.

Line Span Method

The line span data structure consists of a buffer holding a description of the blocks containing individual lines, which makes displaying the buffer content quite easy. However, editing operations can be cumbersome to handle, as operations concerning lines and operations concerning characters within lines have to be handled separately. Avoiding problems with performance and memory with this approach requires extra implementation effort.

Fixed Size Buffers

The use of fixed size buffers avoids some of these problems, but it is inefficient in terms of memory usage, as most of the individual buffers (usually corresponding to lines of fixed length) are rarely fully used. Fortunately, fixed size buffers are more or less an obsolete approach, as dynamic memory management is no longer a problem in modern programming languages.

The Piece Table

Leaving aside possible new composite data representation structures, the most attractive of the composite approaches seems to be the piece table, which can be seen as a combination of the two approaches given. It has neither fixed size buffers nor buffers for individual lines. Instead, the buffer is split into pieces as needed, with a table giving the relationships between the pieces of the buffer; hence the name 'piece table'.

In this approach, we start with a single contiguous block containing the file read from disk. While editing, the file is broken up into pieces, with the edited text stored in a new append-only block. Any new text is appended at the end of this block, outside the read-only block, ensuring that the original text can always stay in read-only blocks. The offset and length information for the individual pieces is kept in the piece table. The individual blocks stay in place as long as the file is open, as any edit action only appends to the append-only block, turning the formerly active segment into a read-only one.

The main advantages of a piece table are that other components of the software can easily access the buffers, as they always stay in place, and that unlimited undo and redo is easy to implement, as all the necessary information resides in the piece table. Access to the buffer is done by looking up the piece table.

Implementing unlimited undo and redo operations directly in the text buffer is not an issue for the time being. We also implemented our own undo functionality, as undoing operations also occur outside the text editor.
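Since SharpDevelop does not use a piece table, the following Python sketch is purely illustrative of the idea described above. The names Piece, PieceTable, and the "original"/"add" labels are assumptions. The original text is never modified, new text is only appended, and edits change nothing but the table.

```python
from dataclasses import dataclass


@dataclass
class Piece:
    source: str    # "original" (read-only block) or "add" (append-only block)
    start: int
    length: int


class PieceTable:
    def __init__(self, text):
        self.original = text     # never modified after loading
        self.add = ""            # new text is only ever appended here
        self.pieces = [Piece("original", 0, len(text))] if text else []

    def text(self):
        blocks = {"original": self.original, "add": self.add}
        return "".join(blocks[p.source][p.start:p.start + p.length] for p in self.pieces)

    def _split(self, offset):
        """Ensure a piece boundary at offset; return the index of the piece starting there."""
        pos = 0
        for i, p in enumerate(self.pieces):
            if offset == pos:
                return i
            if offset < pos + p.length:
                left = offset - pos
                self.pieces[i:i + 1] = [Piece(p.source, p.start, left),
                                        Piece(p.source, p.start + left, p.length - left)]
                return i + 1
            pos += p.length
        return len(self.pieces)

    def insert(self, offset, fragment):
        index = self._split(offset)
        self.pieces.insert(index, Piece("add", len(self.add), len(fragment)))
        self.add += fragment

    def delete(self, offset, length):
        first = self._split(offset)
        last = self._split(offset + length)
        del self.pieces[first:last]      # the text itself stays in its block


table = PieceTable("Formatting")
table.delete(6, 1)
table.insert(6, "str")
result = table.text()
```

Because deleted text remains in its block and only the table entries change, an undo facility could restore an earlier state by restoring earlier table entries.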


Representing Lines

Text representation by a gap buffer implies that all lines are represented by the contents of two contiguous blocks of data, which we then have to divide into chunks representing lines. This task is handled by the Line Manager. Up to SharpDevelop release 0.88b, a Line Tracker Strategy handled this task. This pattern-based approach was abandoned because it proved to be too inflexible for the demands of the SharpDevelop editor. The Strategy pattern is discussed in Chapter 2.

Dividing the contents of the edit buffer into lines serves three purposes – housekeeping of lines, syntax highlighting, and folding administration (keeping a record of which lines are visible and which ones have been 'folded away' from sight). Syntax highlighting has been presented in Chapter 1 and will be discussed in Chapter 9. We will take a look at folding in Chapter 10 and Chapter 11. Beyond these tasks, events are raised when the number of lines changes, as it is important for the proper functioning of some other SharpDevelop features, such as bookmarking. This plethora of functions served by the Line Manager obviously goes far beyond the goals of a single Strategy pattern, as each function of the Line Manager might be represented by a strategy.

The task of breaking the edit buffer into discrete lines and keeping track of them is accomplished by the use of collections. The Line Manager is a complex piece of software, as we will see when we look at the interfaces defined in the src\SharpDevelop\ICSharpCode.TextEditor\Document\LineManager\ILineManager.cs file:

[Code block]

We can see that there are interfaces for all the functionalities, like administration tasks, handling line-related events, and referring to the lines in the edit buffer. We can refer to the text either by offset or by line number. Furthermore, the terms logical line and visible line need to be explained. A logical line exists in the edit buffer but can be invisible. Visible lines are a subset of logical lines, as we need to distinguish between the total number of lines and the number of lines visible in a buffer for handling bookmarks and other features.

We will see in Chapter 9 that there is a Strategy pattern, HighlightingStrategy. As far as syntax highlighting goes, performing the actual highlighting is not the task of the Line Manager. It just informs the highlighting routines about which lines need to be updated.

Also, we find that there are functions for converting offset co-ordinates to lines or line segments. Offset co-ordinates are more natural for buffer-related operations, whereas the line co-ordinate system is better suited for preparing lines for display. The actual work of converting the edit buffer into lines is done in the src\SharpDevelop\ICSharpCode.TextEditor\Document\LineManager\LineSegment.cs file. This class has a number of functions that go beyond the conversion from buffer to line, such as syntax highlighting:

[Code block]

This is mostly self-explanatory. The use of three variables to handle line length calls for some attention: Length, DelimiterLength, and TotalLength. Most of the code that uses the functionality supplied by the Line Manager references Length, which is the length of the line excluding the delimiting characters. However, for the work done inside the Line Manager, the total length of the lines must be known, as delimiters are also counted as characters in the offset view of the buffer. You may think a delimiter is simply a single newline character, but this is not necessarily true.

In SharpDevelop, we are free to choose between Unix-style, DOS-style, and Macintosh-style line-end delimiters, which means that cr, lf, or both may be used, and each of them occupies one character in the buffer. Therefore, we need to take this variation of delimiter length into account, especially in light of the fact that the user can choose which delimiters to use, and that SharpDevelop should handle all of these delimiter types, even when the user opens files imported from operating systems using delimiters that are not DOS-style. This happens while developing web-based applications (ASP.NET web services) and is the first small step in preparing SharpDevelop to be ported to other .NET-compatible platforms.
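The three lengths and the varying delimiter size can be illustrated with a short Python sketch. It is not the code of LineSegment.cs: the class and function names are hypothetical, and only the fields discussed above are modelled.

```python
from dataclasses import dataclass


@dataclass
class LineSegment:
    offset: int              # offset of the first character of the line
    length: int              # length without the delimiter
    delimiter_length: int    # 0, 1 (cr or lf), or 2 (cr followed by lf)

    @property
    def total_length(self):
        return self.length + self.delimiter_length


def split_lines(text):
    segments = []
    start = 0
    i = 0
    while i < len(text):
        if text[i] == "\r":
            delimiter = 2 if text[i + 1:i + 2] == "\n" else 1
        elif text[i] == "\n":
            delimiter = 1
        else:
            i += 1
            continue
        segments.append(LineSegment(start, i - start, delimiter))
        i += delimiter
        start = i
    # Whatever follows the last delimiter forms the final (possibly empty) entry.
    segments.append(LineSegment(start, len(text) - start, 0))
    return segments


def line_number_for_offset(segments, offset):
    for number, segment in enumerate(segments):
        if offset < segment.offset + segment.total_length:
            return number
    return len(segments) - 1


text = "one\r\ntwo\nthree\rfour"      # DOS, Unix, and Macintosh delimiters mixed
segments = split_lines(text)
summary = [(s.offset, s.length, s.delimiter_length) for s in segments]
```

The total lengths add up to the size of the buffer, which is what makes the conversion between offsets and line numbers possible.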

Another interesting element is FoldLevel, as SharpDevelop allows nested folding.

The one bit of code that you will have noticed as being a bit unusual is:

[Code block]

This override of the ToString method is used for writing output to the console window for easy debugging, as SharpDevelop does not include a debugger yet. This is not exactly elegant, but it gets the job done. You will find such code in many places in SharpDevelop. Now you know what it is for.

For syntax highlighting, the lines must not only be provided but also be broken into segments corresponding to individual syntactical elements. We will discuss syntax highlighting in Chapter 9.

After the actual work of breaking the buffer into lines and line segments is done by LineSegment.cs, the LineSegmentCollection.cs file provides a collection of these line segments. It can be found in the src\SharpDevelop\ICSharpCode.TextEditor\Document\LineManager directory.

The LineSegmentCollection.cs file was generated automatically using SharpDevelop's Typed C# Collection Wizard, found under File | New | File | C# as one of the available file types. The resulting code looks like this excerpt:

[Code block]

We decided to use typed collections here and in the management of selections for two reasons. First, we can avoid casts, thus avoiding performance hits and making the code easier to read. Second, typed collections are checked at compile time, which avoids errors due to adding inappropriate data. This collection is used by the DefaultLineManager and a few other pieces of code (mostly in highlighting functions).

There is a minor quirk in this particular collection. We have an entry for every line ending in a line delimiter, including the last line of the edit buffer. The quirk is that a line end must be followed by a new line, which gives us an extra empty entry. If we forget to take this into consideration, we may have a problem in our new code. Therefore, it is strongly advised that this collection always be accessed through the routines provided by the DefaultLineManager. These routines know about the empty entry and take care of it.

The file containing the code for the DefaultLineManager is src\SharpDevelop\ICSharpCode.TextEditor\Document\LineManager\DefaultLineManager.cs. Here's a snippet from this file:

[Code block]

The correct number of lines used is obtained by testing for the existence of that mysterious empty last line, as seen in the two methods listed.
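The quirk is easy to reproduce in other environments. The following Python sketch is only an analogy, not the DefaultLineManager code; the function name line_count is hypothetical. Splitting a text that ends in a delimiter yields an extra empty entry, and the line count has to test for it.

```python
def line_count(entries):
    """Number of lines, ignoring the empty entry that follows a final delimiter."""
    if len(entries) > 1 and entries[-1] == "":
        return len(entries) - 1
    return len(entries)


with_final_delimiter = "first\nsecond\n".split("\n")    # ends with an empty entry
without_final_delimiter = "first\nsecond".split("\n")
```

Both texts contain two lines of content, although the first collection holds three entries.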

We can also see that exception handling is important for staying within the buffer limits. In this listing, we just see the exception-throwing code; catching is handled the 'SharpDevelop way', that is, by passing the exception up to the highest possible level before handling it. This gives a detailed trace of the dependencies in the code, which makes bug fixes easier.

Another important aspect of managing the lines in the buffer is handling events. Events are necessary for correctly handling line-related functionality outside the text management subsystem. For example, the bookmarks for a file must be adjusted automatically when the number of lines changes; otherwise they will point to the wrong place. This is handled by the code in src\SharpDevelop\ICSharpCode.TextEditor\Document\LineManager\LineManagerEventArgs.cs:

[Code block]

The events occur when we either insert a newline (lineStart property), or move one or more lines (linesMoved property). This code illustrates the standard mechanism for event handling in C#.
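How such an event keeps bookmarks in step with the buffer can be sketched in Python. The classes below are hypothetical stand-ins, not the SharpDevelop types; only the two values named above, the starting line and the number of lines moved, are passed to the handler.

```python
class LineManager:
    def __init__(self, line_count):
        self.line_count = line_count
        self._handlers = []

    def on_line_count_changed(self, handler):
        self._handlers.append(handler)

    def change_lines(self, line_start, lines_moved):
        """Record that lines were added (positive) or removed (negative) at line_start."""
        self.line_count += lines_moved
        for handler in self._handlers:
            handler(line_start, lines_moved)     # raise the event


class Bookmarks:
    def __init__(self, lines):
        self.lines = sorted(lines)

    def lines_moved(self, line_start, lines_moved):
        # Bookmarks at or below the change move with their lines.
        self.lines = [n + lines_moved if n >= line_start else n for n in self.lines]


manager = LineManager(line_count=20)
bookmarks = Bookmarks([2, 10])
manager.on_line_count_changed(bookmarks.lines_moved)
manager.change_lines(line_start=5, lines_moved=3)    # three lines inserted at line 5
```

The bookmark on line 2 stays where it is, while the one on line 10 follows its text to line 13. Bookmarks inside a deleted range would need extra handling that this sketch leaves out.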

Caret and Selection Management

We considered the administration of the buffer from the program's point of view. Now we will look at how the user's actions, such as moving the caret and selecting text, are handled.

The caret is displayed as the cursor on screen. Selections are blocks of text marked by the user for some action to be performed on them. These interactions with the edit buffer are handled in the same way as the internal housekeeping.

The caret is handled through the interfaces defined in the src\SharpDevelop\ICSharpCode.TextEditor\Document\Caret\ICaret.cs file. They are listed below:

[Code block]

The caret in SharpDevelop has two modes, insert and overwrite, and a Boolean value for visibility, which is responsible for the blinking of the caret. Events are raised when the caret moves to a new position. The structure of the code for handling caret issues is quite similar to the code we looked at when we discussed line management. (Remember the offsets? They also point to a co-ordinate in the buffer, just as a caret does.) We will not go into details here to avoid tiresome repetition.

The two files we need to handle the caret are src\SharpDevelop\ICSharpCode.TextEditor\Document\Caret\DefaultCaret.cs and src\SharpDevelop\ICSharpCode.TextEditor\Document\Caret\CaretEventArgs.cs. Note that we have also overridden the default ToString method for the same reason as in the Line Manager, which is debugging.

The handling of selections is done using the interface defined in the src\SharpDevelop\ICSharpCode.TextEditor\Document\Selection\ISelection.cs file. The following definitions are contained in this file:

[Code block]

The functionality provided is easy to understand. Only the Boolean property IsRectangularSelection, which is meant for future versions of SharpDevelop, may need some explanation. We plan to include a feature that allows selecting rectangular screen sections regardless of line lengths, as in terminal emulator programs. This will be useful for quite a few purposes, for example, copying two columns of a table to some other location. However, currently this property isn't used.

Selections are handled by a collection, using code generated by the Typed C# Collection Wizard. As the code is very similar to that for the LineSegmentCollection that we saw earlier, we shall not list it again.

Note that we have allowed for multiple selections on purpose, as future versions of SharpDevelop might support a multi-selection feature.

The code for the collection is contained in the src\SharpDevelop\ICSharpCode.TextEditor\Document\Selection\SelectionCollection.cs file.

The implementation of selection handling is to be found in the src\SharpDevelop\ICSharpCode.TextEditor\Document\Selection\DefaultSelection.cs file:

[Code block]

We can see that, in this listing, rectangular selections are already mentioned but commented out, in preparation for a future implementation of this feature. Also note that we have again overridden the ToString method for debugging purposes. Selections work on the line segment collection using the LineSegment interface that we discussed earlier in the Representing Lines section.

By now, we have a lot of functionality assembled for handling the document internally. Before we can assemble this, we need a way to hand over the buffer contents to the outside world, for use by display routines. As we will see, we need to abstract the representation of the data for this purpose. This will also facilitate an eventual transition to a piece table representation.

The Text Model

Before we explain the text model, we need to know a bit about co-ordinate systems. They refer either to positions of characters in a buffer or to the positions between characters in a buffer. Gap co-ordinates and user co-ordinates are based on the indices of the borders between characters and not on the indices of the characters themselves, since a line or selection begins and ends either before or after a character, never in it. Yet both are linear offset co-ordinates and are not fit for immediate display in the X-Y co-ordinate system of a screen. By linear offset, we refer to a co-ordinate system that uses only one co-ordinate. Simply put, we number the characters or the borders between characters, starting with zero at the start ('leftmost') position, and count up. In X-Y systems, we address a character by the row and the column at which it is displayed.

The text model, which we will now discuss, is a mapping of the linear offset co-ordinates onto the X-Y co-ordinates.
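The core of such a mapping can be sketched in Python. This is a simplified illustration and not the SharpDevelop text model: the function names are hypothetical, lines are assumed to end in a single lf character, and every character is assumed to be one column wide.

```python
def offset_to_position(text, offset):
    """Map a linear offset to (line, column), both counted from zero."""
    line = text.count("\n", 0, offset)            # delimiters before the offset
    line_start = text.rfind("\n", 0, offset) + 1  # 0 when there is no earlier delimiter
    return line, offset - line_start


def position_to_offset(text, line, column):
    """Map (line, column) back to a linear offset."""
    start = 0
    for _ in range(line):
        start = text.index("\n", start) + 1       # skip one complete line
    return start + column


text = "ab\ncde\nf"
position = offset_to_position(text, 5)            # the character 'e'
round_trip = position_to_offset(text, *position)
```

The line number serves as the Y co-ordinate and the column as the X co-ordinate.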

The interface is defined in the src\SharpDevelop\ICSharpCode.TextEditor\Document\TextModel\ITextModel.cs file:

[Code block]

Note that we deal only with X co-ordinates; since each line is one character high, we can use the line numbers as Y co-ordinates.

In this interface, we also encounter definitions for dealing with logical and view positions when handling character-based screen co-ordinates (X-Y co-ordinates). For buffer co-ordinates, we provide offset handling.

The implementation of the text model, however, is more interesting than we would expect from the interface. The implementation is in the src\SharpDevelop\ICSharpCode.TextEditor\Document\TextModel\DefaultTextModel.cs file:

[Code block]

To convert the document into a character-based co-ordinate system successfully, we need to know the widths of characters on screen. The most common example of a single character spanning several character widths on screen is the tabulator. 'Wide' characters of this type are also common in Asian languages.

The following code snippet, taken out of the above listing, shows us how this problem is solved:

[Code block]

Tabs are handled in a similar manner, but need special treatment as they may have different widths depending on the user's preference. Refer to the coding style guide (see Chapter 2) for our stance on tabs as an exclusive means of indenting. Tab width is a property we need to take into account:

[Code block]
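One way to compute a screen column from a logical column can be sketched in Python. This is an illustration under assumptions, not the SharpDevelop code: the function name visual_column is hypothetical, tabs are assumed to advance to the next tab stop, and wide characters are detected with the standard unicodedata module and counted as two columns.

```python
import unicodedata


def visual_column(line, logical_column, tab_width=4):
    """Screen column reached after the first `logical_column` characters of `line`."""
    column = 0
    for ch in line[:logical_column]:
        if ch == "\t":
            column += tab_width - column % tab_width      # jump to the next tab stop
        elif unicodedata.east_asian_width(ch) in ("W", "F"):
            column += 2                                   # wide character
        else:
            column += 1
    return column


after_leading_tab = visual_column("\tx", 1, tab_width=4)
after_inner_tab = visual_column("ab\tc", 3, tab_width=4)
after_wide_chars = visual_column("中文x", 2)
```

With a tab width of 4, a tab after two ordinary characters only advances two columns, which is why the tab width cannot simply be added to the column.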

【原文】Now, let's finally assemble all the pieces together.

【原文】Putting It All Together

【原文】Now we are finally done with all the pieces needed to put together a practical document management subsystem. They come together in the DocumentAggregator, which serves as a container for the functionality of the document management subsystem and provides this functionality to external code. It is made up along the usual lines with an interface definition file, a default implementation file, event code, and a factory.

【原文】The interface is defined in src"SharpDevelop"ICSharpCode.TextEditor"Document"IDocumentAggregator.cs.

【原文】For easier understanding, we will split the file up:

为更较容易理解,我们将会分离上面的文件。

End of P212

Begin of P213

Code 代码块

【原文】The other properties defined in this file should look familiar by now. Skipping over a few more property definitions, we come to:

Code 代码块

End of P213

Begin of P214

Code 代码块

【原文】Most of this code is simple, as we just interface with all the routines providing information about our document. The aggregator itself is created by the DocumentAggregatorFactory, an implementation of the factory pattern discussed in Chapter 2, which is found in the src\SharpDevelop\ICSharpCode.TextEditor\Document\DocumentAggregatorFactory.cs file. In this file, we find an interesting bit of code, which provides for building a standalone editor outside SharpDevelop:

Code 代码块

End of P214

Begin of P215

Code 代码块

【原文】The interesting part is contained in the #if ... #else ... #endif block. The properties have to be assigned values. Inside SharpDevelop this is no problem: internal defaults are provided, and whenever the SharpDevelop options return no value for a property, the default comes into play. The standalone version has no such infrastructure, because it is part of SharpDevelop, which is why we have to use the #if construct.

【原文】If you want to use the StringTextBufferStrategy for experimenting with the performance of this buffer model relative to the gap buffer model, you need only uncomment the line above GapTextBufferStrategy and comment out the GapTextBufferStrategy line instead.
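The two ideas above, conditional compilation for the standalone build and swapping the buffer strategy by commenting lines in and out, can be sketched roughly as follows. This is not the code from DocumentAggregatorFactory.cs. The compilation symbol STANDALONE_SKETCH, the property keys, and all type names are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a property store with built-in default handling.
public sealed class EditorPropertiesSketch
{
    private readonly Dictionary<string, object> values = new Dictionary<string, object>();

    // Returns the stored value, or the supplied default when none is set.
    public T Get<T>(string key, T defaultValue)
    {
        object value;
        return values.TryGetValue(key, out value) ? (T)value : defaultValue;
    }

    public void Set(string key, object value)
    {
        values[key] = value;
    }
}

public enum BufferModelSketch { String, Gap }

public sealed class DocumentSketch
{
    public EditorPropertiesSketch Properties { get; set; }
    public BufferModelSketch BufferModel { get; set; }
}

public static class DocumentFactorySketch
{
    public static DocumentSketch CreateDocument()
    {
        var doc = new DocumentSketch();
#if STANDALONE_SKETCH
        // Standalone build: there is no host options service, so every
        // property the editor relies on is assigned explicitly.
        var props = new EditorPropertiesSketch();
        props.Set("TabIndent", 4);
        props.Set("ShowLineNumbers", true);
        doc.Properties = props;
#else
        // Hosted build: values would come from the host's options. This sketch
        // uses an empty store, so every lookup falls back to its default.
        doc.Properties = new EditorPropertiesSketch();
#endif
        // Swap the two lines below to experiment with the other buffer model.
        // doc.BufferModel = BufferModelSketch.String;
        doc.BufferModel = BufferModelSketch.Gap;
        return doc;
    }
}
```

Either way the sketch is compiled, a call such as `doc.Properties.Get("TabIndent", 4)` yields a usable value. In the hosted branch it comes from the default argument, in the standalone branch from the explicit assignment.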

【原文】Creating the DocumentAggregator for a file is easy:

为一个文件创造 DocumentAggregator 很容易:

Code 代码块

End of P215

Begin of P216

Code 代码块

【原文】That is all there is to it. Now the document is in an internal representation, fit for editing. If you are interested in reading further about the issues involved in writing editors, we recommend that you visit http://www.finseth.com/~fin/craft/. This is the online version of the book The Craft of Text Editing, which is currently out of print.

End of P216


【原文】Summary

【翻译】小结

【原文】In this chapter, we looked at all the components necessary for the internal representation and efficient handling of a document.

【翻译】在这一章,我们研究了表述文档内部结构和高效管理文档所需的所有元件。

【原文】Presently, SharpDevelop uses the gap-buffer approach, like most other modern editors, because it is easy to implement. A piece table might be an option for future versions of SharpDevelop.

【翻译】目前,SharpDevelop像现代大部分编辑器使用间隙缓冲器。当表格称为SharpDevelop可选的一部分时,同样,它也较易实现。

【原文】We looked at how to break up a text buffer into lines and segments for purposes such as syntax highlighting and line folding, and examined the distinction between logical and visible lines. We also examined the pitfalls inherent in the possible variation of delimiter length and found that typed collections are a good way to handle line segments. Next, we looked at the handling of the caret and selections, which we found similar to handling line segments.

【翻译】我们研究了如何把一个文本缓冲器拆分成目标行或数据段(例如语法高亮、可折叠行),以及检验逻辑行与可视行的区别。我们也检查了定界符长度可能的变化引起的内部陷阱,并且发现分类收集是处理行数据的一个好方法。然后,我们研究了插入标记(caret)和选择的管理,发现对行段(line segments)数据进行管理比较容易。

【原文】Then we concerned ourselves with the transition from the internal model to a model suitable for handling an external system responsible for rendering the buffer contents on screen. We found that we have to adjust co-ordinates and have to take into account the varying width of certain characters when displayed on screen.

【翻译】然后,我们开始关心怎样把内部模式转换成一个适合外部系统应用的模式,该外部系统负责把缓冲区中的内容渲染到屏幕上。我们发现必须调整坐标,并且不得不考虑一些字符在屏幕上显示时的宽度是不同的。

【原文】Lastly, we looked at how we can put these components together so that the actual editor can easily access the data buffer. For this purpose, we introduced the DocumentAggregator, which acts as a container for our components and can be used for building editors outside the SharpDevelop environment.

【翻译】最后,我们研究怎样把这些组件组织在一起,使真实的编辑器能容易地存取数据缓冲器中的数据。为达到这一目的,我们介绍了文档聚合器(DocumentAggregator),它充当组件容器,可以被用于SharpDevelop环境之外建立编辑器。

【原文】Now that we know how to manage text in an editor, we can examine syntax highlighting.

【翻译】既然我们知道了在一个编辑器中如何处理文本,我们就可以检测高亮语法。

…… …… End of Chapter 8 …… ……


posted @ 2008-05-21 12:33  逐步高