烂翻译系列之Rx.NET介绍第二版——组合序列

Data sources are everywhere, and sometimes we need to consume data from more than just a single source. Common examples that have many inputs include: price feeds, sensor networks, news feeds, social media aggregators, file watchers, multi touch surfaces, heart-beating/polling servers, etc. The way we deal with these multiple stimuli is varied too. We may want to consume it all as a deluge of integrated data, or one sequence at a time as sequential data. We could also get it in an orderly fashion, pairing data values from two sources to be processed together, or perhaps just consume the data from the first source that responds to the request.

数据源无处不在,有时我们需要从多个来源获取数据。具有多输入源的典型场景包括:价格信息流、传感器网络、新闻推送、社交媒体聚合器、文件监控器、多点触控表面、心跳/轮询服务器等。处理这些多重输入源的方式同样灵活多样——既可将所有输入视为集成化的数据洪流统一处理,也能按序列逐个解析数据;既可有序配对两个来源的数据值进行联合处理,亦可仅采用最先响应请求的数据源信息。

Earlier chapters have also shown some examples of the fan out and back in style of data processing, where we partition data, and perform processing on each partition to convert high-volume data into lower-volume higher-value events before recombining. This ability to restructure streams greatly enhances the benefits of operator composition. If Rx only enabled us to apply composition as a simple linear processing chain, it would be a good deal less powerful. Being able to pull streams apart gives us much more flexibility. So even when there is a single source of events, we often still need to combine multiple observable streams as part of our processing. Sequence composition enables you to create complex queries across multiple data sources. This unlocks the possibility to write some very powerful yet succinct code.

前几章已展示过“发散与回拢”风格的数据处理示例:先将数据分区,对各分区进行处理以将海量数据转化为更精简但价值更高的事件,最后重新整合。这种对流进行重组的能力,极大提升了运算符组合的优势。若Rx仅支持简单的线性处理链组合,其功能将大幅受限。通过解构数据流,我们获得了更强大的灵活性。因此即便面对单一事件源,处理过程中也常需组合多个可观测流。“序列组合”使开发者能在多个数据源间构建复杂查询,为实现高效而简洁的代码开辟了可能性。

We've already used SelectMany in earlier chapters. This is one of the fundamental operators in Rx. As we saw in the Transformation chapter, it's possible to build several other operators from SelectMany, and its ability to combine streams is part of what makes it powerful. But there are several more specialized combination operators available, which make it easier to solve certain problems than it would be using SelectMany. Also, some operators we've seen before (including TakeUntil and Buffer) have overloads we've not yet explored that can combine multiple sequences.

在前几章中,我们已经使用过 SelectMany 运算符。作为Rx的基础运算符之一,正如“转换”章节所示,许多其他运算符都可通过 SelectMany构建而成,其整合数据流的能力正是其强大之处的一部分。但Rx还提供了其他一些“专门的组合运算符”,能更便捷地解决特定场景下的问题。此外,先前章节涉及的 TakeUntil 、 Buffer等运算符也存在尚未探讨的重载版本,具备跨序列组合能力。

Sequential Combination    顺序组合

We'll start with the simplest kind of combining operators, which do not attempt concurrent combination. They deal with one source sequence at a time.

我们将从最为基础的一类组合运算符入手,这类运算符不涉及并发组合,而是每次仅处理单一源序列。

Concat   连接运算

Concat is arguably the simplest way to combine sequences. It does the same thing as its namesake in other LINQ providers: it concatenates two sequences. The resulting sequence produces all of the elements from the first sequence, followed by all of the elements from the second sequence. The simplest signature for Concat is as follows.

Concat可以说是组合序列的最简方式,其行为与其他LINQ提供程序中的同名运算符一致:将两个序列首尾相连。生成的序列会先完整产出第一个序列的所有元素,随后再生成第二个序列的全部元素。Concat最简单的签名如下:

public static IObservable<TSource> Concat<TSource>(
    this IObservable<TSource> first, 
    IObservable<TSource> second)

Since Concat is an extension method, we can invoke it as a method on any sequence, passing the second sequence in as the only argument:

由于Concat是一个扩展方法,我们可以在任意序列对象上直接以方法形式调用它,并将第二个序列作为唯一参数传入:

IObservable<int> s1 = Observable.Range(0, 3);
IObservable<int> s2 = Observable.Range(5, 5);
IObservable<int> c = s1.Concat(s2);
IDisposable sub = c.Subscribe(Console.WriteLine, x => Console.WriteLine("Error: " + x));

This marble diagram shows the items emerging from the two sources, s1 and s2, and how Concat combines them into the result, c:

下方的弹珠图演示了源序列 s1 与 s2 各自产生的数据项,以及 Concat 运算符如何将它们合并为结果序列 c:

Rx's Concat does nothing with its sources until something subscribes to the IObservable<T> it returns. So in this case, when we call Subscribe on c (the source returned by Concat) it will subscribe to its first input, s1, and each time that produces a value, the c observable will emit that same value to its subscriber. If we went on to call sub.Dispose() before s1 completes, Concat would unsubscribe from the first source, and would never subscribe to s2. If s1 were to report an error, c would report that same error to its subscriber, and again, it would never subscribe to s2. Only if s1 completes will the Concat operator subscribe to s2, at which point it will forward any items that second input produces until either the second source completes or fails, or the application unsubscribes from the concatenated observable.

Rx的 Concat 运算符在没有任何订阅者订阅其返回的 IObservable<T> 之前,不会对其源序列进行任何动作。在这个例子中,当我们调用由 Concat 返回的 c (即合并后的可观察序列)的 Subscribe 方法时,它将首先订阅第一个输入源 s1。每当 s1产生一个值时,合并后的 c 可观察序列会将该值传递给它的订阅者。如果我们选择在 s1尚未完成前调用 sub.Dispose() 取消订阅, Concat 会从第一个源 s1取消订阅,且永远不会订阅第二个源 s2。若 s1抛出错误, c 会将同样的错误传递给它的订阅者,这种情况下同样不会订阅 s2。只有当 s1正常完成(Complete)后, Concat 运算符才会订阅 s2,此时它将转发第二个输入源产生的所有数据项,直到第二个源完成或发生错误,或者应用程序主动取消对这个合并后可观察序列的订阅。
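The subscription rules just described can be checked with a small sketch. It assumes the System.Reactive package is referenced; the log list and the Defer wrapper are illustrative additions (not part of the original example) used only to record the moment Concat subscribes to its second input.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;

var log = new List<string>();

// Defer lets us record the moment (if any) at which Concat subscribes to its second input.
IObservable<int> second = Observable.Defer<int>(() =>
{
    log.Add("second subscribed");
    return Observable.Range(10, 2);
});

// Case 1: the first input completes normally, so Concat moves on to the second.
Observable.Range(1, 2).Concat(second)
    .Subscribe(x => log.Add($"value {x}"), ex => log.Add($"error {ex.Message}"));

// Case 2: the first input fails, so Concat forwards the error and never subscribes to the second.
Observable.Throw<int>(new InvalidOperationException("boom")).Concat(second)
    .Subscribe(x => log.Add($"value {x}"), ex => log.Add($"error {ex.Message}"));

Console.WriteLine(string.Join(", ", log));
// prints "value 1, value 2, second subscribed, value 10, value 11, error boom"
```

Note that "second subscribed" appears only once: the failing case never reaches the deferred source.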

Although Rx's Concat has the same logical behaviour as the LINQ to Objects Concat, there are some Rx-specific details to be aware of. In particular, timing is often more significant in Rx than with other LINQ implementations. For example, in Rx we distinguish between hot and cold sources. With a cold source it typically doesn't matter exactly when you subscribe, but hot sources are essentially live, so you only get notified of things that happen while you are subscribed. This can mean that hot sources might not be a good fit with Concat. The following marble diagram illustrates a scenario in which this produces results that have the potential to surprise:

尽管Rx的Concat运算符与LINQ to Objects中的Concat在逻辑行为上一致,但Rx存在一些特有的细节需要注意。尤其是,在Rx中,时序往往比其他LINQ实现更为关键。例如,Rx中会区分冷源(cold source)热源(hot source)。对于冷源而言,订阅的具体时机通常无关紧要,但热源本质上是实时推送的,因此你只会收到订阅期间发生的事件通知。这意味着,热源可能并不适合与Concat结合使用。下面的弹珠图展示了一个可能令人意外的场景:

Since Concat doesn't subscribe to its second input until the first has finished, it won't see the first couple of items that the hot source would deliver to any subscribers that had been listening from the start. This might not be the behaviour you would expect: it certainly doesn't look like this concatenated all of the items from the first sequence with all of the items from the second one. It looks like it missed out A and B from hot.

由于Concat运算符在第一个输入源完成前不会订阅第二个输入源,因此它会错过热源(hot source)在初始阶段向任何"从头开始订阅"的观察者推送的前几个数据项。这种行为可能与你的预期不符:表面上,这似乎并没有将第一个序列的所有数据项与第二个序列的所有数据项完整拼接,反而看起来像是漏掉了热源( hot)中的 A 和 B 。
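The missed items can be reproduced deterministically with two subjects. This is only a sketch of the scenario, assuming System.Reactive is available; the names first, hot and received are illustrative, and Subject<T> stands in for a real hot source.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

var first = new Subject<string>();
var hot = new Subject<string>();   // a Subject is hot: it emits whether or not anyone is listening
var received = new List<string>();

first.Concat(hot).Subscribe(received.Add);

hot.OnNext("A");       // missed: Concat has not subscribed to hot yet
first.OnNext("1");     // delivered
hot.OnNext("B");       // missed for the same reason
first.OnCompleted();   // only now does Concat subscribe to hot
hot.OnNext("C");       // delivered
hot.OnCompleted();

Console.WriteLine(string.Join(", ", received)); // prints "1, C"
```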

Marble Diagram Limitations    弹珠图限制

This last example reveals that marble diagrams gloss over a detail: they show when a source starts, when it produces values, and when it finishes, but they ignore the fact that to be able to produce items at all, an observable source needs a subscriber. If nothing subscribes to an IObservable<T>, then it doesn't really produce anything. Concat doesn't subscribe to its second input until the first completes, so arguably instead of the diagram above, it would be more accurate to show this:

最后一个示例揭示了一个常被弹珠图忽略的细节:弹珠图通常只展示源何时启动、何时产生值、何时完成,却忽略了可观察源需要订阅者才能实际生成数据项这一事实。如果没有任何订阅者对IObservable<T>进行订阅,那么它实际上不会产生任何数据。Concat运算符在第一个源完成前不会订阅第二个输入源,因此严格来说,相较于上面弹珠图,下面的弹珠图更能准确反映其行为:

This makes it easier to see why Concat produces the output it does. But since hot is a hot source here, this diagram fails to convey the fact that hot is producing items entirely on its own schedule. In a scenario where hot had multiple subscribers, then the earlier diagram would arguably be better because it correctly reflects every event available from hot (regardless of however many listeners might be subscribed at any particular moment). But although this convention works for hot sources, it doesn't work for cold ones, which typically start producing items upon subscription. A source returned by Timer produces items on a regular schedule, but that schedule starts at the instant when subscription occurs. That means that if there are multiple subscriptions, there are multiple schedules. Even if I have just a single IObservable<long> returned by Observable.Timer, each distinct subscriber will get items on its own schedule—subscribers receive events at a regular interval starting from whenever they happened to subscribe. So for cold observables, it typically makes sense to use the convention used by this second diagram, in which we're looking at the events received by one particular subscription to a source.

这就更容易理解为什么 Concat 会产生这样的输出结果。但在这个案例中,由于 hot 是一个热源,当前图示未能准确传达一个关键事实: hot 完全按照自己的时间线独立生成数据项。如果存在 hot 拥有多个订阅者的场景,那么前一种图示可能更合适,因为它能正确反映 hot 发出的所有事件(无论在任何特定时刻有多少订阅者在监听)。然而,这种图示惯例虽然适用于热源,却不适用于冷源——因为冷源通常只在订阅时才开始生成数据。以 Timer 返回的冷源为例:它虽然会按照固定时间间隔生成数据项,但这个时间线的起点是订阅发生的时刻。这意味着如果有多个订阅存在,就会产生多个独立的时间线。即使我们只有一个由 Observable.Timer返回的 IObservable<long> ,每个不同的订阅者都会获得基于自身订阅时刻开始计算的时间线数据——订阅者接收事件的固定时间间隔,起点是他们各自订阅的瞬间。因此,对于冷观察对象而言,采用第二种图示惯例(即关注某个特定订阅接收的事件序列)通常更符合实际逻辑。

Most of the time we can get away with ignoring this subtlety, quietly using whichever convention suits us. To paraphrase Humpty Dumpty: when I use a marble diagram, it means just what I choose it to mean—neither more nor less. But when you're combining hot and cold sources together, there might not be one obviously best way to represent this in a marble diagram. We could even do something like this, where we describe the events that hot represents separately from the events seen by a particular subscription to hot.

大多数情况下,我们可以暂时忽略这一细微差别,灵活采用最适合当前场景的图示惯例。借用矮胖子的话来说:“当我使用弹珠图时,它的含义完全由我来定义——不多也不少”。但当你需要将热源(hot observables)与冷源(cold observables)结合在一起时,弹珠图中可能就没有一种绝对最优的表现方式了。我们甚至可以尝试类似这样的做法:“将热源本身的事件与某个特定订阅所观察到的事件分开描述”。

We're using a distinct 'lane' in the marble diagram to represent the events seen by a particular subscription to a source. With this technique, we can also show what would happen if you pass the same cold source into Concat twice:

在弹珠图中,我们使用独立的“泳道”来表示某个特定订阅所观察到的事件序列。借助这种技术,我们还能展示这样一种场景:“若将同一个冷源(下图红色的源也是冷源)两次传入  Concat  运算符时会发生什么”:

This highlights the fact that, being a cold source, cold provides items separately to each subscription. We see the same three values emerging from the same source, but at different times.

这段描述突显了冷源的核心特性:作为冷源, cold 会为每个订阅单独提供项。我们看到相同的三个值从同一个源发出,但处于不同的时间点,它们在不同的订阅时间点被生成和传递。
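To make the two separate subscriptions visible in code, the following sketch wraps a cold Range in Defer and counts how often it is subscribed to. The counter and the variable names are illustrative assumptions, not part of the original text.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;

int subscriptions = 0;

// Defer runs its factory once per subscription, so the counter tracks subscriptions.
IObservable<int> cold = Observable.Defer<int>(() =>
{
    subscriptions++;
    return Observable.Range(1, 3);
});

var items = new List<int>();
cold.Concat(cold).Subscribe(items.Add);

string text = string.Join(", ", items);
Console.WriteLine(text);           // prints "1, 2, 3, 1, 2, 3"
Console.WriteLine(subscriptions);  // prints "2"
```

Concat subscribes to the same source object twice, once for each position in the concatenation, and each subscription gets its own run of values.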

Concatenating Multiple Sources    连接多个源

What if you wanted to concatenate more than two sequences? Concat has an overload accepting multiple observable sequences as an array. This is annotated with the params keyword, so you don't need to construct the array explicitly. You can just pass any number of arguments, and the C# compiler will generate the code to create the array for you. There's also an overload taking an IEnumerable<IObservable<T>>, in case the observables you want to concatenate are already in some collection.

若需拼接两个以上的序列该怎么办?Concat 提供了一个重载方法,可直接接收一个由多个可观察序列组成的数组作为参数。由于该参数使用了 params 关键字修饰,你无需显式构造数组——“只需传递任意数量的参数,C# 编译器会自动生成创建数组的代码”。此外,还有一个接受 IEnumerable<IObservable<T>>的重载版本,适用于待拼接的可观察序列已存在于某个集合中的情况。

public static IObservable<TSource> Concat<TSource>(
    params IObservable<TSource>[] sources)

public static IObservable<TSource> Concat<TSource>(
    this IEnumerable<IObservable<TSource>> sources)
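As a rough illustration of calling these two overloads (assuming System.Reactive is referenced; the sources a, b and c are made up for the example):

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;

IObservable<int> a = Observable.Range(1, 2);
IObservable<int> b = Observable.Range(10, 2);
IObservable<int> c = Observable.Range(20, 2);

// params overload: the compiler builds the array for us.
var viaParams = new List<int>();
Observable.Concat(a, b, c).Subscribe(viaParams.Add);

// IEnumerable<IObservable<T>> overload: the sources already live in a collection.
var sources = new List<IObservable<int>> { a, b, c };
var viaEnumerable = new List<int>();
sources.Concat().Subscribe(viaEnumerable.Add);

string paramsText = string.Join(", ", viaParams);
string enumerableText = string.Join(", ", viaEnumerable);
Console.WriteLine(paramsText);      // prints "1, 2, 10, 11, 20, 21"
Console.WriteLine(enumerableText);  // prints "1, 2, 10, 11, 20, 21"
```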

The IEnumerable<IObservable<T>> overload evaluates sources lazily. It won't begin asking the enumerable for source observables until someone subscribes to the observable that Concat returns, and it only calls MoveNext again on the resulting IEnumerator<IObservable<T>> when the current source completes, meaning it's ready to start on the next. To illustrate this, the following example is an iterator method that returns a sequence of sequences and is sprinkled with logging. It returns three observable sequences each with a single value [1], [2] and [3]. Each sequence returns its value on a timer delay.

IEnumerable<IObservable<T>> 的重载方法采用延迟求值方式处理数据源。它不会主动请求源可观察对象,直到有订阅者订阅了 Concat 方法返回的可观察对象。并且它只会在当前数据源完成(意味着可以开始处理下一个数据源)时,才会再次调用所得 IEnumerator<IObservable<T>> 的 MoveNext 方法。为了说明这一点,以下示例展示了一个返回嵌套序列的迭代器方法(添加了日志记录)。该方法返回三个可观察序列,每个序列分别包含单个值 [1]、[2] 和 [3]。每个序列都会通过定时器延迟返回其值。

public IEnumerable<IObservable<long>> GetSequences()
{
    Console.WriteLine("GetSequences() called");
    Console.WriteLine("Yield 1st sequence");

    yield return Observable.Create<long>(o =>
    {
        Console.WriteLine("1st subscribed to");
        return Observable.Timer(TimeSpan.FromMilliseconds(500))
            .Select(i => 1L)
            .Finally(() => Console.WriteLine("1st finished"))
            .Subscribe(o);
    });

    Console.WriteLine("Yield 2nd sequence");

    yield return Observable.Create<long>(o =>
    {
        Console.WriteLine("2nd subscribed to");
        return Observable.Timer(TimeSpan.FromMilliseconds(300))
            .Select(i => 2L)
            .Finally(() => Console.WriteLine("2nd finished"))
            .Subscribe(o);
    });

    Thread.Sleep(1000); // Force a delay

    Console.WriteLine("Yield 3rd sequence");

    yield return Observable.Create<long>(o =>
    {
        Console.WriteLine("3rd subscribed to");
        return Observable.Timer(TimeSpan.FromMilliseconds(100))
            .Select(i=>3L)
            .Finally(() => Console.WriteLine("3rd finished"))
            .Subscribe(o);
    });

    Console.WriteLine("GetSequences() complete");
}

We can call this GetSequences method and pass the results to Concat, and then use our Dump extension method to watch what happens:

我们可以调用这个 GetSequences 方法,并将结果传递给 Concat,然后使用我们的 Dump 扩展方法来观察发生了什么:

GetSequences().Concat().Dump("Concat");

Here's the output:

以下是输出:

GetSequences() called
Yield 1st sequence
1st subscribed to
Concat-->1
1st finished
Yield 2nd sequence
2nd subscribed to
Concat-->2
2nd finished
Yield 3rd sequence
3rd subscribed to
Concat-->3
3rd finished
GetSequences() complete
Concat completed

Below is a marble diagram of the Concat operator applied to the GetSequences method. 's1', 's2' and 's3' represent sequences 1, 2 and 3 respectively, and 'rs' represents the result sequence.

下面是应用于 GetSequences 方法的 Concat 运算符的弹珠图。's1'、's2' 和 's3' 分别表示序列 1、2 和 3,'rs' 表示结果序列。

You should note that once the iterator has executed its first yield return to return the first sequence, the iterator does not continue until the first sequence has completed. The iterator calls Console.WriteLine to display the text Yield 2nd sequence immediately after that first yield return, but you can see that message doesn't appear in the output until after we see the Concat-->1 message showing the first output from Concat, and also the 1st finished message, produced by the Finally operator, which runs only after that first sequence has completed. (The code also makes that first source delay for 500ms before producing its value, so that if you run this, you can see that everything stops for a bit until that first source produces its single value then completes.) Once the first source completes, the GetSequences method continues (because Concat will ask it for the next item once the first observable source completes). When GetSequences provides the second sequence with another yield return, Concat subscribes to that, and again GetSequences makes no further progress until that second observable sequence completes. When asked for the third sequence, the iterator itself waits for a second before producing that third and final value, which you can see from the gap between the end of s2 and the start of s3 in the diagram.

需注意:迭代器首次执行 yield return 返回第一个序列后,将暂停执行直到该序列完成。虽然在第一个 yield return 之后立即调用了 Console.WriteLine 显示" Yield 2nd sequence "信息,但实际输出中该信息直到出现 Concat-->1 (显示 Concat首个输出)和 1st finished (由 Finally 运算符产生)后才显示。这是因为 Finally 运算符仅在首个序列完成后运行(代码中还让首个源在产生值前延迟500毫秒,因此运行时可以观察到所有操作都会暂停,直到该源生成单个值并完成)。当首个源完成后, GetSequences 方法才会继续执行(因为 Concat会在首个可观察源完成后才请求下一项)。当 GetSequences 通过第二个 yield return 提供第二个序列时, Concat订阅该序列,此时迭代器再次暂停直到第二个可观察序列完成。当请求第三个序列时,迭代器本身会等待1秒才生成第三个最终值,这从图中 s2 结束与 s3 开始之间的间隔可以看出。

Prepend

There's one particular scenario that Concat supports, but in a slightly cumbersome way. It can sometimes be useful to make a sequence that always emits some initial value immediately. Take the example I've been using a lot in this book, where ships transmit AIS messages to report their location and other information: in some applications you might not want to wait until the ship next happens to transmit a message. You could imagine an application that records the last known location of any vessel. This would make it possible for the application to offer, say, an IObservable<IVesselNavigation> which instantly reports the last known information upon subscription, and which then goes on to supply any newer messages if the vessel produces any.

虽然 Concat 支持某种特殊场景,但其支持方式略显笨拙。有时我们需要创建能立即发出初始值的序列。以本书多次使用的船舶AIS(自动识别系统)消息传输为例:某些应用场景中,开发者可能不希望被动等待船舶下一次发送消息。设想一个记录所有船只最后已知位置的应用,该应用可以提供 IObservable<IVesselNavigation> 序列——该序列在订阅时立即上报最后已知信息,若船舶后续有新消息产生则继续推送更新。

How would we implement this? We want initially cold-source-like behaviour, but transitioning into hot. So we could just concatenate two sources. We could use Observable.Return to create a single-element cold source, and then concatenate that with the live stream:

我们应该如何实现这个需求?我们需要创建初始具有冷源特性、但随后能过渡到热源行为的可观察序列。解决方案是连接两个数据源——首先使用 Observable.Return 创建包含单个元素的冷源,然后将其与实时流进行连接:

IVesselNavigation lastKnown = ais.GetLastReportedNavigationForVessel(mmsi);
IObservable<IVesselNavigation> live = ais.GetNavigationMessagesForVessel(mmsi);

IObservable<IVesselNavigation> lastKnownThenLive = Observable.Concat(
    Observable.Return(lastKnown), live);

This is a common enough requirement that Rx supplies Prepend, which has a similar effect. We can replace the final line with:

由于这是Rx中常见的需求,框架直接提供了具有类似效果的 Prepend 运算符。我们可以将最后一行代码替换为:

IObservable<IVesselNavigation> lastKnownThenLive = live.Prepend(lastKnown);

This observable will do exactly the same thing: subscribers will immediately receive the lastKnown, and then if the vessel should emit further navigation messages, they will receive those too. By the way, for this scenario you'd probably also want to ensure that the look up of the "last known" message happens as late as possible. We can delay this until the point of subscription by using Defer:

该可观察序列将实现完全相同的功能——订阅者会立即收到 lastKnown值,若船舶后续发出新的导航消息,也会继续接收更新。需要特别说明的是,在此场景中通常还需要确保"最后已知"消息的查询尽可能延迟。我们可以通过Defer运算符将查询操作推迟到订阅发生时:

public static IObservable<IVesselNavigation> GetLastKnownAndSubsequenceNavigationForVessel(uint mmsi)
{
    return Observable.Defer<IVesselNavigation>(() =>
    {
        // This lambda will run each time someone subscribes.
        IVesselNavigation lastKnown = ais.GetLastReportedNavigationForVessel(mmsi);
        IObservable<IVesselNavigation> live = ais.GetNavigationMessagesForVessel(mmsi);

        return live.Prepend(lastKnown);
    });
}

Prepend might remind you of BehaviorSubject<T>, because that also ensures that consumers receive a value as soon as they subscribe. It's not quite the same: BehaviorSubject<T> caches the last value its own source emits. You might think that would make it a better way to implement this vessel navigation example. However, since this example is able to return a source for any vessel (the mmsi argument is a Maritime Mobile Service Identity uniquely identifying a vessel) it would need to keep a BehaviorSubject<T> running for every single vessel you were interested in, which might be impractical.

Prepend 运算符可能让您联想到 BehaviorSubject<T>,因为两者都能确保消费者订阅时立即获得值。但存在本质区别: BehaviorSubject<T> 会缓存其自身源发出的最新值。您可能认为这更适合实现船舶导航示例,然而由于该示例需要为任意一个船舶(mmsi 参数是一个海上移动服务身份,用于唯一标识一艘船舶)返回相应的数据源,若采用 BehaviorSubject<T> 则需为每个关注船舶维护运行中的实例,这在实践中可能不可行。

BehaviorSubject<T> can hold onto only one value, which is fine for this AIS scenario, and Prepend shares this limitation. But what if you need a source to begin with some particular sequence?

BehaviorSubject<T> 仅能保存单个值,这对于AIS场景来说是可接受的,而 Prepend 运算符也存在相同限制。但如果需要数据源以一个特定序列开始该怎么办?

StartWith

StartWith is a generalization of Prepend that enables us to provide any number of values to emit immediately upon subscription. As with Prepend, it will then go on to forward any further notifications that emerge from the source.

StartWith 是 Prepend 的泛化版本,它允许我们在订阅时立即发出任意数量的初始值。与 Prepend 一样,它将继续转发源序列的所有后续通知。

As you can see from its signature, this method takes a params array of values so you can pass in as many or as few values as you need:

如方法签名所示,该方法接受 params 修饰的值数组参数,您可按需传递任意数量参数:

// prefixes a sequence of values to an observable sequence.
public static IObservable<TSource> StartWith<TSource>(
    this IObservable<TSource> source, 
    params TSource[] values)

There's also an overload that accepts an IEnumerable<T>. Note that Rx will not defer its enumeration of this. StartWith immediately converts the IEnumerable<T> into an array before returning.

此外还存在接受 IEnumerable<T>的重载版本。需特别注意:Rx框架不会延迟枚举操作,StartWith 会在返回前立即将 IEnumerable<T>转换为数组。
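A minimal sketch of the params overload in use, assuming System.Reactive is referenced (the source and the prefixed values are arbitrary):

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;

IObservable<int> source = Observable.Range(10, 2);   // produces 10, 11

var items = new List<int>();

// The three prefixed values are emitted first, then everything from the source.
source.StartWith(1, 2, 3).Subscribe(items.Add);

string text = string.Join(", ", items);
Console.WriteLine(text); // prints "1, 2, 3, 10, 11"
```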

StartWith is not a common LINQ operator, and its existence is peculiar to Rx. If you imagine what StartWith would look like in LINQ to Objects, it would not be meaningfully different from Concat. There's a difference in Rx because StartWith effectively bridges between pull and push worlds. It effectively converts the items we supply into an observable, and it then concatenates the source argument onto that.

StartWith 并非LINQ的常规运算符,其存在是Rx框架特有的设计。若设想LINQ to Objects中的 StartWith 实现,其功能与 Concat运算符并无实质差异。但在Rx中,StartWith 的特殊性在于它有效地在拉取和推送世界之间建立了桥梁——实际上将我们提供的项转换为可观察序列,然后将源参数连接到该序列之后。

Append

The existence of Prepend might lead you to wonder whether there is an Append for adding a single item onto the end of any IObservable<T>. After all, this is a common LINQ operator; LINQ to Objects has an Append implementation, for example. And Rx does indeed supply such a thing:

 Prepend 运算符的存在自然会引发疑问:是否也存在对应的 Append 运算符,用于在IObservable<T>序列末尾添加单个项?毕竟这是LINQ的常规操作——例如LINQ to Objects就内置了 Append 实现。事实上,Rx框架确实提供了这样的运算符:

IObservable<string> oneMore = arguments.Append("And another thing...");

There is no corresponding EndWith. There's no fundamental reason that there couldn't be such a thing; it's just that apparently there's not much demand: the Rx repository has not yet had a feature request. So although the symmetry of Prepend and Append does suggest that there could be a similar symmetry between StartWith and an as-yet-hypothetical EndWith, the absence of this counterpart doesn't seem to have caused any problems. There's an obvious value to being able to create observable sources that always immediately produce a useful output; it's not clear what EndWith would be useful for, besides satisfying a craving for symmetry.

目前Rx框架中不存在对应的 EndWith运算符。这并非出于技术限制(理论上完全可以实现),而是由于实际需求不足——Rx代码库至今未收到相关功能请求。尽管 Prepend 与 Append 的对称性暗示 StartWith 理应存在类似对称运算符,但这一对应项的缺失似乎并未引发实际问题。能够创建立即产出有效输出的可观察源具有明确价值;而 EndWith除了满足对称性追求外,其实际应用场景尚不明确。
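If you did want the hypothetical EndWith, it could be approximated with Concat. The sketch below shows Append next to such a helper; EndWith here is a local function invented purely for illustration and is not part of Rx.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;

// Hypothetical helper (NOT an Rx operator): emits the extra values after the source completes.
IObservable<T> EndWith<T>(IObservable<T> source, params T[] values) =>
    source.Concat(values.ToObservable());

var appended = new List<int>();
Observable.Range(1, 3).Append(4).Subscribe(appended.Add);

var ended = new List<int>();
EndWith(Observable.Range(1, 3), 4, 5).Subscribe(ended.Add);

string appendedText = string.Join(", ", appended);
string endedText = string.Join(", ", ended);
Console.WriteLine(appendedText); // prints "1, 2, 3, 4"
Console.WriteLine(endedText);    // prints "1, 2, 3, 4, 5"
```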

DefaultIfEmpty

The next operator we'll examine doesn't strictly perform sequential combination. However, it's a very close relative of Append and Prepend. Like those operators, it will emit everything its source does. And like those operators, DefaultIfEmpty takes one additional item. The difference is that it won't always emit that additional item.

接下来要讨论的 DefaultIfEmpty 运算符并不严格属于顺序组合操作符,但与 Append 和 Prepend 非常相似,它会转发源序列的所有项;也如它们一般, DefaultIfEmpty 会携带一个额外项。关键区别在于:该运算符仅在源序列为空时才会发射这个额外项(默认值)。

Whereas Prepend emits its additional item at the start, and Append emits its additional item at the end, DefaultIfEmpty emits the additional item only if the source completes without producing anything. So this provides a way of guaranteeing that an observable will not be empty.

与 Prepend 在序列开头添加额外项、 Append 在末尾追加项不同, DefaultIfEmpty 仅当源序列完成且未产生任何项时才会发出额外项。该运算符由此提供了确保可观察序列非空的保障机制。

You don't have to supply DefaultIfEmpty with a value. If you use the overload in which you supply no such value, it will just use default(T). This will be a zero-like value for struct types and null for reference types.

使用 DefaultIfEmpty 时可不指定参数值。若调用无参重载,运算符将自动使用 default(T)作为默认值——对于结构体类型将生成类零值,引用类型则返回 null 。
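The three cases described above can be sketched as follows, assuming System.Reactive is referenced (the value 42 is arbitrary):

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;

var fromEmpty = new List<int>();
Observable.Empty<int>().DefaultIfEmpty(42).Subscribe(fromEmpty.Add);

var fromNonEmpty = new List<int>();
Observable.Range(1, 2).DefaultIfEmpty(42).Subscribe(fromNonEmpty.Add);

var fromEmptyNoArg = new List<int>();
Observable.Empty<int>().DefaultIfEmpty().Subscribe(fromEmptyNoArg.Add);

string emptyText = string.Join(", ", fromEmpty);
string nonEmptyText = string.Join(", ", fromNonEmpty);
string noArgText = string.Join(", ", fromEmptyNoArg);
Console.WriteLine(emptyText);    // prints "42": the source was empty, so the fallback is emitted
Console.WriteLine(nonEmptyText); // prints "1, 2": the source had items, so the fallback is not used
Console.WriteLine(noArgText);    // prints "0": default(int)
```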

Repeat

The final operator that combines sequences sequentially is Repeat. It allows you to simply repeat a sequence. It offers overloads where you can specify the number of times to repeat the input, and one that repeats infinitely:

最后一个按顺序组合序列的操作符是 Repeat。它允许你简单地重复一个序列。它提供了多个重载版本,你可以在其中指定重复输入的次数,以及实现无限循环的无限重载:

// Repeats the observable sequence a specified number of times.
public static IObservable<TSource> Repeat<TSource>(
    this IObservable<TSource> source, 
    int repeatCount)

// Repeats the observable sequence indefinitely and sequentially.
public static IObservable<TSource> Repeat<TSource>(
    this IObservable<TSource> source)

Repeat resubscribes to the source for each repetition. This means that this will only strictly repeat if the source produces the same items each time you subscribe. Unlike the ReplaySubject<T>, this doesn't store and replay the items that emerge from the source. This means that you normally won't want to call Repeat on a hot source. (If you really want repetition of the output of a hot source, a combination of Replay and Repeat might fit the bill.)

Repeat 运算符通过重新订阅源序列实现循环逻辑。这意味着只有当源在每次订阅时都能生成相同项时,才能保证严格意义上的重复。与 ReplaySubject<T>不同,Repeat 不会缓存并重放源序列的项。因此通常不建议在热源上使用 Repeat 运算符(若确实需要重复热源输出,可结合 Replay 与 Repeat 运算符实现)。

If you use the overload that repeats indefinitely, then the only way the sequence will stop is if there is an error or the subscription is disposed of. The overload that specifies a repeat count will stop on error, un-subscription, or when it reaches that count. This example shows the sequence [0,1,2] being repeated three times.

若使用无限循环重载,序列仅会在发生错误或订阅被释放时终止。而指定重复次数的重载版本则会在以下三种情况之一终止:遇到错误、取消订阅或达到指定重复次数。以下示例展示了序列[0,1,2]被重复三次的执行过程:

var source = Observable.Range(0, 3);
var result = source.Repeat(3);

result.Subscribe(
    Console.WriteLine,
    () => Console.WriteLine("Completed"));

Output:

输出:

0
1
2
0
1
2
0
1
2
Completed
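Because Repeat resubscribes rather than replaying stored items, a source that behaves differently on each subscription will not repeat exactly. The following sketch makes that visible with a deliberately stateful source; the counter n and the Defer wrapper are assumptions introduced for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;

int n = 0;

// Each subscription runs the factory again, producing a different single value.
IObservable<int> counting = Observable.Defer<int>(() => Observable.Return(++n));

var items = new List<int>();
counting.Repeat(3).Subscribe(items.Add);

string text = string.Join(", ", items);
Console.WriteLine(text); // prints "1, 2, 3", not "1, 1, 1"
```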

Concurrent sequences    并发序列

We'll now move on to operators for combining observable sequences that might produce values concurrently.

接下来我们将转向用于组合可能并发产生值的可观察序列的运算符。

Amb

Amb is a strangely named operator. It's short for ambiguous, but that doesn't tell us much more than Amb. If you're curious about the name you can read about the origins of Amb in Appendix D, but for now, let's look at what it actually does. Rx's Amb takes any number of IObservable<T> sources as inputs, and waits to see which, if any, first produces some sort of output. As soon as this happens, it immediately unsubscribes from all of the other sources, and forwards all notifications from the source that reacted first.

Amb 是一个命名奇特的运算符(名称源自"ambiguous/模糊选择",具体渊源可参考附录D)。该运算符接收多个 IObservable<T> 源作为输入,等待并观察哪个源首先产生输出。一旦检测到某个源率先响应,立即取消订阅所有其他源,并持续转发该胜出源的所有通知。

Why is that useful?

这有什么用处呢?

A common use case for Amb is when you want to produce some sort of result as quickly as possible, and you have multiple options for obtaining that result, but you don't know in advance which will be fastest. Perhaps there are multiple servers that could all potentially give you the answer you want, and it's impossible to predict which will have the lowest response time. You could send requests to all of them, and then just use the first to respond. If you model each individual request as its own IObservable<T>, Amb can handle this for you. Note that this isn't very efficient: you're asking several servers all to do the same work, and you're going to discard the results from most of them. (Since Amb unsubscribes from all the sources it's not going to use as soon as the first reacts, it's possible that you might be able to send a message to all the other servers to cancel the request. But this is still somewhat wasteful.) But there may be scenarios in which timeliness is crucial, and for those cases it might be worth tolerating a bit of wasted effort to produce faster results.

 Amb 的一个常见使用场景是:当你希望尽可能快地生成某种结果,并且有多个可获取该结果的选项,但事先无法预知哪个选项会最快。例如,可能存在多个服务器都能提供你所需的答案,但无法预测哪个服务器的响应时间最短。这时你可以向所有服务器发送请求,然后只使用第一个响应。如果将每个单独的请求建模为各自的 IObservable<T>, Amb 可以为你处理这种情况。需要注意的是,这种方式效率不高:你让多个服务器都执行相同的工作,却会丢弃其中大部分的结果。(由于 Amb 在第一个响应出现后就会取消订阅所有不再需要的数据源,理论上你可以向其他所有服务器发送消息取消请求。但这仍然存在一定程度的资源浪费。)不过在某些时效性至关重要的场景中,为了更快地获得结果,容忍少量资源浪费可能是值得的。

Amb is broadly similar to Task.WhenAny, in that it lets you detect when the first of multiple sources does something. However, the analogy is not precise. Amb automatically unsubscribes from all of the other sources, ensuring that everything is cleaned up. With Task you should always ensure that you eventually observe all tasks in case any of them faulted.

Amb 与 Task.WhenAny 大体相似,因为它允许你检测多个数据源中第一个触发动作的源。然而,这种类比并不完全准确。 Amb 会自动取消订阅所有其他数据源,确保所有资源都被清理干净。而使用 Task 时,你必须始终确保最终处理所有任务(例如观察任务结果或处理异常),以防其中某些任务出现故障。

To illustrate Amb's behaviour, here's a marble diagram showing three sequences, s1, s2, and s3, each able to produce a sequence of values. The line labelled r shows the result of passing all three sequences into Amb. As you can see, r provides exactly the same notifications as s1, because in this example, s1 was the first sequence to produce a value.

为了说明 Amb 的行为,这里展示一个弹珠图:三个序列 s1、s2 和 s3 各自能够生成一系列值。标有 r 的线条表示将这三个序列传入 Amb 后的结果。如你所见,r 提供与 s1 完全相同的通知,因为在此示例中,s1 是第一个生成值的序列。

This code creates exactly the situation described in that marble diagram, to verify that this is indeed how Amb behaves:

以下代码精确复现了弹珠图所描述的情境,以验证 Amb 的行为确实如此:

var s1 = new Subject<int>();
var s2 = new Subject<int>();
var s3 = new Subject<int>();

var result = Observable.Amb(s1, s2, s3);

result.Subscribe(
    Console.WriteLine,
    () => Console.WriteLine("Completed"));

s1.OnNext(1);
s2.OnNext(99);
s3.OnNext(8);
s1.OnNext(2);
s2.OnNext(88);
s3.OnNext(7);
s2.OnCompleted();
s1.OnNext(3);
s3.OnNext(6);
s1.OnNext(4);
s1.OnCompleted();
s3.OnCompleted();

Output:

输出:

1
2
3
4
Completed

If we changed the order so that s2.OnNext(99) came before the call to s1.OnNext(1); then s2 would produce values first and the marble diagram would look like this.

如果我们调整顺序,让 s2.OnNext(99) 在 s1.OnNext(1)之前被调用,那么 s2 将率先生成值,此时弹珠图的呈现形态会变为如下所示:
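As a sketch of that reordered scenario (same subjects as before, with only the first two OnNext calls swapped; the log list is an illustrative addition so that the result can be inspected):

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

var s1 = new Subject<int>();
var s2 = new Subject<int>();
var s3 = new Subject<int>();
var log = new List<string>();

Observable.Amb(s1, s2, s3).Subscribe(
    x => log.Add(x.ToString()),
    () => log.Add("Completed"));

s2.OnNext(99);      // s2 reacts first, so Amb commits to s2
s1.OnNext(1);       // ignored from here on
s3.OnNext(8);       // ignored
s1.OnNext(2);       // ignored
s2.OnNext(88);      // forwarded
s3.OnNext(7);       // ignored
s2.OnCompleted();   // the result completes when the winning source completes

string text = string.Join(", ", log);
Console.WriteLine(text); // prints "99, 88, Completed"
```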

There are a few overloads of Amb. The preceding example used the overload that takes a params array of sequences. There's also an overload that takes exactly two sources, avoiding the array allocation that occurs with params. Finally, you could pass in an IEnumerable<IObservable<T>>. (Note that there are no overloads that take an IObservable<IObservable<T>>. Amb requires all of the source observables it monitors to be supplied up front.)

 Amb提供了多个重载版本。前面的示例使用了接受 params数组形式序列的重载。此外,还有一个只接受两个输入源的重载,可避免 params参数带来的数组分配开销。最后,你还可以传入一个 IEnumerable<IObservable<T>>集合。(注意:没有接受 IObservable<IObservable<T>>的重载。 Amb要求所有被监控的数据源必须预先全部提供。)

// Propagates the observable sequence that reacts first.
public static IObservable<TSource> Amb<TSource>(
    this IObservable<TSource> first, 
    IObservable<TSource> second)
{...}
public static IObservable<TSource> Amb<TSource>(
    params IObservable<TSource>[] sources)
{...}
public static IObservable<TSource> Amb<TSource>(
    this IEnumerable<IObservable<TSource>> sources)
{...}

Reusing the GetSequences method from the Concat section, we see that Amb evaluates the outer (IEnumerable) sequence completely before subscribing to any of the sequences it returns.

复用来自Concat 部分的 GetSequences 方法,我们可以观察到: Amb 在订阅其返回的序列中的任何一个之前,会先完全评估外层(IEnumerable)序列。

GetSequences().Amb().Dump("Amb");

Output:

输出:

GetSequences() called
Yield 1st sequence
Yield 2nd sequence
Yield 3rd sequence
GetSequences() complete
1st subscribed to
2nd subscribed to
3rd subscribed to
Amb-->3
Amb completed

Here is the marble diagram illustrating how this code behaves:

以下是弹珠图,展示了这段代码的行为:

Remember that GetSequences produces its first two observables as soon as it is asked for them, and then waits for 1 second before producing the third and final one. But unlike Concat, Amb won't subscribe to any of its sources until it has retrieved all of them from the iterator, which is why this marble diagram shows the subscriptions to all three sources starting after 1 second. (The first two sources were available earlier: Amb would have started enumerating the sources as soon as subscription occurred, but it waited until it had all three before subscribing, which is why they all appear over on the right.) The third sequence has the shortest delay between subscription and producing its value, so although it's the last observable returned, it is able to produce its value the fastest even though there are two sequences yielded one second before it (due to the Thread.Sleep).

需要记住的是, GetSequences 方法在被调用时会立即生成前两个 Observable,随后等待 1 秒才会生成第三个(即最后一个)Observable。但与 Concat不同, Amb 在从迭代器中获取所有数据源之前,不会订阅其中的任何一个。这解释了为何弹珠图显示所有三个数据源的订阅均在 1 秒后才开始。(前两个数据源本可以更早被获取 —— Amb 在订阅发生时就会立即开始枚举数据源,但它会等待所有三个数据源就绪后才进行订阅,因此它们在图中均出现在右侧。)第三个序列在订阅到生成值之间的延迟最短,因此尽管它是最后返回的 Observable,却能够比其他两个序列(即使它们早 1 秒被生成,因 Thread.Sleep的存在)更快地产生值。
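The first-to-react rule can be sketched without any timers. The following is a minimal illustration, assuming the System.Reactive package is referenced; the three Subject<int> instances and the ambLog list are hypothetical stand-ins introduced here so that the order of events is deterministic.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

// Hypothetical stand-ins for three competing sources.
var first = new Subject<int>();
var second = new Subject<int>();
var third = new Subject<int>();

var ambLog = new List<string>();

// The IEnumerable<IObservable<T>> overload: the collection is enumerated
// in full, and only then does Amb subscribe to each source.
IEnumerable<IObservable<int>> candidates = new IObservable<int>[] { first, second, third };
IDisposable ambSubscription = candidates.Amb().Subscribe(
    value => ambLog.Add($"OnNext({value})"),
    () => ambLog.Add("OnCompleted"));

second.OnNext(20);    // second reacts first, so Amb commits to it
first.OnNext(10);     // ignored: Amb has already dropped this source
third.OnNext(30);     // ignored for the same reason
second.OnNext(21);
second.OnCompleted();

Console.WriteLine(string.Join(", ", ambLog));
// OnNext(20), OnNext(21), OnCompleted
```

Only the values from the winning source reach the subscriber; everything the losing sources produce afterwards is discarded.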

Merge

The Merge extension method takes multiple sequences as its input. Any time any of those input sequences produces a value, the observable returned by Merge produces that same value. If the input sequences produce values at the same time on different threads, Merge handles this safely, ensuring that it delivers items one at a time.

 Merge 扩展方法接收多个序列作为输入。每当其中任意一个输入序列生成值时,由 Merge 返回的 Observable 也会生成相同的值。即使这些输入序列在不同线程上同时生成值, Merge 也能够安全处理,确保每次只传递一个数据项。

Since Merge returns a single observable sequence that includes all of the values from all of its input sequences, there's a sense in which it is similar to Concat. But whereas Concat waits until each input sequence completes before moving onto the next, Merge supports concurrently active sequences. As soon as you subscribe to the observable returned by Merge, it immediately subscribes to all of its inputs, forwarding everything any of them produces. This marble diagram shows two sequences, s1 and s2, running concurrently and r shows the effect of combining these with Merge: the values from both source sequences emerge from the merged sequence.

由于 Merge 返回一个单一的可观察序列,该序列包含所有输入序列的所有值,因此在某种意义上,它与 Concat 相似。但是, Concat 会等待当前输入序列完成后再处理下一个,而Merge 支持并发活动的序列。一旦你订阅了 Merge 返回的可观察对象,它就会立即订阅其所有输入源,并转发它们产生的所有内容。这个弹珠图显示了两个序列 s1 和 s2 同时运行,而 r 则显示了使用 Merge 将它们合并的效果:合并后的序列会发出两个源序列中的值。

The result of a Merge will complete only once all input sequences complete. However, the Merge operator will error if any of the input sequences terminates erroneously (at which point it will unsubscribe from all its other inputs).

 Merge 运算符返回的结果序列会在所有输入序列完成后才完成。然而,如果任何一个输入序列因错误终止(此时它会取消订阅所有其他输入源),则 Merge 运算符会立即传播该错误。
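A minimal sketch of these completion and error rules, assuming the System.Reactive package is referenced. The subjects (stationA, stationB, healthy, faulty) and the log lists are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

// Completion: Merge finishes only after every input has finished.
var stationA = new Subject<int>();
var stationB = new Subject<int>();
var mergeLog = new List<string>();

stationA.Merge(stationB).Subscribe(
    value => mergeLog.Add($"OnNext({value})"),
    error => mergeLog.Add($"OnError({error.Message})"),
    () => mergeLog.Add("OnCompleted"));

stationA.OnNext(1);
stationB.OnNext(10);
stationA.OnCompleted();   // Merge keeps going: stationB is still active
stationB.OnNext(20);
stationB.OnCompleted();   // every input is now done, so Merge completes

// Errors: a failure in any input terminates the merged sequence.
var healthy = new Subject<int>();
var faulty = new Subject<int>();
var errorLog = new List<string>();

healthy.Merge(faulty).Subscribe(
    value => errorLog.Add($"OnNext({value})"),
    error => errorLog.Add($"OnError({error.Message})"),
    () => errorLog.Add("OnCompleted"));

healthy.OnNext(1);
faulty.OnError(new InvalidOperationException("boom"));
healthy.OnNext(2);        // not delivered: Merge has already terminated

Console.WriteLine(string.Join(", ", mergeLog));
// OnNext(1), OnNext(10), OnNext(20), OnCompleted
Console.WriteLine(string.Join(", ", errorLog));
// OnNext(1), OnError(boom)
```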

If you read the Creating Observables chapter, you've already seen one example of Merge. I used it to combine the individual sequences representing the various events provided by a FileSystemWatcher into a single stream at the end of the 'Representing Filesystem Events in Rx' section. As another example, let's look at AIS once again. There is no publicly available single global source that can provide all AIS messages across the entire globe as an IObservable<IAisMessage>. Any single source is likely to cover just one area, or maybe even just a single AIS receiver. With Merge, it's straightforward to combine these into a single source:

如果你阅读过《创建可观察对象》的章节,那么你已经见过一个 Merge的示例。在“在 Rx 中表示文件系统事件”一节的最后,我使用 Merge将表示由 FileSystemWatcher 提供的各种事件的独立序列合并为一个统一的流。再举一个例子,让我们再次考虑 AIS(自动识别系统)。目前没有公开可用的单一全球数据源能够以 IObservable<IAisMessage> 的形式提供全球范围内的所有 AIS 消息。任何单一数据源可能仅覆盖某一区域,甚至仅来自单个 AIS 接收器。利用 Merge,可以轻松将它们合并为一个单一的数据源:

IObservable<IAisMessage> station1 = aisStations.GetMessagesFromStation("AdurStation");
IObservable<IAisMessage> station2 = aisStations.GetMessagesFromStation("EastbourneStation");

IObservable<IAisMessage> allMessages = station1.Merge(station2);

If you want to combine more than two sources, you have a few options:

如果你想合并超过两个源,你有几个选项:

  • Chain Merge operators together e.g. s1.Merge(s2).Merge(s3)   将 Merge 运算符链接在一起,例如 s1.Merge(s2).Merge(s3)
  • Pass a params array of sequences to the Observable.Merge static method. e.g. Observable.Merge(s1,s2,s3)  将序列的 params 数组传递给 Observable.Merge 静态方法,例如 Observable.Merge(s1, s2, s3)
  • Apply the Merge operator to an IEnumerable<IObservable<T>>.    对 IEnumerable<IObservable<T>> 应用 Merge 操作符
  • Apply the Merge operator to an IObservable<IObservable<T>>.      对 IObservable<IObservable<T>>应用 Merge 操作符

The overloads look like this:

 Merge的重载版本如下所示:

// Merges two observable sequences into a single observable sequence.
// Returns a sequence that merges the elements of the given sequences.
public static IObservable<TSource> Merge<TSource>(
    this IObservable<TSource> first, 
    IObservable<TSource> second)
{...}

// Merges all the observable sequences into a single observable sequence.
// The observable sequence that merges the elements of the observable sequences.
public static IObservable<TSource> Merge<TSource>(
    params IObservable<TSource>[] sources)
{...}

// Merges an enumerable sequence of observable sequences into a single observable sequence.
public static IObservable<TSource> Merge<TSource>(
    this IEnumerable<IObservable<TSource>> sources)
{...}

// Merges an observable sequence of observable sequences into an observable sequence.
// Merges all the elements of the inner sequences in to the output sequence.
public static IObservable<TSource> Merge<TSource>(
    this IObservable<IObservable<TSource>> sources)
{...}

As the number of sources being merged goes up, the operators that take collections have an advantage over the first overload. (I.e., s1.Merge(s2).Merge(s3) performs slightly less well than Observable.Merge(new[] { s1, s2, s3 }), or the equivalent Observable.Merge(s1, s2, s3).) However, for just three or four, the differences are small, so in practice you can choose between the first two overloads as a matter of your preferred style. (If you're merging 100 sources or more the differences are more pronounced, but by that stage, you probably wouldn't want to use the chained call style anyway.) The third and fourth overloads allow you to merge sequences that can be evaluated lazily at run time.

随着要合并的源数量的增加,接受集合作为参数的重载版本相比第一个重载(链式调用)更具性能优势。(即,s1.Merge(s2).Merge(s3) 的性能略逊于 Observable.Merge(new[] { s1, s2, s3 }) 或等效的 Observable.Merge(s1, s2, s3)。)然而,对于只有 3~4 数据源的情况,性能差异很小,因此在实际使用中,你可以根据个人喜好的风格在前两个重载之间进行选择。(如果你正在合并 100 个或更多的源,性能差异会更显著,但此时链式调用的方式本身也不适用。)第三个和第四个重载允许你在运行时懒加载地合并序列。

That last Merge overload that takes a sequence of sequences is particularly interesting, because it makes it possible for the set of sources being merged to grow over time. Merge will remain subscribed to sources for as long as your code remains subscribed to the IObservable<T> that Merge returns. So if sources emits more and more IObservable<T>s over time, these will all be included by Merge.

最后一个接受嵌套序列作为参数的 Merge 重载尤为有趣,因为它使得被合并的源集合能够随时间动态扩展。只要你的代码保持订阅由 Merge 返回的 IObservable<T>, Merge就会持续订阅这些数据源 sources 。因此,如果 sources (即外层IObservable<IObservable<TSource>>)持续发出新的 IObservable<T>,所有这些内部序列都会被 Merge动态纳入合并范围。

That might sound familiar. The SelectMany operator is also able to flatten multiple observable sources back out into a single observable source. This is just another illustration of why I've described SelectMany as a fundamental operator in Rx: strictly speaking we don't need a lot of the operators that Rx gives us because we could build them using SelectMany. Here's a simple re-implementation of that last Merge overload using SelectMany:

这听起来可能似曾相识。SelectMany 运算符同样能够将多个可观察数据源扁平化(flatten)为单一的可观察数据源。这再次印证了为何我将 SelectMany 描述为 Rx 中的基础运算符:严格来说,我们并不需要 Rx 提供的许多其他运算符,因为可以通过 SelectMany 来构建它们。以下是一个使用 SelectMany 重新实现上述最后一个 Merge 重载的简单示例:

public static IObservable<T> MyMerge<T>(this IObservable<IObservable<T>> sources) =>
    sources.SelectMany(source => source);

As well as illustrating that we don't technically need Rx to provide that last Merge for us, it's also a good illustration of why it's helpful that it does. It's not immediately obvious what this does. Why are we passing a lambda that just returns its argument? Unless you've seen this before, it can take some thought to work out that SelectMany expects us to pass a callback that it invokes for each incoming item, but that our input items are already nested sequences, so we can just return each item directly, and SelectMany will then take that and merge everything it produces into its output stream. And even if you have internalized SelectMany so completely that you know right away that this will just flatten sources, you'd still probably find Observable.Merge(sources) a more direct expression of intent. (Also, since Merge is a more specialized operator, Rx is able to provide a very slightly more efficient implementation of it than the SelectMany version shown above.)

除了说明从技术角度而言我们并不需要 Rx 专门提供最后一个 Merge 重载外,这也很好地解释了为何 Rx 仍然提供该操作符的实际价值。上述 SelectMany 的用法并不直观,其行为意图并不一目了然。为什么我们要传递一个直接返回输入参数的 lambda 表达式?除非你之前见过这种模式,否则可能需要一番思考才能理解: SelectMany 要求我们传递一个回调函数,该函数会被调用于每个输入项,而我们的输入项本身已经是嵌套的可观察序列,因此可以直接返回每个输入项本身。随后, SelectMany 会自动将其扁平化,将所有生成的元素合并到输出流中。即便你已经完全掌握了 SelectMany 的机制,知道这样做能够将嵌套的 sources序列扁平化,你仍然会发现 Observable.Merge(sources) 能更直接地表达意图。(此外,由于  Merge 是一个更专用的操作符,Rx 能够提供比上述 SelectMany 版本略微更高效的实现。)
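To make the equivalence concrete, here is a small sketch (assuming the System.Reactive package is referenced) that feeds the same growing set of nested sources through both Merge and SelectMany. The subjects and list names are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

// The outer sequence: each item it emits is itself an observable source.
var sources = new Subject<IObservable<int>>();

var viaMerge = new List<int>();
var viaSelectMany = new List<int>();

sources.Merge().Subscribe(value => viaMerge.Add(value));
sources.SelectMany(source => source).Subscribe(value => viaSelectMany.Add(value));

var firstInner = new Subject<int>();
sources.OnNext(firstInner);      // one source is being merged
firstInner.OnNext(1);

var secondInner = new Subject<int>();
sources.OnNext(secondInner);     // the set of merged sources grows over time
secondInner.OnNext(2);
firstInner.OnNext(3);            // earlier sources remain subscribed

Console.WriteLine(string.Join(", ", viaMerge));       // 1, 2, 3
Console.WriteLine(string.Join(", ", viaSelectMany));  // 1, 2, 3
```

Both subscribers observe the same values, which is consistent with the claim that the two forms differ in clarity of intent rather than in behaviour.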

If we again reuse the GetSequences method, we can see how the Merge operator works with a sequence of sequences.

再次复用 GetSequences 方法,我们可以观察 Merge 运算符如何处理嵌套的序列集合。

GetSequences().Merge().Dump("Merge");

Output:

输出:

GetSequences() called
Yield 1st sequence
1st subscribed to
Yield 2nd sequence
2nd subscribed to
Merge --> 2
Merge --> 1
Yield 3rd sequence
3rd subscribed to
GetSequences() complete
Merge --> 3
Merge completed

As we can see from the marble diagram, s1 and s2 are yielded and subscribed to immediately. s3 is not yielded for one second and then is subscribed to. Once all input sequences have completed, the result sequence completes.

从弹珠图中我们可以看出,s1 和 s2 立即产生并被订阅。s3 在一秒后才产生并被订阅。一旦所有输入序列都完成了,合并后的结果序列也随之完成。

For each of the Merge overloads that accept variable numbers of sources (either via an array, an IEnumerable<IObservable<T>>, or an IObservable<IObservable<T>>) there's an additional overload adding a maxConcurrent parameter. For example:

对于每个接受可变数量数据源的 Merge 重载(无论是通过数组、IEnumerable<IObservable<T>> 还是 IObservable<IObservable<T>> 传入),Rx 还提供了额外支持 maxConcurrent 参数的重载。例如:

public static IObservable<TSource> Merge<TSource>(this IEnumerable<IObservable<TSource>> sources, int maxConcurrent)

This enables you to limit the number of sources that Merge accepts inputs from at any single time. If the number of sources available exceeds maxConcurrent (either because you passed in a collection with more sources, or because you used the IObservable<IObservable<T>>-based overload and the source emitted more nested sources than maxConcurrent), Merge will wait for existing sources to complete before moving onto new ones. A maxConcurrent of 1 makes Merge behave in the same way as Concat.

通过 maxConcurrent 参数,你可以限制 Merge 在任何时刻同时接收输入的数据源数量。如果可用数据源的数量超过 maxConcurrent 设置的值(可能是由于传入的集合包含更多数据源,或是使用了基于 IObservable<IObservable<T>> 的重载且源发射了超过 maxConcurrent 数量的嵌套数据源),Merge 将等待现有数据源完成后,再处理新的数据源。特别地,当 maxConcurrent 设置为 1 时,Merge 的行为将与 Concat 完全一致 —— 即所有数据源会按顺序逐一处理,而非并发执行。
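The following sketch, assuming the System.Reactive package is referenced, illustrates the Concat-like behaviour of maxConcurrent: 1. The two subjects are hypothetical hot sources, chosen so that it is visible that the second one is not subscribed to until the first completes.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

var firstSource = new Subject<int>();
var secondSource = new Subject<int>();
var limitedLog = new List<string>();

IEnumerable<IObservable<int>> queued = new IObservable<int>[] { firstSource, secondSource };

// With maxConcurrent: 1, only one source is active at a time.
queued.Merge(maxConcurrent: 1).Subscribe(
    value => limitedLog.Add($"OnNext({value})"),
    () => limitedLog.Add("OnCompleted"));

firstSource.OnNext(1);
secondSource.OnNext(10);     // missed: Merge has not subscribed to this source yet
firstSource.OnCompleted();   // frees the slot, so Merge moves on to secondSource
secondSource.OnNext(20);
secondSource.OnCompleted();

Console.WriteLine(string.Join(", ", limitedLog));
// OnNext(1), OnNext(20), OnCompleted
```

Because Subject<T> is hot, the value 10 is simply never seen. With cold sources nothing would be lost, since each source only starts producing when Merge subscribes to it.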

Switch

Rx's Switch operator takes an IObservable<IObservable<T>>, and produces notifications from the most recent nested observable. Each time its source produces a new nested IObservable<T>, Switch unsubscribes from the previous nested source (unless this is the first source, in which case there won't be a previous one) and subscribes to the latest one.

Rx的Switch运算符接收一个IObservable<IObservable<T>>类型的输入,并仅从最新的嵌套可观察对象中发出通知。每当其源序列产生一个新的嵌套IObservable<T>时,Switch会执行以下操作:

  1. 取消订阅前一个嵌套源(除非这是第一个源Observable,这种情况下不会有前一个,则无此步骤);

  2. 立即订阅最新产生的嵌套可观察对象

Switch can be used in a 'time to leave' type feature for a calendar application. In fact you can see the source code for a modified version of how Bing provides (or at least provided; the implementation may have changed) notifications telling you that it's time to leave for an appointment. Since that's derived from a real example, it's a little complex, so I'll describe just the essence here.

Switch运算符可用于实现日历应用中的"该出发了"提醒功能。事实上,你可以参考Bing日历(或至少其历史版本)通知用户出发时间的源码实现的修改版本(注意:实际实现可能已变更)。由于这是基于真实案例的简化,其复杂性较高,以下仅阐述核心原理:

The basic idea with a 'time to leave' notification is that we use map and route finding services to work out the expected journey time to get to wherever the appointment is, and use the Timer operator to create an IObservable<T> that will produce a notification when it's time to leave. (Specifically this code produces an IObservable<TrafficInfo> which reports the proposed route for the journey, and expected travel time.) However, there are two things that can change, rendering the initial predicted journey time useless. First, traffic conditions can change. When the user creates their appointment, we have to guess the expected journey time based on how traffic normally flows at the time of day in question. However, if there turns out to be really bad traffic on the day, the estimate will need to be revised upwards, and we'll need to notify the user earlier.

“出发时间”通知的基本思路是:我们利用地图和路线规划服务来计算到达约定地点的预计行程时间,并使用Timer运算符创建一个 IObservable<T> 。该可观察对象会在需要出发的时刻触发通知(具体而言,这段代码生成的是 IObservable<TrafficInfo> ,其中包含建议的行车路线和预计行程时间)。然而,有两种动态变化因素会导致初始预测的行程时间失效:

第一种是,交通状况可能发生变化。当用户创建预约时,我们只能基于该时段通常的交通流量来推测行程时间。但若实际当天出现严重交通拥堵,就需要上调预估时间并提前通知用户。

The other thing that can change is the user's location. This will also obviously affect the predicted journey time.

另一个可能发生变化的是用户的位置。这显然也会影响预计的行程时间。

To handle this, the system will need observable sources that can report changes in the user's location, and changes in traffic conditions affecting the proposed journey. Every time either of these reports a change, we will need to produce a new estimated journey time, and a new IObservable<TrafficInfo> that will produce a notification when it's time to leave.

为了处理这个问题,系统需要接入可观察的数据源来监测以下变化:用户的位置变化,以及可能影响行程的交通状况变化。每当任一数据源检测到变化时,系统都将重新计算预估行程时间,并生成新的 IObservable<TrafficInfo> 实例 —— 该可观察对象会在最新的"出发时间"触发通知。

Every time we revise our estimate, we want to abandon the previously created IObservable<TrafficInfo>. (Otherwise, the user will receive a bewildering number of notifications telling them to leave, one for every time we recalculated the journey time.) We just want to use the latest one. And that's exactly what Switch does.

每次重新估算行程时间时,我们都需要弃用先前生成的 IObservable<TrafficInfo>实例(否则用户会收到大量令人困惑的出发提示通知 —— 每次重新计算行程时间都会生成一个独立通知)。我们的核心需求是始终采用最新生成的实例。而这正是 Switch 运算符的职责所在—— 它能自动切换至最新的可观察数据流,确保只响应最新的行程计算结果。

You can see the example for that scenario in the Reaqtor repo. Here, I'm going to present a different, simpler scenario: live searches. As you type, the text is sent to a search service and the results are returned to you as an observable sequence. Most implementations have a slight delay before sending the request so that unnecessary work does not happen. Imagine I want to search for "Intro to Rx". I quickly type in "Into to" and realize I have missed the letter 'r'. I stop briefly and change the text to "Intro ". By now, two searches have been sent to the server. The first search will return results that I do not want. Furthermore, if I were to receive data for the first search merged together with results for the second search, it would be a very odd experience for the user. I really only want results corresponding to the latest search text. This scenario fits perfectly with the Switch method.

你可以在Reaqtor代码库中找到该场景的示例。在此,我将展示另一个更简单的场景:实时搜索。当用户输入文字时,输入的文本会被实时发送至搜索服务,搜索结果以可观察序列的形式返回。大多数实现会在发送请求前设置短暂延迟,以避免不必要的请求。例如,假设我想搜索 "Intro to Rx",快速输入了"Into to"后,突然意识到漏掉了字母'r'。于是暂停片刻,将文本修改为"Intro "。此时,系统已向服务器发送了两次搜索请求。第一次搜索返回的结果显然不符合需求。更糟糕的是,如果用户界面同时展示第一次和第二次搜索的结果,将会造成混乱的体验。我们真正需要的,是始终仅展示与最新搜索文本对应的结果 —— 该场景正是 Switch 方法的完美应用场景。

In this example, there is an IObservable<string> source that represents the search text—each new value the user types emerges from this source sequence. We also have a search function that produces a single search result for a given search term:

在这个例子中,有一个IObservable<string>源,它表示搜索文本——用户输入的每个新值都从这个源序列中产生。我们还有一个搜索函数,可为给定搜索词生成单个搜索结果:

private IObservable<string> SearchResults(string query)
{
    ...
}

This returns just a single value, but we model it as an IObservable<string> partly to deal with the fact that it might take some time to perform the search, and also to be able to use it with Rx. We can take our source of search terms, and then use Select to pass each new search value to this SearchResults function. This creates our resulting nested sequence, IObservable<IObservable<string>>.

虽然这里返回的只是一个单独的值,但我们将其建模为IObservable<string>,部分原因是为了处理执行搜索可能需要一些时间的实际情况,同时也是为了能够与Rx配合使用。我们可以接收搜索关键词的输入源,然后通过 Select 运算符将每个新的搜索值传递给这个 SearchResults 函数。这样就创建出了一个嵌套的序列结构——IObservable<IObservable<string>>

Suppose we were to then use Merge to process the results:

假设我们接下来使用 Merge 来处理这些结果:

IObservable<string> searchValues = ....;
IObservable<IObservable<string>> search = searchValues.Select(searchText => SearchResults(searchText));
                    
var subscription = search
    .Merge()
    .Subscribe(Console.WriteLine);

If we were lucky and each search completed before the next element from searchValues was produced, the output would look sensible. It is much more likely, however, that multiple searches will result in overlapped search results. This marble diagram shows what the Merge function could do in such a situation.

如果我们足够幸运,并且每次搜索都能在下一个searchValues元素产生前完成,那么输出结果看起来会很合理。然而,更有可能的情况是,多个搜索会导致返回的搜索结果出现重叠。这张弹珠图展示了Merge函数在此类场景中可能的效果。

Note how the values from the search results are all mixed together. The fact that some search terms took longer to get a search result than others has also meant that they have come out in the wrong order. This is not what we want. If we use the Switch extension method we will get much better results. Switch will subscribe to the outer sequence and as each inner sequence is yielded it will subscribe to the new inner sequence and dispose of the subscription to the previous inner sequence. This will result in the following marble diagram:

注意,来自不同搜索的结果值会被全部混杂在一起。某些搜索词获取结果的时间比其他词更长,这也导致了它们的返回顺序出现了错乱。这显然不符合我们的预期。如果我们改用Switch扩展方法,结果将得到显著改善。Switch会订阅外层序列,每当新的内层序列产生时,它会立即订阅这个新序列,并取消对前一个内层序列的订阅。这将产生如下弹珠图所示的效果:

Now, each time a new search term arrives, causing a new search to be kicked off, a corresponding new IObservable<string> for that search's results appears, causing Switch to unsubscribe from the previous results. This means that any results that arrive too late (i.e., when the result is for a search term that is no longer the one in the search box) will be dropped. As it happens, in this particular example, this means that we only see the result for the final search term. All the intermediate values that we saw as the user was typing didn't hang around for long, because the user kept on pressing the next key before we'd received the previous value's results. Only at the end, when the user stopped typing for long enough that the search results came back before they became out of date, do we finally see a value from Switch. The net effect is that we've eliminated confusing results that are out of date.

每当有新的搜索词到达并触发新的搜索时,就会生成对应的新IObservable<string>结果流。此时Switch会自动取消订阅前一个结果流。这意味着任何延迟到达的结果(即对应搜索框中已被替换的旧搜索词)都将被丢弃。在这个具体案例中,最终我们只会看到最后一个搜索词的结果。用户持续输入期间产生的中间值之所以不会残留,是因为每次按键操作都赶在前一次搜索结果返回之前发生。只有当用户最终停止输入,且搜索结果能在过时之前返回时,Switch才会输出有效结果。这种机制从根本上消除了过期结果造成的混乱。

This is another diagram where the ambiguity of marble diagrams causes a slight issue. I've shown each of the single-value observables produced by each of the calls to SearchResults, but in practice Switch unsubscribes from all but the last of these before they've had a chance to produce a value. So this diagram is showing the values those sources could potentially produce, and not the values that they actually delivered as part of the subscription, because the subscriptions were cut short.

这个弹珠图示例还体现了另一方面:其固有的表达局限性会带来些许理解上的困惑。图中虽然展示了每次调用SearchResults生成的单值可观察序列(IObservable<string>),但实际情况是:Switch会在这些序列尚未产生值之前就取消对所有旧序列的订阅(仅保留最新一个)。因此,图中显示的是这些源序列理论上可能产生的值,而非它们在实际订阅周期内真正传递的值——因为这些订阅被提前取消了。
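The dropping of stale results can be sketched with subjects standing in for the search calls. This is a minimal illustration assuming the System.Reactive package is referenced; slowSearch, latestSearch and the result strings are hypothetical, and no real search service is involved.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

// Each item is the (single-value) result stream for one search request.
var searches = new Subject<IObservable<string>>();
var shown = new List<string>();

searches.Switch().Subscribe(result => shown.Add(result));

var slowSearch = new Subject<string>();     // stands in for the "Into to" request
var latestSearch = new Subject<string>();   // stands in for the "Intro " request

searches.OnNext(slowSearch);
searches.OnNext(latestSearch);              // Switch unsubscribes from slowSearch here

slowSearch.OnNext("results for 'Into to'"); // arrives too late, so it is dropped
latestSearch.OnNext("results for 'Intro '");

Console.WriteLine(string.Join(", ", shown));
// results for 'Intro '
```

Replacing Switch() with Merge() in this sketch would deliver both results, which is the confusing behaviour described above.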

Pairing sequences    配对序列

The previous methods allowed us to flatten multiple sequences sharing a common type into a result sequence of the same type (with various strategies for deciding what to include and what to discard). The operators in this section still take multiple sequences as an input, but attempt to pair values from each sequence to produce a single value for the output sequence. In some cases, they also allow you to provide sequences of different types.

之前的方法允许我们将多个同类型序列"扁平化"为同类型的结果序列(采用不同策略决定包含或丢弃哪些元素)。而本节的运算符虽然同样接收多个序列作为输入,但其核心机制是尝试将各个序列中的值进行配对,以生成输出序列中的单个值。某些运算符还支持处理不同类型的输入序列。

Zip

Zip combines pairs of items from two sequences. So its first output is created by combining the first item from one input with the first item from the other. The second output combines the second item from each input. And so on. The name is meant to evoke a zipper on clothing or a bag, which brings the teeth on each half of the zipper together one pair at a time.

Zip运算符将两个序列中的元素按顺序逐一进行组合。其第一个输出值由第一个输入序列的首项与第二个输入序列的首项组合而成,第二个输出则由两者的第二项组合产生,依此类推。该运算符的命名灵感来源于衣物或包袋上的拉链——将两边的齿牙逐一啮合,每次只处理一对齿牙。

Since Zip combines pairs of items in strict order, it will complete when the first of the sequences completes. If one of the sequences has reached its end, then even if the other continues to emit values, there will be nothing to pair any of these values with, so Zip just unsubscribes at this point, discards the unpairable values, and reports completion.

Zip运算符由于需要严格按顺序配对元素,当任一输入序列率先完成时,整个Zip流就会完成。此时,即使另一个序列仍在持续产生新值,这些值也会因失去配对对象而被丢弃,Zip会立即取消订阅所有输入流,并向上游发出完成信号。

If either of the sequences produces an error, the sequence returned by Zip will report that same error.

若任一输入序列产生错误,Zip返回的序列将立即报告同样的错误。

If one of the source sequences publishes values faster than the other sequence, the rate of publishing will be dictated by the slower of the two sequences, because Zip can only emit an item when it has one from each source.

当其中一个源序列发布值的速度快于另一个时,Zip运算符的整体发布速率将由较慢的序列决定。因为它必须严格等待每个源都提供对应的值后,才能生成并发射一个组合项。

Here's an example:

这里是一个例子:

// Generate values 0,1,2 
var nums = Observable.Interval(TimeSpan.FromMilliseconds(250))
    .Take(3);

// Generate values a,b,c,d,e,f 
var chars = Observable.Interval(TimeSpan.FromMilliseconds(150))
    .Take(6)
    .Select(i => Char.ConvertFromUtf32((int)i + 97));

// Zip values together, naming each half of the pair
nums.Zip(chars, (lhs, rhs) => new { Left = lhs, Right = rhs })
    .Dump("Zip");

The effect can be seen in the marble diagram below:

下面的弹珠图展示了这一效果:

Here's the actual output of the code:

这里是代码的实际输出:

{ Left = 0, Right = a }
{ Left = 1, Right = b }
{ Left = 2, Right = c }

Note that the nums sequence only produced three values before completing, while the chars sequence produced six values. The result sequence produced three values, which is as many pairs as it could make.

注意,nums序列在完成前只产生了3个值,而chars序列则产生了6个值。结果序列最终生成了3个值——这是两个序列能够配对的最大数量。

It is also worth noting that Zip has a second overload that takes an IEnumerable<T> as the second input sequence.

Zip运算符还存在一个重载版本,它接受一个 IEnumerable<T> 作为第二个输入序列。

// Merges an observable sequence and an enumerable sequence into one observable sequence 
// containing the result of pair-wise combining the elements by using the selector function.
public static IObservable<TResult> Zip<TFirst, TSecond, TResult>(
    this IObservable<TFirst> first, 
    IEnumerable<TSecond> second, 
    Func<TFirst, TSecond, TResult> resultSelector)
{...}

This allows us to zip sequences from both IEnumerable<T> and IObservable<T> paradigms!

这使我们能够将IEnumerable<T>(拉取模型)与IObservable<T>(推送模型)这两种不同范式的序列进行跨模型配对
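As a rough sketch of this mixed overload (assuming the System.Reactive package is referenced), the following pairs values pushed by an observable with labels pulled from an ordinary array. The readings subject and the labels array are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

var readings = new Subject<int>();                          // push-based source
IEnumerable<string> labels = new[] { "first", "second" };   // pull-based source
var zipLog = new List<string>();

readings.Zip(labels, (reading, label) => $"{label}={reading}")
    .Subscribe(
        pair => zipLog.Add(pair),
        () => zipLog.Add("OnCompleted"));

readings.OnNext(10);   // paired with "first"
readings.OnNext(20);   // paired with "second"
readings.OnNext(30);   // no label left to pair with, so no output for this value

Console.WriteLine(string.Join(", ", zipLog));
// first=10, second=20, OnCompleted
```

Each value from the observable pulls exactly one item from the enumerable, so the shorter of the two still determines how many pairs are produced.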

SequenceEqual

There's another operator that processes pairs of items from two sources: SequenceEqual. But instead of producing an output for each pair of inputs, this compares each pair, and ultimately produces a single value indicating whether every pair of inputs was equal or not.

有一个处理来自两个源的项目对的运算符: SequenceEqual。但与为每一对输入生成输出不同,它会比较每一对输入,并最终生成一个单一的值,指示所有输入对是否都相等。

In the case where the sources produce different values, SequenceEqual produces a single false value as soon as it detects this. But if the sources are equal, it can only report this when both have completed because until that happens, it doesn't yet know if there might be a difference coming later. Here's an example illustrating its behaviour:

当两个数据源产生的值不同时, SequenceEqual 一旦检测到差异就会立即返回一个 false 值。但如果数据源内容相同,它只能在两者都完成后才能确认这一点——因为在两者完成之前,它无法确定后续是否会出现差异。以下是一个演示其行为的示例:

var subject1 = new Subject<int>();

subject1.Subscribe(
    i => Console.WriteLine($"subject1.OnNext({i})"),
    () => Console.WriteLine("subject1 completed"));

var subject2 = new Subject<int>();

subject2.Subscribe(
    i => Console.WriteLine($"subject2.OnNext({i})"),
    () => Console.WriteLine("subject2 completed"));

var areEqual = subject1.SequenceEqual(subject2);

areEqual.Subscribe(
    i => Console.WriteLine($"areEqual.OnNext({i})"),
    () => Console.WriteLine("areEqual completed"));

subject1.OnNext(1);
subject1.OnNext(2);

subject2.OnNext(1);
subject2.OnNext(2);
subject2.OnNext(3);

subject1.OnNext(3);

subject1.OnCompleted();
subject2.OnCompleted();

Output:

输出:

subject1.OnNext(1)
subject1.OnNext(2)
subject2.OnNext(1)
subject2.OnNext(2)
subject2.OnNext(3)
subject1.OnNext(3)
subject1 completed
subject2 completed
areEqual.OnNext(True)
areEqual completed
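For contrast, here is a minimal sketch of the unequal case, assuming the System.Reactive package is referenced. The subject and list names are hypothetical. It shows the result being reported as soon as a mismatched pair is seen, without waiting for either source to complete.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

var leftValues = new Subject<int>();
var rightValues = new Subject<int>();
var equalityLog = new List<string>();

leftValues.SequenceEqual(rightValues).Subscribe(
    areEqual => equalityLog.Add($"OnNext({areEqual})"),
    () => equalityLog.Add("OnCompleted"));

leftValues.OnNext(1);
rightValues.OnNext(1);   // first pair matches, so nothing is reported yet
leftValues.OnNext(2);
rightValues.OnNext(3);   // second pair differs: false is reported immediately

Console.WriteLine(string.Join(", ", equalityLog));
// OnNext(False), OnCompleted
```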

CombineLatest

The CombineLatest operator is similar to Zip in that it combines pairs of items from its sources. However, instead of pairing the first items, then the second, and so on, CombineLatest produces an output any time either of its inputs produces a new value. For each new value to emerge from an input, CombineLatest uses that along with the most recently seen value from the other input. (To be precise, it doesn't produce anything until each input has produced at least one value, so if one input takes longer to get started than the other, there will be a period in which CombineLatest doesn't in fact produce an output each time one of its inputs does, because it's waiting for the other to produce its first value.) The signature is as follows.

 CombineLatest 操作符与 Zip 类似,都能合并来自两个数据源的项对。但与 Zip 按顺序(第一项配对、第二项配对等)组合不同, CombineLatest 会在任一输入源产生新值时立即生成输出。每当一个输入源产生新值时, CombineLatest 会将该值与另一输入源最近接收到的值组合。(准确地说,它会在每个输入源至少产生一个值后才开始输出。因此,如果一个输入源的启动时间比另一个长,在等待另一输入源产生首个值时,即使其中一个输入源持续产生值, CombineLatest 也会暂时处于静默状态。)其定义如下:

// Composes two observable sequences into one observable sequence by using the selector 
// function whenever one of the observable sequences produces an element.
public static IObservable<TResult> CombineLatest<TFirst, TSecond, TResult>(
    this IObservable<TFirst> first, 
    IObservable<TSecond> second, 
    Func<TFirst, TSecond, TResult> resultSelector)
{...}

The marble diagram below shows off usage of CombineLatest with one sequence that produces numbers (s1), and the other letters (s2). If the resultSelector function just joins the number and letter together as a pair, this would produce the result shown on the bottom line. I've colour coded each output to indicate which of the two sources caused it to emit that particular result, but as you can see, each output includes a value from each source.

下方的弹珠图展示了 CombineLatest 运算符的用法:一个数据流 (s1)生成数字,另一个数据流 (s2)生成字母。若 resultSelector 简单地将数字与字母组合成对,则会生成底部线条所示的结果。我通过颜色标记了每个输出项,以表明是哪个数据源触发了该次结果生成。但如你所见,每个输出项始终包含来自两个数据源的值。

If we slowly walk through the above marble diagram, we first see that s2 produces the letter 'a'. s1 has not produced any value yet so there is nothing to pair, meaning that no value is produced for the result. Next, s1 produces the number '1' so the result sequence can now produce a pair '1,a'. We then receive the number '2' from s1. The last letter is still 'a' so the next pair is '2,a'. The letter 'b' is then produced creating the pair '2,b', followed by 'c' giving '2,c'. Finally the number 3 is produced and we get the pair '3,c'.

让我们逐步解析上述弹珠图的过程:首先, s2 生成了字母 'a',此时 s1 尚未生成任何值,因此没有可配对的内容,结果序列暂不输出。接着, s1 生成了数字 '1',此时两个数据源均有值,结果序列生成配对 '1,a'。随后, s1 又生成数字 '2',由于 s2 的最新值仍为 'a',因此生成配对 '2,a'。当 s2 生成字母 'b' 时,与 s1 的最新值 '2' 组合,得到配对 '2,b';同理, s2 生成 'c' 时生成配对 '2,c'。最后, s1 生成数字 '3',与 s2 的最新值 'c' 结合,最终输出配对 '3,c'。
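The walkthrough above can be reproduced in code. This is a small sketch assuming the System.Reactive package is referenced, with two subjects named s1 and s2 standing in for the number and letter sequences; the pairs list is a hypothetical helper for collecting output.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

var s1 = new Subject<int>();    // numbers
var s2 = new Subject<char>();   // letters
var pairs = new List<string>();

s1.CombineLatest(s2, (number, letter) => $"{number},{letter}")
    .Subscribe(pair => pairs.Add(pair));

s2.OnNext('a');   // no output: s1 has not produced a value yet
s1.OnNext(1);     // 1,a
s1.OnNext(2);     // 2,a
s2.OnNext('b');   // 2,b
s2.OnNext('c');   // 2,c
s1.OnNext(3);     // 3,c

Console.WriteLine(string.Join(" | ", pairs));
// 1,a | 2,a | 2,b | 2,c | 3,c
```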

This is great in case you need to evaluate some combination of state which needs to be kept up-to-date when any single component of that state changes. A simple example would be a monitoring system. Each service is represented by a sequence that returns a Boolean indicating the availability of said service. The monitoring status is green if all services are available; we can achieve this by having the result selector perform a logical AND. Here is an example.

当需要实时评估某个组合状态(且该状态需在任何单个组件发生变化时保持更新)时, CombineLatest 非常适用。一个典型场景是监控系统:每个服务对应一个返回布尔值的序列(表示该服务是否可用)。如果所有服务均可用,则监控状态显示为绿色——这可以通过 resultSelector 执行逻辑与(AND)操作来实现。示例如下:

IObservable<bool> webServerStatus = GetWebStatus();
IObservable<bool> databaseStatus = GetDBStatus();

// Yields true when both systems are up.
var systemStatus = webServerStatus
    .CombineLatest(
        databaseStatus,
        (webStatus, dbStatus) => webStatus && dbStatus);

You may have noticed that this method could produce a lot of duplicate values. For example, if the web server goes down the result sequence will yield 'false'. If the database then goes down, another (unnecessary) 'false' value will be yielded. This would be an appropriate time to use the DistinctUntilChanged extension method. The corrected code would look like the example below.

你可能会注意到,这种方法可能产生大量重复值。例如,如果 Web 服务器宕机,结果序列将产生 'false';如果数据库随后也宕机,又会生成另一个(冗余的) 'false'值。此时正是使用  DistinctUntilChanged 扩展方法的合适场景。修正后的代码示例如下:

// Yields true when both systems are up, and only on change of status
var systemStatus = webServerStatus
    .CombineLatest(
        databaseStatus,
        (webStatus, dbStatus) => webStatus && dbStatus)
    .DistinctUntilChanged();

Join

The Join operator allows you to logically join two sequences. Whereas the Zip operator would pair values from the two sequences based on their position within the sequence, the Join operator allows you to join sequences based on when elements are emitted.

 Join 运算符允许您在逻辑上连接两个序列。与 Zip 运算符(根据元素在序列中的位置进行配对)不同, Join 运算符基于元素的发射时机进行序列连接。

Since the production of a value by an observable source is logically an instantaneous event, joins use a model of intersecting windows. Recall that with the Window operator, you can define the duration of each window using an observable sequence. The Join operator uses a similar concept: for each source, we can define a time window over which each element is considered to be 'current' and two elements from different sources will be joined if their time windows overlap. As with the Zip operator, we also need to provide a selector function to produce the result item from each pair of values. Here's the Join operator:

由于从逻辑上讲,可观察源生成值是一个瞬时事件,因此 Join 运算符采用了一种窗口相交模型。回顾 Window 运算符,您可以通过一个可观察序列来定义每个窗口的持续时间。 Join  运算符采用了类似的概念:对于每个数据源中的元素,我们可以定义一个时间窗口,在此期间该元素被视为"当前有效"。当两个不同源中的元素时间窗口存在重叠时,它们将被连接。与  Zip 运算符类似,我们仍需提供一个选择器函数( resultSelector ),用于根据每对值生成结果项。以下是 Join 运算符的定义:

public static IObservable<TResult> Join<TLeft, TRight, TLeftDuration, TRightDuration, TResult>
(
    this IObservable<TLeft> left,
    IObservable<TRight> right,
    Func<TLeft, IObservable<TLeftDuration>> leftDurationSelector,
    Func<TRight, IObservable<TRightDuration>> rightDurationSelector,
    Func<TLeft, TRight, TResult> resultSelector
)

This is a complex signature to try and understand in one go, so let's take it one parameter at a time.

这个方法的签名较为复杂,很难一次就理解清楚,因此我们不妨逐个参数拆解分析。

IObservable<TLeft> left is the first source sequence. IObservable<TRight> right is the second source sequence. Join is looking to produce pairs of items, with each pair containing one element from left and one element from right.

IObservable<TLeft> left 是第一个源序列, IObservable<TRight> right 是第二个源序列。 Join 运算符旨在生成由成对项组成的结果,其中每一对都包含来自 left 的一个元素和 right的一个元素。

The leftDurationSelector argument enables us to define the time window for each item from left. A source item's time window begins when the source emits the item. To determine when the window for an item from left should close, Join will invoke the leftDurationSelector, passing in the value just produced by left. This selector must return an observable source. (It doesn't matter at all what the element type of this source is, because Join is only interested in when it does things.) The item's time window ends as soon as the source returned for that item by leftDurationSelector either produces a value or completes.

 leftDurationSelector 参数使我们能够定义来自左序列( left)的每个项的时间窗口。一个源项的时间窗口从该源( left)发射该项时开始。为了确定来自左序列的某个项的时间窗口何时关闭, Join 会调用 leftDurationSelector 函数,并传入 left刚产生的项的值。此选择器必须返回一个可观察源。(该源的元素类型无关紧要,因为 Join 只关心它的行为时机。)该时间窗口将在 leftDurationSelector 为此项返回的源产生一个值或完成时立即结束。

The rightDurationSelector argument defines the time window for each item from right. It works in exactly the same way as the leftDurationSelector.

 rightDurationSelector 参数用于定义来自右序列( right)的每个项的时间窗口。其工作方式与 leftDurationSelector完全相同。

Initially, there are no current items. But as left and right produce items, these items' windows will start, so Join might have multiple items all with their windows currently open. Each time left produces a new item, Join looks to see if any items from right still have their windows open. If they do, left is now paired with each of them. (So a single item from one source might be joined with multiple items from the other source.) Join calls the resultSelector for each such pairing. Likewise, each time right produces an item, then if there are any currently open windows for items from left, that new item from right will be paired with each of these, and again, resultSelector will be called for each such pairing.

最初,没有处于窗口期的当前项。但随着左序列 left 和右序列 right 不断产生项,这些项的窗口期开始生效,因此Join 运算符可能同时管理多个处于开放窗口期的项。每当左序列 left 产生一个新项时,Join 会检查右序列 right 中是否存在仍处于窗口开放期的项。如果存在,左序列 left 的新项将与右序列 right 中所有处于开放窗口期的项逐一配对。(因此,来自一个源的单个项可能与另一源的多个项进行连接。)对于每对匹配项,Join 都会调用 resultSelector 函数生成结果。同理,当右序列 right 产生新项时,如果左序列 left 中存在处于窗口开放期的项,右序列 right 的新项也会与这些项逐一配对,并触发 resultSelector 的调用。

The observable returned by Join produces the result of each call to resultSelector.

由 Join 运算符返回的可观察对象会产生每次调用 resultSelector 函数所产生的结果。

Let us now imagine a scenario where the left sequence produces values twice as fast as the right sequence. Imagine that in addition we never close the left windows; we could do this by always returning Observable.Never<Unit>() from the leftDurationSelector function. And imagine that we make the right windows close as soon as they possibly can, which we can achieve by making rightDurationSelector return Observable.Empty<Unit>(). The following marble diagram illustrates this:

假设现在有一个场景:左序列(left)生成值的速度是右序列(right)的两倍。同时,我们设定左窗口永不关闭——这可以通过让 leftDurationSelector 始终返回 Observable.Never<Unit>() 来实现。而对于右窗口,我们让其立即关闭,这可以通过让 rightDurationSelector 返回 Observable.Empty<Unit>() 实现。下面的弹珠图展示了此场景的运行过程:

Each time a left duration window intersects with a right duration window, we get an output. The right duration windows are all effectively of zero length, but this doesn't stop them from intersecting with the left duration windows, because those all never end. So the first item from right has a (zero-length) window that falls inside two of the windows for the left items, and so Join produces two results. I've stacked these vertically on the diagram to show that they happen at virtually the same time. Of course, the rules of IObserver<T> mean that they can't actually happen at the same time: Join has to wait until the consumer's OnNext has finished processing 0,A before it can go on to produce 1,A. But it will produce all the pairs as quickly as possible any time a single event from one source overlaps with multiple windows for the other.

每当 left 持续时间窗口与 right 持续时间窗口相交时,就会生成一个输出。虽然 right 持续时间窗口的有效长度为零,但这并不妨碍它们与 left 持续时间窗口相交(因为 left 窗口永不关闭)。因此, right 序列的第一个项(如 A )的(零长度)窗口会与 left 序列的两个项(如 0 和 1)的窗口相交,导致 Join 生成两个结果( 0,A 和 1,A)。图中我将这两个结果垂直堆叠,表示它们几乎同时发生。当然,根据 IObserver<T>  的规则,它们实际上无法真正同时触发: Join 必须等待消费者处理完 0,A 的 OnNext 后,才能继续生成 1,A。但每当一个源的事件与另一个源的多个窗口相交时, Join 会尽快生成所有配对的输出。
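The scenario described above can be reproduced deterministically by pushing values through subjects instead of relying on timers. What follows is a minimal sketch, assuming the System.Reactive package is referenced; the class name JoinNeverEmptyDemo and the hand-pushed values are illustrative and not part of the original text. Left windows never close and right windows close at once, so each right item should pair with every left item seen so far.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive;
using System.Reactive.Linq;
using System.Reactive.Subjects;

// Hypothetical demo class: left windows stay open forever, right windows close immediately.
public static class JoinNeverEmptyDemo
{
    public static List<string> Run()
    {
        var left = new Subject<int>();
        var right = new Subject<string>();
        var results = new List<string>();

        using var subscription = left.Join(
                right,
                _ => Observable.Never<Unit>(),   // a left item's window never closes
                _ => Observable.Empty<Unit>(),   // a right item's window closes at once
                (l, r) => $"{l},{r}")
            .Subscribe(results.Add);

        // left produces two items for every item from right
        left.OnNext(0);
        left.OnNext(1);
        right.OnNext("A");   // pairs with the open windows of 0 and 1
        left.OnNext(2);
        left.OnNext(3);
        right.OnNext("B");   // pairs with 0, 1, 2 and 3
        left.OnNext(4);
        left.OnNext(5);
        right.OnNext("C");   // pairs with 0 through 5

        return results;
    }
}
```

Because no left window ever closes, the number of pairs produced per right item grows with every left item that has arrived, which is worth keeping in mind for long-running sources.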

If I also immediately closed the left window by returning Observable.Empty<Unit>, or perhaps Observable.Return(0), the windows would never overlap, so no pairs would ever get produced. (In theory if both left and right produce items at exactly the same time, then perhaps we might get a pair, but since the timing of events is never absolutely precise, it would be a bad idea to design a system that depended on this.)

若我们通过返回 Observable.Empty<Unit>或 Observable.Return(0)使左窗口也立即关闭,那么左、右窗口将永远无法重叠,因此不会生成任何配对结果。(理论上,如果 left 、 right 序列的项完全同时生成,则可能产生配对;但由于事件的时间精度无法绝对保证,依赖这种巧合设计系统是极不可取的。)
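To see the no-overlap case in isolation, here is a small sketch under the same assumptions (System.Reactive referenced, subjects standing in for real sources, and a hypothetical class name JoinNoOverlapDemo). Both duration selectors end their windows immediately, so no pairs are expected.

```csharp
using System;
using System.Reactive;
using System.Reactive.Linq;
using System.Reactive.Subjects;

// Hypothetical demo class: every window on both sides closes as soon as it opens.
public static class JoinNoOverlapDemo
{
    public static int CountPairs()
    {
        var left = new Subject<int>();
        var right = new Subject<string>();
        var pairs = 0;

        using var subscription = left.Join(
                right,
                _ => Observable.Return(0),       // produces a value at once, closing the left window
                _ => Observable.Empty<Unit>(),   // completes at once, closing the right window
                (l, r) => (l, r))
            .Subscribe(_ => pairs++);

        // Subjects deliver events one at a time, so the two sources never fire simultaneously.
        left.OnNext(0);
        right.OnNext("A");
        left.OnNext(1);
        right.OnNext("B");

        return pairs;
    }
}
```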

What if I wanted to ensure that items from right only ever intersected with a single value from left? In that case, I'd need to ensure that the left durations did not overlap. One way to do that would be to have my leftDurationSelector always return the same sequence that I passed as the left sequence. This will result in Join making multiple subscriptions to the same source, and for some kinds of sources that might introduce unwanted side effects, but the Publish and RefCount operators provide a way to deal with that, so this is in fact a reasonable strategy. If we do that, each item from right is paired only with the most recent item from left.

若需要确保 right 序列的每个项仅与 left序列的一个值相交,则需保证 left序列的各个窗口互不重叠。一种方法是让 leftDurationSelector 始终返回 left序列本身。尽管这会导致 Join 对同一源进行多次订阅(可能引发副作用),但结合使用 Publish 和 RefCount 运算符可以解决此问题。如果我们这样做,结果看起来更像这样。
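The non-overlapping left windows can be sketched as follows. This is an illustration rather than a definitive implementation: it assumes System.Reactive is referenced, uses a hypothetical class name JoinLatestLeftDemo, and uses a Subject as the left source. A subject is already hot, so the repeated subscriptions made through leftDurationSelector share the same events without needing Publish and RefCount.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive;
using System.Reactive.Linq;
using System.Reactive.Subjects;

// Hypothetical demo class: each left window ends when the next left item arrives.
public static class JoinLatestLeftDemo
{
    public static List<string> Run()
    {
        var left = new Subject<int>();
        var right = new Subject<string>();
        var results = new List<string>();

        using var subscription = left.Join(
                right,
                _ => left,                       // window closes on the next value from left
                _ => Observable.Empty<Unit>(),   // right windows close immediately
                (l, r) => $"{l},{r}")
            .Subscribe(results.Add);

        left.OnNext(0);
        left.OnNext(1);      // closes the window for 0
        right.OnNext("A");   // only 1 is still current
        left.OnNext(2);
        left.OnNext(3);
        right.OnNext("B");   // only 3 is still current
        left.OnNext(4);
        left.OnNext(5);
        right.OnNext("C");   // only 5 is still current

        return results;
    }
}
```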

The last example is very similar to CombineLatest, except that it is only producing a pair when the right sequence changes. We can easily make it work the same way by changing the right durations to work in the same way as the left durations. This code shows how (including the use of Publish and RefCount to ensure that we only get a single subscription to the underlying left and right sources despite providing them to Join many times over).

最后一个示例与 CombineLatest 非常相似,区别在于它仅在 right 序列变化时生成配对。若要让其行为与 CombineLatest 完全一致,只需让 right 序列的持续时间窗口采用与 left 序列相同的设置。以下代码展示了如何实现(包括使用 Publish 和 RefCount 确保即使多次向 Join 提供 left 和 right 序列,也只会对底层数据源进行一次订阅):

public static IObservable<TResult> MyCombineLatest<TLeft, TRight, TResult>
(
    IObservable<TLeft> left,
    IObservable<TRight> right,
    Func<TLeft, TRight, TResult> resultSelector
)
{
    var refcountedLeft = left.Publish().RefCount();
    var refcountedRight = right.Publish().RefCount();

    return Observable.Join(
        refcountedLeft,
        refcountedRight,
        value => refcountedLeft,
        value => refcountedRight,
        resultSelector);
}

Obviously there's no need to write this—you can just use the built-in CombineLatest. (And that will be slightly more efficient because it has a specialized implementation.) But it shows that Join is a powerful operator.

显然,我们无需实际编写这样的代码——直接使用内置的 CombineLatest即可。(由于 CombineLatest有专门的优化实现,其效率也会略高。)但这个例子展示了 Join 运算符的强大灵活性。
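One way to check the equivalence is to feed the same subjects to both operators and compare what comes out. The sketch below assumes System.Reactive is referenced; MyCombineLatestDemo and Run are hypothetical names, and the MyCombineLatest method is repeated from the text only so that the sample compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

public static class MyCombineLatestDemo
{
    // Same approach as the MyCombineLatest method shown in the text.
    public static IObservable<TResult> MyCombineLatest<TLeft, TRight, TResult>(
        IObservable<TLeft> left,
        IObservable<TRight> right,
        Func<TLeft, TRight, TResult> resultSelector)
    {
        var refcountedLeft = left.Publish().RefCount();
        var refcountedRight = right.Publish().RefCount();

        return Observable.Join(
            refcountedLeft,
            refcountedRight,
            _ => refcountedLeft,    // a left window ends at the next left value
            _ => refcountedRight,   // a right window ends at the next right value
            resultSelector);
    }

    // Feeds identical input to the Join-based version and to the built-in operator.
    public static (List<string> Joined, List<string> BuiltIn) Run()
    {
        var left = new Subject<int>();
        var right = new Subject<string>();
        var joined = new List<string>();
        var builtIn = new List<string>();

        using var s1 = MyCombineLatest(left, right, (l, r) => $"{l},{r}").Subscribe(joined.Add);
        using var s2 = left.CombineLatest(right, (l, r) => $"{l},{r}").Subscribe(builtIn.Add);

        left.OnNext(0);
        right.OnNext("A");
        left.OnNext(1);
        right.OnNext("B");

        return (joined, builtIn);
    }
}
```

With this input, both lists should contain the same pairs in the same order, since each new value on either side is combined with the single value still current on the other side.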

GroupJoin

When the Join operator pairs up values whose windows overlap, it will pass the scalar values left and right to the resultSelector. The GroupJoin operator is based on the same concept of overlapping windows, but its selector works slightly differently: GroupJoin still passes a single (scalar) value from the left source, but it passes an IObservable<TRight> as the second argument. This argument represents all of the values from the right sequence that occur within the window for the particular left value for which it was invoked.

 GroupJoin 运算符基于相同的窗口重叠概念,但其选择器( resultSelector)的工作方式略有不同: GroupJoin 仍会传递来自 left 序列的单个标量值,但第二个参数是一个  IObservable<TRight> 类型的可观察序列。此参数表示在调用该选择器时,特定 left 值的时间窗口期内, right 序列中出现的所有值。

So this lacks the symmetry of Join, because the left and right sources are handled differently. GroupJoin will call the resultSelector exactly once for each item produced by the left source. When a left value's window overlaps with the windows of multiple right values, Join would deal with that by calling the selector once for each such pairing, but GroupJoin deals with this by having the observable passed as the second argument to resultSelector emit each of the right items that overlap with that left item. (If a left item overlaps with nothing from the right, resultSelector will still be called with that item; it'll just be passed an IObservable<TRight> that doesn't produce any items.)

因此,这与 Join 的对称性不同,因为 left 和 right 数据源的处理方式不同。 GroupJoin 会为 left 源生成的每个项只调用一次resultSelector。当 left 值的时间窗口与多个 right 值的时间窗口重叠时:

  • Join 运算符会通过为每个这样的配对组合调用一次选择器来处理;
  • GroupJoin的处理方式是:通过传递给 resultSelector 第二个参数的可观察对象( IObservable<TRight> ),发射所有与该 left 项重叠的 right 项。

(如果左项没有与任何右项重叠,resultSelector仍会被调用,只是传递的IObservable<TRight>不会产生任何项。)

The GroupJoin signature is very similar to Join, but note the difference in the resultSelector parameter.

 GroupJoin 的方法签名与 Join非常相似,但需注意两者的 resultSelector 参数存在差异。

public static IObservable<TResult> GroupJoin<TLeft, TRight, TLeftDuration, TRightDuration, TResult>
(
    this IObservable<TLeft> left,
    IObservable<TRight> right,
    Func<TLeft, IObservable<TLeftDuration>> leftDurationSelector,
    Func<TRight, IObservable<TRightDuration>> rightDurationSelector,
    Func<TLeft, IObservable<TRight>, TResult> resultSelector
)

If we went back to our first Join example where we had

如果我们回到之前第一个 Join 示例(即我们曾演示过...)

  • the left producing values twice as fast as the right,     left 数据源产生值的速度是 right 数据源的两倍,
  • the left never expiring    left 数据源的值永远不会过期,
  • the right immediately expiring     right 数据源的值立即过期

This diagram shows those same inputs again, and also shows the observables GroupJoin would pass to the resultSelector for each of the items produced by left:

该图表再次展示了这些相同的输入,并显示了 GroupJoin 针对 left源生成的每个项会传递给 resultSelector 的可观察对象。
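The same inputs can be replayed through GroupJoin with subjects to see which right values land in each per-left observable. This is a minimal sketch, assuming System.Reactive is referenced; the class name GroupJoinDemo and the dictionary used to collect results are illustrative choices, not part of the original text.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive;
using System.Reactive.Linq;
using System.Reactive.Subjects;

// Hypothetical demo class: collects the right values delivered to each left item's observable.
public static class GroupJoinDemo
{
    public static Dictionary<int, List<string>> Run()
    {
        var left = new Subject<int>();
        var right = new Subject<string>();
        var groups = new Dictionary<int, List<string>>();

        using var subscription = left.GroupJoin(
                right,
                _ => Observable.Never<Unit>(),   // left windows never close
                _ => Observable.Empty<Unit>(),   // right windows close immediately
                (l, rights) => (Left: l, Rights: rights))
            .Subscribe(g =>
            {
                // Subscribe to the inner observable straight away so no right value is missed.
                var list = new List<string>();
                groups[g.Left] = list;
                g.Rights.Subscribe(list.Add);
            });

        left.OnNext(0);
        left.OnNext(1);
        right.OnNext("A");
        left.OnNext(2);
        left.OnNext(3);
        right.OnNext("B");
        left.OnNext(4);
        left.OnNext(5);
        right.OnNext("C");

        return groups;
    }
}
```

Each left item gets exactly one entry, and the entries for earlier left items accumulate more right values because their windows have been open for longer.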

This produces events corresponding to all of the same events that Join produced; they're just distributed across six different IObservable<TRight> sources. It may have occurred to you that with GroupJoin you could effectively re-create your own Join method by doing something like this:

这将生成与Join所产生的事件相对应的事件,只是这些事件被分布到了六个不同的 IObservable<TRight> 源中。您可能已经想到,通过使用 GroupJoin ,实际上可以通过类似以下方式有效地重新实现自己的 Join 方法:

public IObservable<TResult> MyJoin<TLeft, TRight, TLeftDuration, TRightDuration, TResult>(
    IObservable<TLeft> left,
    IObservable<TRight> right,
    Func<TLeft, IObservable<TLeftDuration>> leftDurationSelector,
    Func<TRight, IObservable<TRightDuration>> rightDurationSelector,
    Func<TLeft, TRight, TResult> resultSelector)
{
    return Observable.GroupJoin
    (
        left,
        right,
        leftDurationSelector,
        rightDurationSelector,
        (leftValue, rightValues) => 
          rightValues.Select(rightValue=>resultSelector(leftValue, rightValue))
    )
    .Merge();
}

You could even create a crude version of Window with this code:

您甚至可以通过以下代码创建一个简化的 Window 方法版本:

public IObservable<IObservable<T>> MyWindow<T>(IObservable<T> source, TimeSpan windowPeriod)
{
    return Observable.Create<IObservable<T>>(o =>
    {
        var sharedSource = source
            .Publish()
            .RefCount();

        var intervals = Observable.Return(0L)
            .Concat(Observable.Interval(windowPeriod))
            .TakeUntil(sharedSource.TakeLast(1))
            .Publish()
            .RefCount();

        return intervals.GroupJoin(
                sharedSource, 
                _ => intervals, 
                _ => Observable.Empty<Unit>(), 
                (left, sourceValues) => sourceValues)
            .Subscribe(o);
    });
}

Rx delivers yet another way to query data in motion by allowing you to interrogate sequences of coincidence. This enables you to solve the intrinsically complex problem of managing state and concurrency while performing matching from multiple sources. By encapsulating these low-level operations, you are able to leverage Rx to design your software in an expressive and testable fashion. Using the Rx operators as building blocks, your code effectively becomes a composition of many simple operators. This allows the complexity of the domain code to be the focus, not the otherwise incidental supporting code.

Rx通过允许您查询关联事件序列,提供了另一种处理动态数据的方式。这使您能够解决管理状态与并发这一本质复杂的难题,同时实现对多源事件的匹配。通过封装这些底层操作,您可利用Rx以声明式且易于测试的方式设计软件。将Rx运算符作为构建块,您的代码实质上成为多个简单运算符的组合。这使得领域逻辑的复杂性成为关注焦点,而非那些原本繁琐的辅助代码。

And-Then-When

The basic Zip overload takes only two sequences as input. If that is a problem, then you can use a combination of the three And/Then/When methods. These methods are used slightly differently from most of the other Rx methods. Out of these three, And is the only extension method to IObservable<T>. Unlike most Rx operators, it does not return a sequence; instead, it returns the mysterious type Pattern<T1, T2>. The Pattern<T1, T2> type is public (obviously), but all of its properties are internal. The only two (useful) things you can do with a Pattern<T1, T2> are invoking its And or Then methods. The And method called on the Pattern<T1, T2> returns a Pattern<T1, T2, T3>. On that type, you will also find the And and Then methods. The generic Pattern types are there to allow you to chain multiple And methods together, each one extending the generic type parameter list by one. You then bring them all together with the Then method overloads. The Then methods return you a Plan type. Finally, you pass this Plan to the Observable.When method in order to create your sequence.

Zip 方法只能接受两个序列作为输入。如果这成为一个问题,你可以使用 And/Then/When 这三个方法的组合。这些方法的使用方式与大多数其他 Rx 方法略有不同。在这三个方法中, And  是唯一一个对 IObservable<T>的扩展方法。与大多数 Rx 运算符不同,它不返回序列,而是返回一个神秘的类型 Pattern<T1, T2>。 Pattern<T1, T2>类型是公开的(显然),但其所有属性都是内部的。对于 Pattern<T1, T2>,你唯一能做的两件(有用)事情是调用它的 And 或 Then 方法。

在 Pattern<T1, T2> 上调用 And 方法会返回一个 Pattern<T1, T2, T3>。在该类型上,你同样会找到 And 和 Then 方法。这些泛型 Pattern 类型的存在允许你将多个 And 方法链式调用,每个 And 调用会将泛型类型参数列表扩展一个类型参数。然后你可以通过 Then 方法的重载将它们全部组合起来。 Then 方法会返回一个 Plan 类型。最后,你需要将这个 Plan 传递给  Observable.When 方法以创建你的序列。

It may sound very complex, but comparing some code samples should make it easier to understand. It will also allow you to see which style you prefer to use.

这听起来可能很复杂,但是通过比较一些代码示例应该会让它更容易理解。同时,这也将让你看到自己更喜欢使用哪种风格。

To Zip three sequences together, you can either use Zip methods chained together like this:

要将三个序列合并在一起,你可以使用链式调用的 Zip 方法,如下所示:

IObservable<long> one = Observable.Interval(TimeSpan.FromSeconds(1)).Take(5);
IObservable<long> two = Observable.Interval(TimeSpan.FromMilliseconds(250)).Take(10);
IObservable<long> three = Observable.Interval(TimeSpan.FromMilliseconds(150)).Take(14);

// lhs represents 'Left Hand Side'
// rhs represents 'Right Hand Side'
IObservable<(long One, long Two, long Three)> zippedSequence = one
    .Zip(two, (lhs, rhs) => (One: lhs, Two: rhs))
    .Zip(three, (lhs, rhs) => (lhs.One, lhs.Two, Three: rhs));

zippedSequence.Subscribe(
    v => Console.WriteLine($"One: {v.One}, Two: {v.Two}, Three: {v.Three}"),
    () => Console.WriteLine("Completed"));

Or perhaps use the nicer syntax of the And/Then/When:

或者,你也可以使用更优雅的 And/Then/When 语法:

Pattern<long, long, long> pattern =
    one.And(two).And(three);
Plan<(long One, long Two, long Three)> plan =
    pattern.Then((first, second, third) => (One: first, Two: second, Three: third));
IObservable<(long One, long Two, long Three)> zippedSequence = Observable.When(plan);

zippedSequence.Subscribe(
    v => Console.WriteLine($"One: {v.One}, Two: {v.Two}, Three: {v.Three}"),
    () => Console.WriteLine("Completed"));

This can be further reduced, if you prefer, to:

如果你愿意,这还可以进一步简化为:

IObservable<(long One, long Two, long Three)> zippedSequence = Observable.When(
    one.And(two).And(three)
        .Then((first, second, third) =>
            (One: first, Two: second, Three: third))
    );  

zippedSequence.Subscribe(
    v => Console.WriteLine($"One: {v.One}, Two: {v.Two}, Three: {v.Three}"),
    () => Console.WriteLine("Completed"));
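The samples above depend on timers, which makes the output awkward to compare. A deterministic variation, sketched here with subjects, runs the chained Zip and the And/Then/When form side by side on the same input. It assumes System.Reactive is referenced; the class name ZipThreeDemo and the element types are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

// Hypothetical demo class: zips three sources in two different styles.
public static class ZipThreeDemo
{
    public static (List<string> Chained, List<string> Pattern) Run()
    {
        var one = new Subject<int>();
        var two = new Subject<string>();
        var three = new Subject<char>();
        var chained = new List<string>();
        var pattern = new List<string>();

        // Style 1: two Zip calls chained together.
        using var s1 = one
            .Zip(two, (a, b) => (A: a, B: b))
            .Zip(three, (ab, c) => $"{ab.A}{ab.B}{c}")
            .Subscribe(chained.Add);

        // Style 2: a single And/Then/When pattern.
        using var s2 = Observable.When(
                one.And(two).And(three)
                    .Then((a, b, c) => $"{a}{b}{c}"))
            .Subscribe(pattern.Add);

        one.OnNext(1);
        one.OnNext(2);       // queued until the other sources catch up
        two.OnNext("a");
        three.OnNext('x');   // first triple is complete
        two.OnNext("b");
        three.OnNext('y');   // second triple is complete

        return (chained, pattern);
    }
}
```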

The And/Then/When trio has more overloads that enable you to group an even greater number of sequences. They also allow you to provide more than one 'plan' (the output of the Then method). This gives you the Merge feature but on the collection of 'plans'. I would suggest playing around with them if this functionality is of interest to you. The verbosity of enumerating all of the combinations of these methods would be of low value. You will get far more value out of using them and discovering for yourself.

 And/Then/When 这组方法提供了更多重载选项,使您能够对更多数量的序列进行组合。它们还允许您定义多个"plan"(即 Then 方法的输出结果)。这相当于为您提供了对多个"plans"集合进行合并( Merge )的功能。如果对此功能感兴趣,建议您在实际使用中多加尝试。详细列举这些方法的所有组合形式意义不大——与其纸上谈兵,不如在实践中亲自探索,您将从中获得更大的收获。
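As one small illustration of passing more than one plan to When, the sketch below builds two independent plans and merges their results into one sequence. It assumes System.Reactive is referenced; the class name MultiplePlansDemo and the four subjects are hypothetical, chosen only to keep the timing under manual control.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

// Hypothetical demo class: two plans combined by a single call to Observable.When.
public static class MultiplePlansDemo
{
    public static List<string> Run()
    {
        var a = new Subject<int>();
        var b = new Subject<int>();
        var c = new Subject<string>();
        var d = new Subject<string>();
        var results = new List<string>();

        using var subscription = Observable.When(
                a.And(b).Then((x, y) => $"sum:{x + y}"),       // first plan pairs a with b
                c.And(d).Then((x, y) => $"concat:{x}{y}"))     // second plan pairs c with d
            .Subscribe(results.Add);

        a.OnNext(1);
        c.OnNext("x");
        b.OnNext(2);     // completes a pair for the first plan
        d.OnNext("y");   // completes a pair for the second plan

        return results;
    }
}
```

Both plans must produce the same result type, because When returns a single sequence carrying the output of whichever plan matches.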

Summary    总结

This chapter covered a set of methods that allow us to combine observable sequences. This brings us to a close on Part 2. We've looked at the operators that are mostly concerned with defining the computations we want to perform on the data. In Part 3 we will move onto practical concerns such as managing scheduling, side effects, and error handling.

本章我们探讨了一系列用于组合可观察序列的方法。至此,我们完成了第二部分的全部内容。我们研究了主要关注定义数据处理逻辑的运算符。在第三部分中,我们将转向实际应用问题,例如调度管理、副作用处理及错误处理机制。

 

posted @ 2024-08-23 18:29  菜鸟吊思