烂翻译系列之Rx.NET介绍第二版——组合序列
Data sources are everywhere, and sometimes we need to consume data from more than just a single source. Common examples that have many inputs include: price feeds, sensor networks, news feeds, social media aggregators, file watchers, multi touch surfaces, heart-beating/polling servers, etc. The way we deal with these multiple stimuli is varied too. We may want to consume it all as a deluge of integrated data, or one sequence at a time as sequential data. We could also get it in an orderly fashion, pairing data values from two sources to be processed together, or perhaps just consume the data from the first source that responds to the request.
数据源无处不在,有时我们需要从多个来源获取数据。具有多输入源的典型场景包括:价格信息流、传感器网络、新闻推送、社交媒体聚合器、文件监控器、多点触控表面、心跳/轮询服务器等。处理这些多重输入源的方式同样灵活多样——既可将所有输入视为集成化的数据洪流统一处理,也能按序列逐个解析数据;既可有序配对两个来源的数据值进行联合处理,亦可仅采用最先响应请求的数据源信息。
Earlier chapters have also shown some examples of the fan out and back in style of data processing, where we partition data, and perform processing on each partition to convert high-volume data into lower-volume higher-value events before recombining. This ability to restructure streams greatly enhances the benefits of operator composition. If Rx only enabled us to apply composition as a simple linear processing chain, it would be a good deal less powerful. Being able to pull streams apart gives us much more flexibility. So even when there is a single source of events, we often still need to combine multiple observable streams as part of our processing. Sequence composition enables you to create complex queries across multiple data sources. This unlocks the possibility to write some very powerful yet succinct code.
前几章已展示过“发散与回拢”风格的数据处理示例:先将数据分区,对各分区进行处理以将海量数据转化为更精简但价值更高的事件,最后重新整合。这种对流进行重组的能力,极大提升了运算符组合的优势。若Rx仅支持简单的线性处理链组合,其功能将大幅受限。通过解构数据流,我们获得了更强大的灵活性。因此即便面对单一事件源,处理过程中也常需组合多个可观测流。“序列组合”使开发者能在多个数据源间构建复杂查询,为实现高效而简洁的代码开辟了可能性。
We've already used SelectMany
in earlier chapters. This is one of the fundamental operators in Rx. As we saw in the Transformation chapter, it's possible to build several other operators from SelectMany
, and its ability to combine streams is part of what makes it powerful. But there are several more specialized combination operators available, which make it easier to solve certain problems than it would be using SelectMany
. Also, some operators we've seen before (including TakeUntil
and Buffer
) have overloads we've not yet explored that can combine multiple sequences.
在前几章中,我们已经使用过 SelectMany
运算符。作为Rx的基础运算符之一,正如“转换”章节所示,许多其他运算符都可通过 SelectMany
构建而成,其整合数据流的能力正是其强大之处的一部分。但Rx还提供了其他一些“专门的组合运算符”,能更便捷地解决特定场景下的问题。此外,先前章节涉及的 TakeUntil
、 Buffer
等运算符也存在尚未探讨的重载版本,具备跨序列组合能力。
Sequential Combination 顺序组合
We'll start with the simplest kind of combining operators, which do not attempt concurrent combination. They deal with one source sequence at a time.
我们将从最为基础的一类组合运算符入手,这类运算符不涉及并发组合,而是每次仅处理单一源序列。
Concat 连接运算
Concat
is arguably the simplest way to combine sequences. It does the same thing as its namesake in other LINQ providers: it concatenates two sequences. The resulting sequence produces all of the elements from the first sequence, followed by all of the elements from the second sequence. The simplest signature for Concat
is as follows.
Concat
可以说是组合序列的最简方式,其行为与其他LINQ提供程序中的同名运算符一致:将两个序列首尾相连。生成的序列会先完整产出第一个序列的所有元素,随后再生成第二个序列的全部元素。Concat
最简单的签名如下:
public static IObservable<TSource> Concat<TSource>(
    this IObservable<TSource> first,
    IObservable<TSource> second)
Since Concat
is an extension method, we can invoke it as a method on any sequence, passing the second sequence in as the only argument:
由于Concat
是一个扩展方法,我们可以在任意序列对象上直接以方法形式调用它,并将第二个序列作为唯一参数传入:
IObservable<int> s1 = Observable.Range(0, 3);
IObservable<int> s2 = Observable.Range(5, 5);
IObservable<int> c = s1.Concat(s2);
IDisposable sub = c.Subscribe(Console.WriteLine, x => Console.WriteLine("Error: " + x));
This marble diagram shows the items emerging from the two sources, s1
and s2
, and how Concat
combines them into the result, c
:
下方的弹珠图演示了源序列 s1
与 s2
各自产生的数据项,以及 Concat
运算符如何将它们合并为结果序列 c
:
Rx's Concat
does nothing with its sources until something subscribes to the IObservable<T>
it returns. So in this case, when we call Subscribe
on c
(the source returned by Concat
) it will subscribe to its first input, s1
, and each time that produces a value, the c
observable will emit that same value to its subscriber. If we went on to call sub.Dispose()
before s1
completes, Concat
would unsubscribe from the first source, and would never subscribe to s2
. If s1
were to report an error, c
would report that same error to its subscriber, and again, it will never subscribe to s2
. Only if s1
completes will the Concat
operator subscribe to s2
, at which point it will forward any items that second input produces until either the second source completes or fails, or the application unsubscribes from the concatenated observable.
Rx的 Concat
运算符在没有任何订阅者订阅其返回的 IObservable<T>
之前,不会对其源序列进行任何动作。在这个例子中,当我们调用由 Concat
返回的 c
(即合并后的可观察序列)的 Subscribe
方法时,它将首先订阅第一个输入源 s1
。每当 s1
产生一个值时,合并后的 c
可观察序列会将该值传递给它的订阅者。如果我们选择在 s1
尚未完成前调用 sub.Dispose()
取消订阅, Concat
会从第一个源 s1
取消订阅,且永远不会订阅第二个源 s2
。若 s1
抛出错误, c
会将同样的错误传递给它的订阅者,这种情况下同样不会订阅 s2
。只有当 s1
正常完成(Complete)后, Concat
运算符才会订阅 s2
,此时它将转发第二个输入源产生的所有数据项,直到第二个源完成或发生错误,或者应用程序主动取消对这个合并后可观察序列的订阅。
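The subscription order described above can be observed directly. The following is a minimal sketch, assuming the System.Reactive package is referenced; the `log` list and the use of `Observable.Defer` to record the moment of subscription are illustrative choices, not part of the original example.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

var log = new List<string>();

// s1 is a subject so that we control exactly when it produces values and completes.
var s1 = new Subject<int>();

// Defer lets us record the moment at which Concat subscribes to its second input.
IObservable<int> s2 = Observable.Defer(() =>
{
    log.Add("s2 subscribed");
    return Observable.Range(5, 2);
});

using IDisposable sub = s1.Concat(s2).Subscribe(
    x => log.Add($"OnNext({x})"),
    ex => log.Add($"OnError({ex.Message})"),
    () => log.Add("OnCompleted"));

s1.OnNext(0);
s1.OnNext(1);
log.Add("s1 completing");
s1.OnCompleted(); // Only now does Concat subscribe to s2.

foreach (string line in log)
{
    Console.WriteLine(line);
}
```

The "s2 subscribed" entry should appear only after "s1 completing". Replacing `s1.OnCompleted()` with `s1.OnError(...)` should, per the description above, produce an `OnError` entry and no subscription to `s2` at all.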
Although Rx's Concat
has the same logical behaviour as the LINQ to Objects Concat
, there are some Rx-specific details to be aware of. In particular, timing is often more significant in Rx than with other LINQ implementations. For example, in Rx we distinguish between hot and cold sources. With a cold source it typically doesn't matter exactly when you subscribe, but hot sources are essentially live, so you only get notified of things that happen while you are subscribed. This can mean that hot sources might not be a good fit with Concat
. The following marble diagram illustrates a scenario in which this can produce surprising results:
尽管Rx的Concat
运算符与LINQ to Objects中的Concat
在逻辑行为上一致,但Rx存在一些特有的细节需要注意。尤其是,在Rx中,时序往往比其他LINQ实现更为关键。例如,Rx中会区分冷源(cold source)和热源(hot source)。对于冷源而言,订阅的具体时机通常无关紧要,但热源本质上是实时推送的,因此你只会收到订阅期间发生的事件通知。这意味着,热源可能并不适合与Concat
结合使用。下面的弹珠图展示了一个可能令人意外的场景:
Since Concat
doesn't subscribe to its second input until the first has finished, it won't see the first couple of items that the hot
source would deliver to any subscribers that had been listening from the start. This might not be the behaviour you would expect: it certainly doesn't look like this concatenated all of the items from the first sequence with all of the items from the second one. It looks like it missed out A
and B
from hot
.
由于Concat
运算符在第一个输入源完成前不会订阅第二个输入源,因此它会错过热源(hot source)在初始阶段向任何"从头开始订阅"的观察者推送的前几个数据项。这种行为可能与你的预期不符:表面上,这似乎并没有将第一个序列的所有数据项与第二个序列的所有数据项完整拼接,反而看起来像是漏掉了热源( hot
)中的 A
和 B
。
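The scenario in that diagram can be reproduced with two subjects. This is a sketch under the assumption that a `Subject<T>` is an acceptable stand-in for a hot source; the names `first`, `hot` and `received` are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

var first = new Subject<int>();
var hot = new Subject<string>(); // Stand-in for a live, hot source.
var received = new List<string>();

using IDisposable sub = first
    .Select(x => x.ToString())
    .Concat(hot)
    .Subscribe(x => received.Add(x));

first.OnNext(1);
hot.OnNext("A");     // Concat has not subscribed to hot yet, so this is missed.
first.OnNext(2);
hot.OnNext("B");     // Also missed.
first.OnCompleted(); // Concat now subscribes to hot.
hot.OnNext("C");
hot.OnNext("D");

Console.WriteLine(string.Join(", ", received)); // 1, 2, C, D
```

A and B are never delivered because nothing was subscribed to `hot` when they were produced.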
Marble Diagram Limitations 弹珠图限制
This last example reveals that marble diagrams gloss over a detail: they show when a source starts, when it produces values, and when it finishes, but they ignore the fact that to be able to produce items at all, an observable source needs a subscriber. If nothing subscribes to an IObservable<T>
, then it doesn't really produce anything. Concat
doesn't subscribe to its second input until the first completes, so arguably instead of the diagram above, it would be more accurate to show this:
最后一个示例揭示了一个常被弹珠图忽略的细节:弹珠图通常只展示源何时启动、何时产生值、何时完成,却忽略了可观察源需要订阅者才能实际生成数据项这一事实。如果没有任何订阅者对IObservable<T>
进行订阅,那么它实际上不会产生任何数据。Concat
运算符在第一个源完成前不会订阅第二个输入源,因此严格来说,相较于上面弹珠图,下面的弹珠图更能准确反映其行为:
This makes it easier to see why Concat
produces the output it does. But since hot
is a hot source here, this diagram fails to convey the fact that hot
is producing items entirely on its own schedule. In a scenario where hot
had multiple subscribers, then the earlier diagram would arguably be better because it correctly reflects every event available from hot
(regardless of however many listeners might be subscribed at any particular moment). But although this convention works for hot sources, it doesn't work for cold ones, which typically start producing items upon subscription. A source returned by Timer
produces items on a regular schedule, but that schedule starts at the instant when subscription occurs. That means that if there are multiple subscriptions, there are multiple schedules. Even if I have just a single IObservable<long>
returned by Observable.Timer
, each distinct subscriber will get items on its own schedule—subscribers receive events at a regular interval starting from whenever they happened to subscribe. So for cold observables, it typically makes sense to use the convention used by this second diagram, in which we're looking at the events received by one particular subscription to a source.
这就更容易理解为什么 Concat
会产生这样的输出结果。但在这个案例中,由于 hot
是一个热源,当前图示未能准确传达一个关键事实: hot
完全按照自己的时间线独立生成数据项。如果存在 hot
拥有多个订阅者的场景,那么前一种图示可能更合适,因为它能正确反映 hot
发出的所有事件(无论在任何特定时刻有多少订阅者在监听)。然而,这种图示惯例虽然适用于热源,却不适用于冷源——因为冷源通常只在订阅时才开始生成数据。以 Timer
返回的冷源为例:它虽然会按照固定时间间隔生成数据项,但这个时间线的起点是订阅发生的时刻。这意味着如果有多个订阅存在,就会产生多个独立的时间线。即使我们只有一个由 Observable.Timer
返回的 IObservable<long>
,每个不同的订阅者都会获得基于自身订阅时刻开始计算的时间线数据——订阅者接收事件的固定时间间隔,起点是他们各自订阅的瞬间。因此,对于冷观察对象而言,采用第二种图示惯例(即关注某个特定订阅接收的事件序列)通常更符合实际逻辑。
Most of the time we can get away with ignoring this subtlety, quietly using whichever convention suits us. To paraphrase Humpty Dumpty: when I use a marble diagram, it means just what I choose it to mean—neither more nor less. But when you're combining hot and cold sources together, there might not be one obviously best way to represent this in a marble diagram. We could even do something like this, where we describe the events that hot
represents separately from the events seen by a particular subscription to hot
.
大多数情况下,我们可以暂时忽略这一细微差别,灵活采用最适合当前场景的图示惯例。借用矮胖子的话来说:“当我使用弹珠图时,它的含义完全由我来定义——不多也不少”。但当你需要将热源(hot observables)与冷源(cold observables)结合在一起时,弹珠图中可能就没有一种绝对最优的表现方式了。我们甚至可以尝试类似这样的做法:“将热源本身的事件与某个特定订阅所观察到的事件分开描述”。
We're using a distinct 'lane' in the marble diagram to represent the events seen by a particular subscription to a source. With this technique, we can also show what would happen if you pass the same cold source into Concat
twice:
在弹珠图中,我们使用独立的“泳道”来表示某个特定订阅所观察到的事件序列。借助这种技术,我们还能展示这样一种场景:“若将同一个冷源(下图红色的源也是冷源)两次传入 Concat
运算符时会发生什么”:
This highlights the fact that, being a cold source, cold
provides items separately to each subscription. We see the same three values emerging from the same source, but at different times.
这段描述突显了冷源的核心特性:作为冷源, cold
会为每个订阅单独提供项。我们看到相同的三个值从同一个源发出,但处于不同的时间点,它们在不同的订阅时间点被生成和传递。
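A small sketch can make the "one subscription per use" point concrete. It assumes `Observable.Defer` is a reasonable way to count subscriptions to a cold source; the `subscriptions` counter is an addition for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;

int subscriptions = 0;

// A cold source: every subscription gets its own run of 1, 2, 3.
IObservable<int> cold = Observable.Defer(() =>
{
    subscriptions++;
    return Observable.Range(1, 3);
});

var received = new List<int>();
cold.Concat(cold).Subscribe(x => received.Add(x));

Console.WriteLine(string.Join(", ", received)); // 1, 2, 3, 1, 2, 3
Console.WriteLine(subscriptions);               // 2
```

The same source object is passed twice, but Concat subscribes to it twice, once for each position, so the values appear twice.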
Concatenating Multiple Sources 连接多个源
What if you wanted to concatenate more than two sequences? Concat
has an overload accepting multiple observable sequences as an array. This is annotated with the params
keyword, so you don't need to construct the array explicitly. You can just pass any number of arguments, and the C# compiler will generate the code to create the array for you. There's also an overload taking an IEnumerable<IObservable<T>>
, in case the observables you want to concatenate are already in some collection.
若需拼接两个以上的序列该怎么办?Concat
提供了一个重载方法,可直接接收一个由多个可观察序列组成的数组作为参数。由于该参数使用了 params
关键字修饰,你无需显式构造数组——“只需传递任意数量的参数,C# 编译器会自动生成创建数组的代码”。此外,还有一个接受 IEnumerable<IObservable<T>>
的重载版本,适用于待拼接的可观察序列已存在于某个集合中的情况。
public static IObservable<TSource> Concat<TSource>(
    params IObservable<TSource>[] sources)

public static IObservable<TSource> Concat<TSource>(
    this IEnumerable<IObservable<TSource>> sources)
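As a quick illustration of these two overloads, the sketch below concatenates the same three sources both ways. The source values and variable names are arbitrary choices for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;

IObservable<int> a = Observable.Range(1, 2);
IObservable<int> b = Observable.Range(10, 2);
IObservable<int> c = Observable.Range(100, 2);

// params overload: the compiler builds the array for us.
var viaParams = new List<int>();
Observable.Concat(a, b, c).Subscribe(x => viaParams.Add(x));

// IEnumerable<IObservable<T>> overload: the sources are already in a collection.
var sources = new List<IObservable<int>> { a, b, c };
var viaEnumerable = new List<int>();
sources.Concat().Subscribe(x => viaEnumerable.Add(x));

Console.WriteLine(string.Join(", ", viaParams));     // 1, 2, 10, 11, 100, 101
Console.WriteLine(string.Join(", ", viaEnumerable)); // 1, 2, 10, 11, 100, 101
```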
The IEnumerable<IObservable<T>>
overload evaluates sources
lazily. It won't begin to ask it for source observables until someone subscribes to the observable that Concat
returns, and it only calls MoveNext
again on the resulting IEnumerator<IObservable<T>>
when the current source completes, meaning it's ready to start on the next. To illustrate this, the following example is an iterator method that returns a sequence of sequences and is sprinkled with logging. It returns three observable sequences, each with a single value: [1], [2] and [3]. Each sequence returns its value on a timer delay.
IEnumerable<IObservable<T>>
的重载方法采用延迟求值方式处理数据源。它不会主动请求源可观察对象,直到有订阅者订阅了 Concat
方法返回的可观察对象。并且它只会在当前数据源完成(意味着可以开始处理下一个数据源)时,才会再次调用结果 IEnumerable<IObservable<T>>
的 MoveNext
方法。为了说明这一点,以下示例展示了一个返回嵌套序列的迭代器方法(添加了日志记录)。该方法返回三个可观察序列,每个序列分别包含单个值 [1]、[2] 和 [3]。每个序列都会通过定时器延迟返回其值。
public IEnumerable<IObservable<long>> GetSequences()
{
    Console.WriteLine("GetSequences() called");
    Console.WriteLine("Yield 1st sequence");

    yield return Observable.Create<long>(o =>
    {
        Console.WriteLine("1st subscribed to");
        return Observable.Timer(TimeSpan.FromMilliseconds(500))
            .Select(i => 1L)
            .Finally(() => Console.WriteLine("1st finished"))
            .Subscribe(o);
    });

    Console.WriteLine("Yield 2nd sequence");

    yield return Observable.Create<long>(o =>
    {
        Console.WriteLine("2nd subscribed to");
        return Observable.Timer(TimeSpan.FromMilliseconds(300))
            .Select(i => 2L)
            .Finally(() => Console.WriteLine("2nd finished"))
            .Subscribe(o);
    });

    Thread.Sleep(1000); // Force a delay

    Console.WriteLine("Yield 3rd sequence");

    yield return Observable.Create<long>(o =>
    {
        Console.WriteLine("3rd subscribed to");
        return Observable.Timer(TimeSpan.FromMilliseconds(100))
            .Select(i => 3L)
            .Finally(() => Console.WriteLine("3rd finished"))
            .Subscribe(o);
    });

    Console.WriteLine("GetSequences() complete");
}
We can call this GetSequences
method and pass the results to Concat
, and then use our Dump
extension method to watch what happens:
我们可以调用这个 GetSequences
方法,并将结果传递给 Concat
,然后使用我们的 Dump
扩展方法来观察发生了什么:
GetSequences().Concat().Dump("Concat");
Here's the output:
以下是输出:
GetSequences() called
Yield 1st sequence
1st subscribed to
Concat-->1
1st finished
Yield 2nd sequence
2nd subscribed to
Concat-->2
2nd finished
Yield 3rd sequence
3rd subscribed to
Concat-->3
3rd finished
GetSequences() complete
Concat completed
Below is a marble diagram of the Concat
operator applied to the GetSequences
method. 's1', 's2' and 's3' represent sequences 1, 2 and 3 respectively, and 'rs' represents the result sequence.
下面是应用于 GetSequences
方法的 Concat
运算符的弹珠图。's1'、's2' 和 's3' 分别表示序列 1、2 和 3。相应地,'rs' 表示结果序列。
You should note that once the iterator has executed its first yield return
to return the first sequence, the iterator does not continue until the first sequence has completed. The iterator calls Console.WriteLine
to display the text Yield 2nd sequence
immediately after that first yield return
, but you can see that message doesn't appear in the output until after we see the Concat-->1
message showing the first output from Concat
, and also the 1st finished
message, produced by the Finally
operator, which runs only after that first sequence has completed. (The code also makes that first source delay for 500ms before producing its value, so that if you run this, you can see that everything stops for a bit until that first source produces its single value then completes.) Once the first source completes, the GetSequences
method continues (because Concat
will ask it for the next item once the first observable source completes). When GetSequences
provides the second sequence with another yield return
, Concat
subscribes to that, and again GetSequences
makes no further progress until that second observable sequence completes. When asked for the third sequence, the iterator itself waits for a second before producing that third and final value, which you can see from the gap between the end of s2
and the start of s3
in the diagram.
需注意:迭代器首次执行 yield return
返回第一个序列后,将暂停执行直到该序列完成。虽然在第一个 yield return
之后立即调用了 Console.WriteLine
显示" Yield 2nd sequence
"信息,但实际输出中该信息直到出现 Concat-->1
(显示 Concat
首个输出)和 1st finished
(由 Finally
运算符产生)后才显示。这是因为 Finally
运算符仅在首个序列完成后运行(代码中还让首个源在产生值前延迟500毫秒,因此运行时可以观察到所有操作都会暂停,直到该源生成单个值并完成)。当首个源完成后, GetSequences
方法才会继续执行(因为 Concat
会在首个可观察源完成后才请求下一项)。当 GetSequences
通过第二个 yield return
提供第二个序列时, Concat
订阅该序列,此时迭代器再次暂停直到第二个可观察序列完成。当请求第三个序列时,迭代器本身会等待1秒才生成第三个最终值,这从图中 s2
结束与 s3
开始之间的间隔可以看出。
Prepend
There's one particular scenario that Concat
supports, but in a slightly cumbersome way. It can sometimes be useful to make a sequence that always emits some initial value immediately. Take the example I've been using a lot in this book, where ships transmit AIS messages to report their location and other information: in some applications you might not want to wait until the ship next happens to transmit a message. You could imagine an application that records the last known location of any vessel. This would make it possible for the application to offer, say, an IObservable<IVesselNavigation>
which instantly reports the last known information upon subscription, and which then goes on to supply any newer messages if the vessel produces any.
虽然 Concat
支持某种特殊场景,但其支持方式略显笨拙。有时我们需要创建能立即发出初始值的序列。以本书多次使用的船舶AIS(自动识别系统)消息传输为例:某些应用场景中,开发者可能不希望被动等待船舶下一次发送消息。设想一个记录所有船只最后已知位置的应用,该应用可以提供 IObservable<IVesselNavigation>
序列——该序列在订阅时立即上报最后已知信息,若船舶后续有新消息产生则继续推送更新。
How would we implement this? We want initially cold-source-like behaviour, but transitioning into hot. So we could just concatenate two sources. We could use Observable.Return
to create a single-element cold source, and then concatenate that with the live stream:
我们应该如何实现这个需求?我们需要创建初始具有冷源特性、但随后能过渡到热源行为的可观察序列。解决方案是连接两个数据源——首先使用 Observable.Return
创建包含单个元素的冷源,然后将其与实时流进行连接:
IVesselNavigation lastKnown = ais.GetLastReportedNavigationForVessel(mmsi);
IObservable<IVesselNavigation> live = ais.GetNavigationMessagesForVessel(mmsi);

IObservable<IVesselNavigation> lastKnownThenLive = Observable.Concat(
    Observable.Return(lastKnown), live);
This is a common enough requirement that Rx supplies Prepend
that has a similar effect. We can replace the final line with:
由于这是Rx中常见的需求,框架直接提供了具有类似效果的 Prepend
运算符。我们可以将最后一行代码替换为:
IObservable<IVesselNavigation> lastKnownThenLive = live.Prepend(lastKnown);
This observable will do exactly the same thing: subscribers will immediately receive the lastKnown
, and then if the vessel should emit further navigation messages, they will receive those too. By the way, for this scenario you'd probably also want to ensure that the look up of the "last known" message happens as late as possible. We can delay this until the point of subscription by using Defer
:
该可观察序列将实现完全相同的功能——订阅者会立即收到 lastKnown
值,若船舶后续发出新的导航消息,也会继续接收更新。需要特别说明的是,在此场景中通常还需要确保"最后已知"消息的查询尽可能延迟。我们可以通过Defer
运算符将查询操作推迟到订阅发生时:
public static IObservable<IVesselNavigation> GetLastKnownAndSubsequenceNavigationForVessel(uint mmsi)
{
    return Observable.Defer<IVesselNavigation>(() =>
    {
        // This lambda will run each time someone subscribes.
        IVesselNavigation lastKnown = ais.GetLastReportedNavigationForVessel(mmsi);
        IObservable<IVesselNavigation> live = ais.GetNavigationMessagesForVessel(mmsi);
        return live.Prepend(lastKnown);
    });
}
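To see why deferring the lookup matters, here is a self-contained sketch that swaps the AIS types for simple stand-ins. The `lastKnown` string variable and the `live` subject are hypothetical substitutes for the vessel lookup and the live message feed; they are not part of any AIS API.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

string lastKnown = "position #1"; // Hypothetical stand-in for the last-known lookup.
var live = new Subject<string>(); // Hypothetical stand-in for the live feed.

// The lambda runs once per subscription, so lastKnown is read as late as possible.
IObservable<string> lastKnownThenLive =
    Observable.Defer<string>(() => live.Prepend(lastKnown));

var early = new List<string>();
using IDisposable sub1 = lastKnownThenLive.Subscribe(x => early.Add(x));

// A new report arrives: the cache is updated and the live feed publishes it.
lastKnown = "position #2";
live.OnNext("position #2");

var late = new List<string>();
using IDisposable sub2 = lastKnownThenLive.Subscribe(x => late.Add(x));

live.OnNext("position #3");

Console.WriteLine(string.Join(", ", early)); // position #1, position #2, position #3
Console.WriteLine(string.Join(", ", late));  // position #2, position #3
```

Without `Defer`, the value passed to `Prepend` would be captured once when the observable is built, so the late subscriber would start with the stale "position #1".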
StartWith
might remind you of BehaviorSubject<T>
, because that also ensures that consumers receive a value as soon as they subscribe. It's not quite the same: BehaviorSubject<T>
caches the last value its own source emits. You might think that would make it a better way to implement this vessel navigation example. However, since this example is able to return a source for any vessel (the mmsi
argument is a Maritime Mobile Service Identity uniquely identifying a vessel) it would need to keep a BehaviorSubject<T>
running for every single vessel you were interested in, which might be impractical.
StartWith
运算符可能让您联想到 BehaviorSubject<T>
,因为两者都能确保消费者订阅时立即获得值。但存在本质区别: BehaviorSubject<T>
会缓存其自身源发出的最新值。您可能认为这更适合实现船舶导航示例,然而由于该示例需要为任意一个船舶(mmsi
参数是一个海上移动服务身份,用于唯一标识一艘船舶)返回相应的数据源,若采用 BehaviorSubject<T>
则需为每个关注船舶维护运行中的实例,这在实践中可能不可行。
BehaviorSubject<T>
can hold onto only one value, which is fine for this AIS scenario, and Prepend
shares this limitation. But what if you need a source to begin with some particular sequence?
BehaviorSubject<T>
仅能保存单个值,这对于AIS场景来说是可接受的,而 Prepend
运算符也存在相同限制。但如果需要数据源以一个特定序列开始该怎么办?
StartWith
StartWith
is a generalization of Prepend
that enables us to provide any number of values to emit immediately upon subscription. As with Prepend
, it will then go on to forward any further notifications that emerge from the source.
StartWith
是 Prepend
的泛化版本,它允许我们在订阅时立即发出任意数量的初始值。与 Prepend
一样,它将继续转发源序列的所有后续通知。
As you can see from its signature, this method takes a params
array of values so you can pass in as many or as few values as you need:
如方法签名所示,该方法接受 params
修饰的值数组参数,您可按需传递任意数量参数:
// Prefixes a sequence of values to an observable sequence.
public static IObservable<TSource> StartWith<TSource>(
    this IObservable<TSource> source,
    params TSource[] values)
There's also an overload that accepts an IEnumerable<T>
. Note that Rx will not defer its enumeration of this. StartWith
immediately converts the IEnumerable<T>
into an array before returning.
此外还存在接受 IEnumerable<T>
的重载版本。需特别注意:Rx框架不会延迟枚举操作,StartWith
会在返回前立即将 IEnumerable<T>
转换为数组。
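A short sketch of StartWith in use follows. The subject and the particular values are arbitrary; the point is only that the supplied values arrive on subscription, before the source has produced anything.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

var source = new Subject<int>();
var received = new List<int>();

using IDisposable sub = source
    .StartWith(-3, -2, -1)
    .Subscribe(x => received.Add(x));

// The three initial values have been delivered before the source emits anything.
int countBeforeSourceEmits = received.Count;

source.OnNext(0);
source.OnNext(1);

Console.WriteLine(countBeforeSourceEmits);      // 3
Console.WriteLine(string.Join(", ", received)); // -3, -2, -1, 0, 1
```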
StartWith
is not a common LINQ operator, and its existence is peculiar to Rx. If you imagine what StartWith
would look like in LINQ to Objects, it would not be meaningfully different from Concat
. There's a difference in Rx because StartWith
effectively bridges between pull and push worlds. It converts the items we supply into an observable, and it then concatenates the source
argument onto that.
StartWith
并非LINQ的常规运算符,其存在是Rx框架特有的设计。若设想LINQ to Objects中的StartWith
实现,其功能与 Concat
运算符并无实质差异。但在Rx中,StartWith
的特殊性在于它有效地在拉取和推送世界之间建立了桥梁——实际上将我们提供的项转换为可观察序列,然后将源参数连接到该序列之后。
Append
The existence of Prepend
might lead you to wonder whether there is an Append
for adding a single item onto the end of any IObservable<T>
. After all, this is a common LINQ operator; LINQ to Objects has an Append
implementation, for example. And Rx does indeed supply such a thing:
Prepend
运算符的存在自然会引发疑问:是否也存在对应的 Append
运算符,用于在IObservable<T>序列末尾添加单个项?毕竟这是LINQ的常规操作——例如LINQ to Objects就内置了 Append
实现。事实上,Rx框架确实提供了这样的运算符:
IObservable<string> oneMore = arguments.Append("And another thing...");
There is no corresponding EndWith
. There's no fundamental reason that there couldn't be such a thing; it's just that apparently there's not much demand: the Rx repository has not yet had a feature request. So although the symmetry of Prepend
and Append
does suggest that there could be a similar symmetry between StartWith
and an as-yet-hypothetical EndWith
, the absence of this counterpart doesn't seem to have caused any problems. There's an obvious value to being able to create observable sources that always immediately produce a useful output; it's not clear what EndWith
would be useful for, besides satisfying a craving for symmetry.
目前Rx框架中不存在对应的 EndWith
运算符。这并非出于技术限制(理论上完全可以实现),而是由于实际需求不足——Rx代码库至今未收到相关功能请求。尽管 Prepend
与 Append
的对称性暗示 StartWith
理应存在类似对称运算符,但这一对应项的缺失似乎并未引发实际问题。能够创建立即产出有效输出的可观察源具有明确价值;而 EndWith
除了满足对称性追求外,其实际应用场景尚不明确。
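For completeness, here is a sketch of Append, followed by one way the missing multi-value counterpart could be approximated if it were ever needed. The second half is purely an illustration built from Concat and ToObservable; it is not an Rx operator.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;

// Append adds a single item after everything the source produces.
var appended = new List<string>();
Observable.Return("First point")
    .Append("And another thing...")
    .Subscribe(x => appended.Add(x));

// A hypothetical EndWith-style effect: concatenate an observable built from the extra values.
var ended = new List<int>();
Observable.Range(1, 2)
    .Concat(new[] { 98, 99 }.ToObservable())
    .Subscribe(x => ended.Add(x));

Console.WriteLine(string.Join(" | ", appended)); // First point | And another thing...
Console.WriteLine(string.Join(", ", ended));     // 1, 2, 98, 99
```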
DefaultIfEmpty
The next operator we'll examine doesn't strictly perform sequential combination. However, it's a very close relative of Append
and Prepend
. Like those operators, this will emit everything its source does. And like those operators, DefaultIfEmpty
takes one additional item. The difference is that it won't always emit that additional item.
接下来要讨论的 DefaultIfEmpty
运算符并不严格属于顺序组合操作符,但与 Append
和 Prepend
非常相似,它会转发源序列的所有项;也如它们一般, DefaultIfEmpty
会携带一个额外项。关键区别在于:该运算符仅在源序列为空时才会发射这个额外项(默认值)。
Whereas Prepend
emits its additional item at the start, and Append
emits its additional item at the end, DefaultIfEmpty
emits the additional item only if the source completes without producing anything. So this provides a way of guaranteeing that an observable will not be empty.
与 Prepend
在序列开头添加额外项、 Append
在末尾追加项不同, DefaultIfEmpty
仅当源序列完成且未产生任何项时才会发出额外项。该运算符由此提供了确保可观察序列非空的保障机制。
You don't have to supply DefaultIfEmpty
with a value. If you use the overload in which you supply no such value, it will just use default(T)
. This will be a zero-like value for struct types and null
for reference types.
使用 DefaultIfEmpty
时可不指定参数值。若调用无参重载,运算符将自动使用 default(T)
作为默认值——对于结构体类型将生成类零值,引用类型则返回 null
。
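The three cases described above (empty source with a supplied value, non-empty source, and empty source with no supplied value) can be sketched as follows. The value 42 is an arbitrary choice for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;

// Empty source: the supplied value is emitted.
var fromEmpty = new List<int>();
Observable.Empty<int>().DefaultIfEmpty(42).Subscribe(x => fromEmpty.Add(x));

// Non-empty source: the supplied value is never used.
var fromNonEmpty = new List<int>();
Observable.Range(1, 3).DefaultIfEmpty(42).Subscribe(x => fromNonEmpty.Add(x));

// No value supplied: default(int), which is 0, is emitted.
var noArgument = new List<int>();
Observable.Empty<int>().DefaultIfEmpty().Subscribe(x => noArgument.Add(x));

Console.WriteLine(string.Join(", ", fromEmpty));    // 42
Console.WriteLine(string.Join(", ", fromNonEmpty)); // 1, 2, 3
Console.WriteLine(string.Join(", ", noArgument));   // 0
```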
Repeat
The final operator that combines sequences sequentially is Repeat
. It allows you to simply repeat a sequence. It offers overloads where you can specify the number of times to repeat the input, and one that repeats infinitely:
最后一个按顺序组合序列的操作符是 Repeat
。它允许你简单地重复一个序列。它提供了多个重载版本,你可以在其中指定重复输入的次数,以及实现无限循环的无限重载:
// Repeats the observable sequence a specified number of times.
public static IObservable<TSource> Repeat<TSource>(
    this IObservable<TSource> source,
    int repeatCount)

// Repeats the observable sequence indefinitely and sequentially.
public static IObservable<TSource> Repeat<TSource>(
    this IObservable<TSource> source)
Repeat
resubscribes to the source for each repetition. This means that this will only strictly repeat if the source produces the same items each time you subscribe. Unlike the ReplaySubject<T>
, this doesn't store and replay the items that emerge from the source. This means that you normally won't want to call Repeat
on a hot source. (If you really want repetition of the output of a hot source, a combination of Replay
and Repeat
might fit the bill.)
Repeat
运算符通过重新订阅源序列实现循环逻辑。这意味着只有当源在每次订阅时都能生成相同项时,才能保证严格意义上的重复。与 ReplaySubject<T>
不同,Repeat
不会缓存并重放源序列的项。因此通常不建议在热源上使用 Repeat
运算符(若确实需要重复热源输出,可结合 Replay
与 Repeat
运算符实现)。
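The fact that Repeat resubscribes rather than replays becomes visible when the source does not produce the same items each time. The following sketch assumes a deliberately contrived source, built with `Observable.Defer`, whose values depend on how many times it has been subscribed to.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;

int subscriptions = 0;

// Each subscription produces different values, so a true replay would look different.
IObservable<int> source = Observable.Defer(() =>
{
    subscriptions++;
    return Observable.Range(subscriptions * 10, 2);
});

var received = new List<int>();
source.Repeat(3).Subscribe(x => received.Add(x));

Console.WriteLine(subscriptions);               // 3
Console.WriteLine(string.Join(", ", received)); // 10, 11, 20, 21, 30, 31
```

Had Repeat stored and replayed the first run, the output would have been 10, 11 three times over.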
If you use the overload that repeats indefinitely, then the only way the sequence will stop is if there is an error or the subscription is disposed of. The overload that specifies a repeat count will stop on error, un-subscription, or when it reaches that count. This example shows the sequence [0,1,2] being repeated three times.
若使用无限循环重载,序列仅会在发生错误或订阅被释放时终止。而指定重复次数的重载版本则会在以下三种情况之一终止:遇到错误、取消订阅或达到指定重复次数。以下示例展示了序列[0,1,2]被重复三次的执行过程:
var source = Observable.Range(0, 3);
var result = source.Repeat(3);

result.Subscribe(
    Console.WriteLine,
    () => Console.WriteLine("Completed"));
Output:
输出:
0
1
2
0
1
2
0
1
2
Completed
Concurrent sequences 并发序列
We'll now move on to operators for combining observable sequences that might produce values concurrently.
接下来我们将转向用于组合可能并发产生值的可观察序列的运算符。
Amb
Amb
is a strangely named operator. It's short for ambiguous, but that doesn't tell us much more than Amb
. If you're curious about the name you can read about the origins of Amb
in Appendix D, but for now, let's look at what it actually does. Rx's Amb
takes any number of IObservable<T>
sources as inputs, and waits to see which, if any, first produces some sort of output. As soon as this happens, it immediately unsubscribes from all of the other sources, and forwards all notifications from the source that reacted first.
Amb
是一个命名奇特的运算符(名称源自"ambiguous/模糊选择",具体渊源可参考附录D)。该运算符接收多个 IObservable<T>
源作为输入,等待并观察哪个源首先产生输出。一旦检测到某个源率先响应,立即取消订阅所有其他源,并持续转发该胜出源的所有通知。
Why is that useful?
这有什么用处呢?
A common use case for Amb
is when you want to produce some sort of result as quickly as possible, and you have multiple options for obtaining that result, but you don't know in advance which will be fastest. Perhaps there are multiple servers that could all potentially give you the answer you want, and it's impossible to predict which will have the lowest response time. You could send requests to all of them, and then just use the first to respond. If you model each individual request as its own IObservable<T>
, Amb
can handle this for you. Note that this isn't very efficient: you're asking several servers all to do the same work, and you're going to discard the results from most of them. (Since Amb
unsubscribes from all the sources it's not going to use as soon as the first reacts, it's possible that you might be able to send a message to all the other servers to cancel the request. But this is still somewhat wasteful.) But there may be scenarios in which timeliness is crucial, and for those cases it might be worth tolerating a bit of wasted effort to produce faster results.
Amb
的一个常见使用场景是:当你希望尽可能快地生成某种结果,并且有多个可获取该结果的选项,但事先无法预知哪个选项会最快。例如,可能存在多个服务器都能提供你所需的答案,但无法预测哪个服务器的响应时间最短。这时你可以向所有服务器发送请求,然后只使用第一个响应。如果将每个单独的请求建模为各自的 IObservable<T>
, Amb
可以为你处理这种情况。需要注意的是,这种方式效率不高:你让多个服务器都执行相同的工作,却会丢弃其中大部分的结果。(由于 Amb
在第一个响应出现后就会取消订阅所有不再需要的数据源,理论上你可以向其他所有服务器发送消息取消请求。但这仍然存在一定程度的资源浪费。)不过在某些时效性至关重要的场景中,为了更快地获得结果,容忍少量资源浪费可能是值得的。
Amb
is broadly similar to Task.WhenAny
, in that it lets you detect when the first of multiple sources does something. However, the analogy is not precise. Amb
automatically unsubscribes from all of the other sources, ensuring that everything is cleaned up. With Task
you should always ensure that you eventually observe all tasks in case any of them faulted.
Amb
与 Task.WhenAny
大体相似,因为它允许你检测多个数据源中第一个触发动作的源。然而,这种类比并不完全准确。 Amb
会自动取消订阅所有其他数据源,确保所有资源都被清理干净。而使用 Task
时,你必须始终确保最终处理所有任务(例如观察任务结果或处理异常),以防其中某些任务出现故障。
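The automatic clean-up can be checked with subjects, whose `HasObservers` property reports whether anything is currently subscribed. This is a minimal sketch; using `HasObservers` as the probe is a choice made for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

var s1 = new Subject<int>();
var s2 = new Subject<int>();
var received = new List<int>();

using IDisposable sub = Observable.Amb(s1, s2).Subscribe(x => received.Add(x));

// Until something reacts, Amb is subscribed to every source.
bool bothWatched = s1.HasObservers && s2.HasObservers;

s2.OnNext(99); // s2 reacts first, so Amb commits to it.

// Amb has already unsubscribed from the source that lost.
bool s1StillWatched = s1.HasObservers;

s1.OnNext(1);  // Nobody is listening to s1 any more.
s2.OnNext(88);

Console.WriteLine(bothWatched);                 // True
Console.WriteLine(s1StillWatched);              // False
Console.WriteLine(string.Join(", ", received)); // 99, 88
```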
To illustrate Amb
's behaviour, here's a marble diagram showing three sequences, s1
, s2
, and s3
, each able to produce a sequence of values. The line labelled r
shows the result of passing all three sequences into Amb
. As you can see, r
provides exactly the same notifications as s1
, because in this example, s1
was the first sequence to produce a value.
为了说明 Amb
的行为,这里展示一个弹珠图:三个序列 s1、s2 和 s3 各自能够生成一系列值。标有 r 的线条表示将这三个序列传入 Amb
后的结果。如你所见,r 提供与 s1 完全相同的通知,因为在此示例中,s1 是第一个生成值的序列。
This code creates exactly the situation described in that marble diagram, to verify that this is indeed how Amb
behaves:
以下代码精确复现了弹珠图所描述的情境,以验证 Amb
的行为确实如此:
var s1 = new Subject<int>();
var s2 = new Subject<int>();
var s3 = new Subject<int>();

var result = Observable.Amb(s1, s2, s3);

result.Subscribe(
    Console.WriteLine,
    () => Console.WriteLine("Completed"));

s1.OnNext(1);
s2.OnNext(99);
s3.OnNext(8);
s1.OnNext(2);
s2.OnNext(88);
s3.OnNext(7);
s2.OnCompleted();
s1.OnNext(3);
s3.OnNext(6);
s1.OnNext(4);
s1.OnCompleted();
s3.OnCompleted();
Output:
输出:
1
2
3
4
Completed
If we changed the order so that s2.OnNext(99)
came before the call to s1.OnNext(1);
then s2 would produce values first and the marble diagram would look like this.
如果我们调整顺序,让 s2.OnNext(99)
在 s1.OnNext(1)
之前被调用,那么 s2 将率先生成值,此时弹珠图的呈现形态会变为如下所示:
There are a few overloads of Amb
. The preceding example used the overload that takes a params
array of sequences. There's also an overload that takes exactly two sources, avoiding the array allocation that occurs with params
. Finally, you could pass in an IEnumerable<IObservable<T>>
. (Note that there are no overloads that take an IObservable<IObservable<T>>
. Amb
requires all of the source observables it monitors to be supplied up front.)
Amb
提供了多个重载版本。前面的示例使用了接受 params
数组形式序列的重载。此外,还有一个只接受两个输入源的重载,可避免 params
参数带来的数组分配开销。最后,你还可以传入一个 IEnumerable<IObservable<T>>
集合。(注意:没有接受 IObservable<IObservable<T>>
的重载。 Amb
要求所有被监控的数据源必须预先全部提供。)
// Propagates the observable sequence that reacts first.
public static IObservable<TSource> Amb<TSource>(
    this IObservable<TSource> first,
    IObservable<TSource> second) {...}

public static IObservable<TSource> Amb<TSource>(
    params IObservable<TSource>[] sources) {...}

public static IObservable<TSource> Amb<TSource>(
    this IEnumerable<IObservable<TSource>> sources) {...}
Reusing the GetSequences
method from the Concat
section, we see that Amb
evaluates the outer (IEnumerable) sequence completely before subscribing to any of the sequences it returns.
复用来自Concat
部分的 GetSequences
方法,我们可以观察到: Amb
在订阅其返回的序列中的任何一个之前,会先完全评估外层(IEnumerable)序列。
GetSequences().Amb().Dump("Amb");
Output:
输出:
GetSequences() called
Yield 1st sequence
Yield 2nd sequence
Yield 3rd sequence
GetSequences() complete
1st subscribed to
2nd subscribed to
3rd subscribed to
Amb-->3
Amb completed
Here is the marble diagram illustrating how this code behaves:
以下是弹珠图,展示了这段代码的行为:
Remember that GetSequences
produces its first two observables as soon as it is asked for them, and then waits for 1 second before producing the third and final one. But unlike Concat
, Amb
won't subscribe to any of its sources until it has retrieved all of them from the iterator, which is why this marble diagram shows the subscriptions to all three sources starting after 1 second. (The first two sources were available earlier—Amb
would have started enumerating the sources as soon as subscription occurred, but it waited until it had all three before subscribing, which is why they all appear over on the right.) The third sequence has the shortest delay between subscription and producing its value, so although it's the last observable returned, it is able to produce its value the fastest even though there are two sequences yielded one second before it (due to the Thread.Sleep
).
需要记住的是, GetSequences
方法在被调用时会立即生成前两个 Observable,随后等待 1 秒才会生成第三个(即最后一个)Observable。但与 Concat
不同, Amb
在从迭代器中获取所有数据源之前,不会订阅其中的任何一个。这解释了为何弹珠图显示所有三个数据源的订阅均在 1 秒后才开始。(前两个数据源本可以更早被获取 —— Amb
在订阅发生时就会立即开始枚举数据源,但它会等待所有三个数据源就绪后才进行订阅,因此它们在图中均出现在右侧。)第三个序列在订阅到生成值之间的延迟最短,因此尽管它是最后返回的 Observable,却能够比其他两个序列(即使它们早 1 秒被生成,因 Thread.Sleep
的存在)更快地产生值。
Merge
The Merge
extension method takes multiple sequences as its input. Any time any of those input sequences produces a value, the observable returned by Merge
produces that same value. If the input sequences produce values at the same time on different threads, Merge
handles this safely, ensuring that it delivers items one at a time.
Merge
扩展方法接收多个序列作为输入。每当其中任意一个输入序列生成值时,由 Merge
返回的 Observable 也会生成相同的值。即使这些输入序列在不同线程上同时生成值, Merge
也能够安全处理,确保每次只传递一个数据项。
Since Merge
returns a single observable sequence that includes all of the values from all of its input sequences, there's a sense in which it is similar to Concat
. But whereas Concat
waits until each input sequence completes before moving onto the next, Merge
supports concurrently active sequences. As soon as you subscribe to the observable returned by Merge
, it immediately subscribes to all of its inputs, forwarding everything any of them produces. This marble diagram shows two sequences, s1
and s2
, running concurrently and r
shows the effect of combining these with Merge
: the values from both source sequences emerge from the merged sequence.
由于 Merge
返回一个单一的可观察序列,该序列包含所有输入序列的所有值,因此在某种意义上,它与 Concat
相似。但是, Concat
会等待当前输入序列完成后再处理下一个,而Merge
支持并发活动的序列。一旦你订阅了 Merge
返回的可观察对象,它就会立即订阅其所有输入源,并转发它们产生的所有内容。这个弹珠图显示了两个序列 s1 和 s2 同时运行,而 r 则显示了使用 Merge
将它们合并的效果:合并后的序列会发出两个源序列中的值。
The result of a Merge
will complete only once all input sequences complete. However, the Merge
operator will error if any of the input sequences terminates erroneously (at which point it will unsubscribe from all its other inputs).
Merge
运算符返回的结果序列会在所有输入序列完成后才完成。然而,如果任何一个输入序列因错误终止(此时它会取消订阅所有其他输入源),则 Merge
运算符会立即传播该错误。
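The completion and error rules described above can be checked with a small sketch. It assumes the System.Reactive NuGet package is referenced; the class name MergeTerminationDemo, its Run method, and the subjects s1 and s2 are illustrative names that do not appear in the original text. Subjects are used so that every notification is delivered synchronously and the order is deterministic.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

// Hypothetical demo class: records what an observer of s1.Merge(s2) sees.
public static class MergeTerminationDemo
{
    public static List<string> Run(bool failSecondSource)
    {
        var log = new List<string>();
        var s1 = new Subject<int>();
        var s2 = new Subject<int>();

        using var subscription = s1.Merge(s2).Subscribe(
            x => log.Add($"OnNext({x})"),
            ex => log.Add($"OnError({ex.Message})"),
            () => log.Add("OnCompleted"));

        s1.OnNext(1);
        s2.OnNext(10);

        if (failSecondSource)
        {
            // An error from any input terminates the merged sequence at once.
            s2.OnError(new InvalidOperationException("s2 failed"));
            s1.OnNext(2); // never reaches the observer
        }
        else
        {
            s1.OnCompleted(); // not enough on its own: s2 is still active
            s2.OnNext(20);
            s2.OnCompleted(); // every input has completed, so Merge completes
        }

        return log;
    }
}
```

Calling MergeTerminationDemo.Run(false) should yield OnNext(1), OnNext(10), OnNext(20), OnCompleted, while Run(true) should stop at OnError(s2 failed) without ever reporting the value 2.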
If you read the Creating Observables chapter, you've already seen one example of Merge
. I used it to combine the individual sequences representing the various events provided by a FileSystemWatcher
into a single stream at the end of the 'Representing Filesystem Events in Rx' section. As another example, let's look at AIS once again. There is no publicly available single global source that can provide all AIS messages across the entire globe as an IObservable<IAisMessage>
. Any single source is likely to cover just one area, or maybe even just a single AIS receiver. With Merge
, it's straightforward to combine these into a single source:
如果你阅读过《创建可观察对象》的章节,那么你已经见过一个 Merge
的示例。在“在 Rx 中表示文件系统事件”一节的最后,我使用 Merge
将表示由 FileSystemWatcher
提供的各种事件的独立序列合并为一个统一的流。再举一个例子,让我们再次考虑 AIS(自动识别系统)。目前没有公开可用的单一全球数据源能够以 IObservable<IAisMessage>
的形式提供全球范围内的所有 AIS 消息。任何单一数据源可能仅覆盖某一区域,甚至仅来自单个 AIS 接收器。利用 Merge
,可以轻松将它们合并为一个单一的数据源:
IObservable<IAisMessage> station1 = aisStations.GetMessagesFromStation("AdurStation");
IObservable<IAisMessage> station2 = aisStations.GetMessagesFromStation("EastbourneStation");
IObservable<IAisMessage> allMessages = station1.Merge(station2);
If you want to combine more than two sources, you have a few options:
如果你想合并超过两个源,你有几个选项:
- Chain Merge operators together, e.g. s1.Merge(s2).Merge(s3)
  将 Merge 运算符链接在一起,例如 s1.Merge(s2).Merge(s3)
- Pass a params array of sequences to the Observable.Merge static method, e.g. Observable.Merge(s1, s2, s3)
  将序列的 params 数组传递给 Observable.Merge 静态方法,例如 Observable.Merge(s1, s2, s3)
- Apply the Merge operator to an IEnumerable<IObservable<T>>.
  对 IEnumerable<IObservable<T>> 应用 Merge 操作符
- Apply the Merge operator to an IObservable<IObservable<T>>.
  对 IObservable<IObservable<T>> 应用 Merge 操作符
The overloads look like this:
Merge
的重载版本如下所示:
// Merges two observable sequences into a single observable sequence.
// Returns a sequence that merges the elements of the given sequences.
public static IObservable<TSource> Merge<TSource>(
    this IObservable<TSource> first,
    IObservable<TSource> second)
{...}

// Merges all the observable sequences into a single observable sequence.
// The observable sequence that merges the elements of the observable sequences.
public static IObservable<TSource> Merge<TSource>(
    params IObservable<TSource>[] sources)
{...}

// Merges an enumerable sequence of observable sequences into a single observable sequence.
public static IObservable<TSource> Merge<TSource>(
    this IEnumerable<IObservable<TSource>> sources)
{...}

// Merges an observable sequence of observable sequences into an observable sequence.
// Merges all the elements of the inner sequences in to the output sequence.
public static IObservable<TSource> Merge<TSource>(
    this IObservable<IObservable<TSource>> sources)
{...}
As the number of sources being merged goes up, the operators that take collections have an advantage over the first overload. (I.e., s1.Merge(s2).Merge(s3) performs slightly less well than Observable.Merge(new[] { s1, s2, s3 }), or the equivalent Observable.Merge(s1, s2, s3).) However, for just three or four, the differences are small, so in practice you can choose between the first two overloads as a matter of your preferred style. (If you're merging 100 sources or more the differences are more pronounced, but by that stage, you probably wouldn't want to use the chained call style anyway.) The third and fourth overloads allow you to merge sequences that can be evaluated lazily at run time.
随着要合并的源数量的增加,接受集合作为参数的重载版本相比第一个重载(链式调用)更具性能优势。(即,s1.Merge(s2).Merge(s3)
的性能略逊于 Observable.Merge(new[] { s1, s2, s3 })
或等效的 Observable.Merge(s1, s2, s3)
。)然而,对于只有 3~4 数据源的情况,性能差异很小,因此在实际使用中,你可以根据个人喜好的风格在前两个重载之间进行选择。(如果你正在合并 100 个或更多的源,性能差异会更显著,但此时链式调用的方式本身也不适用。)第三个和第四个重载允许你在运行时懒加载地合并序列。
That last Merge
overload that takes a sequence of sequences is particularly interesting, because it makes it possible for the set of sources being merged to grow over time. Merge
will remain subscribed to sources
for as long as your code remains subscribed to the IObservable<T>
that Merge
returns. So if sources
emits more and more IObservable<T>
s over time, these will all be included by Merge
.
最后一个接受嵌套序列作为参数的 Merge
重载尤为有趣,因为它使得被合并的源集合能够随时间动态扩展。只要你的代码保持订阅由 Merge
返回的 IObservable<T>
, Merge
就会持续订阅这些数据源 sources
。因此,如果 sources
(即外层IObservable<IObservable<TSource>>)持续发出新的 IObservable<T>
,所有这些内部序列都会被 Merge
动态纳入合并范围。
That might sound familiar. The SelectMany operator is also able to flatten multiple observable sources back out into a single observable source. This is just another illustration of why I've described SelectMany as a fundamental operator in Rx: strictly speaking we don't need a lot of the operators that Rx gives us because we could build them using SelectMany. Here's a simple re-implementation of that last Merge overload using SelectMany:
这听起来可能似曾相识。SelectMany
运算符同样能够将多个可观察数据源扁平化(flatten)为单一的可观察数据源。这再次印证了为何我将 SelectMany
描述为 Rx 中的基础运算符:严格来说,我们并不需要 Rx 提供的许多其他运算符,因为可以通过 SelectMany
来构建它们。以下是一个使用 SelectMany
重新实现上述最后一个 Merge
重载的简单示例:
public static IObservable<T> MyMerge<T>(this IObservable<IObservable<T>> sources) =>
    sources.SelectMany(source => source);
As well as illustrating that we don't technically need Rx to provide that last Merge
for us, it's also a good illustration of why it's helpful that it does. It's not immediately obvious what this does. Why are we passing a lambda that just returns its argument? Unless you've seen this before, it can take some thought to work out that SelectMany
expects us to pass a callback that it invokes for each incoming item, but that our input items are already nested sequences, so we can just return each item directly, and SelectMany
will then take that and merge everything it produces into its output stream. And even if you have internalized SelectMany
so completely that you know right away that this will just flatten sources
, you'd still probably find Observable.Merge(sources)
a more direct expression of intent. (Also, since Merge
is a more specialized operator, Rx is able to provide a very slightly more efficient implementation of it than the SelectMany
version shown above.)
除了说明从技术角度而言我们并不需要 Rx 专门提供最后一个 Merge
重载外,这也很好地解释了为何 Rx 仍然提供该操作符的实际价值。上述 SelectMany
的用法并不直观,其行为意图并不一目了然。为什么我们要传递一个直接返回输入参数的 lambda 表达式?除非你之前见过这种模式,否则可能需要一番思考才能理解: SelectMany
要求我们传递一个回调函数,该函数会被调用于每个输入项,而我们的输入项本身已经是嵌套的可观察序列,因此可以直接返回每个输入项本身。随后, SelectMany
会自动将其扁平化,将所有生成的元素合并到输出流中。即便你已经完全掌握了 SelectMany
的机制,知道这样做能够将嵌套的 sources
序列扁平化,你仍然会发现 Observable.Merge(sources)
能更直接地表达意图。(此外,由于 Merge
是一个更专用的操作符,Rx 能够提供比上述 SelectMany
版本略微更高效的实现。)
If we again reuse the GetSequences
method, we can see how the Merge
operator works with a sequence of sequences.
再次复用 GetSequences
方法,我们可以观察 Merge
运算符如何处理嵌套的序列集合。
GetSequences().Merge().Dump("Merge");
Output:
输出:
GetSequences() called
Yield 1st sequence
1st subscribed to
Yield 2nd sequence
2nd subscribed to
Merge --> 2
Merge --> 1
Yield 3rd sequence
3rd subscribed to
GetSequences() complete
Merge --> 3
Merge completed
As we can see from the marble diagram, s1 and s2 are yielded and subscribed to immediately. s3 is not yielded for one second and then is subscribed to. Once all input sequences have completed, the result sequence completes.
从弹珠图中我们可以看出,s1
和 s2
立即产生并被订阅。s3
在一秒后才产生并被订阅。一旦所有输入序列都完成了,合并后的结果序列也随之完成。
For each of the Merge overloads that accept variable numbers of sources (either via an array, an IEnumerable<IObservable<T>>, or an IObservable<IObservable<T>>) there's an additional overload adding a maxConcurrent parameter. For example:
对于每个接受可变数量数据源的 Merge
重载(无论是通过数组、IEnumerable<IObservable<T>>
还是 IObservable<IObservable<T>>
传入),Rx 还提供了额外支持 maxConcurrent
参数的重载。例如:
public static IObservable<TSource> Merge<TSource>(this IEnumerable<IObservable<TSource>> sources, int maxConcurrent)
This enables you to limit the number of sources that Merge accepts inputs from at any single time. If the number of sources available exceeds maxConcurrent (either because you passed in a collection with more sources, or because you used the IObservable<IObservable<T>>-based overload and the source emitted more nested sources than maxConcurrent), Merge will wait for existing sources to complete before moving onto new ones. A maxConcurrent of 1 makes Merge behave in the same way as Concat.
通过 maxConcurrent
参数,你可以限制 Merge
在任何时刻同时接收输入的数据源数量。如果可用数据源的数量超过 maxConcurrent
设置的值(可能是由于传入的集合包含更多数据源,或是使用了基于 IObservable<IObservable<T>>
的重载且源发射了超过 maxConcurrent
数量的嵌套数据源), Merge
将等待现有数据源完成后,再处理新的数据源。特别地,当 maxConcurrent
设置为 1时, Merge
的行为将与 Concat
完全一致 —— 即所有数据源会按顺序逐一处理,而非并发执行。
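To make the queuing behaviour concrete, here is a minimal sketch of Merge with maxConcurrent set to 1. It assumes the System.Reactive NuGet package; MergeMaxConcurrentDemo, the Tracked helper, and the subjects a and b are hypothetical names introduced only for this illustration. Observable.Defer is used purely to record the moment at which Merge subscribes to each input.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

// Hypothetical demo class: shows that with maxConcurrent = 1 the second
// source is not subscribed to until the first one has completed.
public static class MergeMaxConcurrentDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        var a = new Subject<string>();
        var b = new Subject<string>();

        // Defer runs its factory at subscription time, so the log entry
        // marks exactly when Merge subscribes to the wrapped source.
        IObservable<string> Tracked(string name, IObservable<string> source) =>
            Observable.Defer(() =>
            {
                log.Add($"{name} subscribed");
                return source;
            });

        IEnumerable<IObservable<string>> sources = new[] { Tracked("a", a), Tracked("b", b) };

        using var subscription = sources
            .Merge(1) // maxConcurrent: 1
            .Subscribe(x => log.Add(x), () => log.Add("completed"));

        b.OnNext("b1");  // lost: b is still queued, and subjects do not replay
        a.OnNext("a1");
        a.OnCompleted(); // frees the single slot, so Merge now subscribes to b
        b.OnNext("b2");
        b.OnCompleted();

        return log;
    }
}
```

The returned log should read: a subscribed, a1, b subscribed, b2, completed. This is the same ordering Concat would give for these two inputs; note that the early value b1 is missed only because b is a hot subject that was not yet subscribed to.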
Switch
Rx's Switch
operator takes an IObservable<IObservable<T>>
, and produces notifications from the most recent nested observable. Each time its source produces a new nested IObservable<T>
, Switch
unsubscribes from the previous nested source (unless this is the first source, in which case there won't be a previous one) and subscribes to the latest one.
Rx的Switch
运算符接收一个IObservable<IObservable<T>>
类型的输入,并仅从最新的嵌套可观察对象中发出通知。每当其源序列产生一个新的嵌套IObservable<T>
时,Switch
会执行以下操作:
- 取消订阅前一个嵌套源(除非这是第一个源Observable,这种情况下不会有前一个,则无此步骤);
- 立即订阅最新产生的嵌套可观察对象。
Switch
can be used in a 'time to leave' type feature for a calendar application. In fact you can see the source code for a modified version of how Bing provides (or at least provided; the implementation may have changed) notifications telling you that it's time to leave for an appointment. Since that's derived from a real example, it's a little complex, so I'll describe just the essence here.
Switch
运算符可用于实现日历应用中的"该出发了"提醒功能。事实上,你可以参考Bing日历(或至少其历史版本)通知用户出发时间的源码实现的修改版本(注意:实际实现可能已变更)。由于这是基于真实案例的简化,其复杂性较高,以下仅阐述核心原理:
The basic idea with a 'time to leave' notification is that we use map and route finding services to work out the expected journey time to get to wherever the appointment is, and use the Timer operator to create an IObservable<T> that will produce a notification when it's time to leave. (Specifically this code produces an IObservable<TrafficInfo> which reports the proposed route for the journey, and expected travel time.) However, there are two things that can change, rendering the initial predicted journey time useless. First, traffic conditions can change. When the user creates their appointment, we have to guess the expected journey time based on how traffic normally flows at the time of day in question. However, if there turns out to be really bad traffic on the day, the estimate will need to be revised upwards, and we'll need to notify the user earlier.
“出发时间”通知的基本思路是:我们利用地图和路线规划服务来计算到达约定地点的预计行程时间,并使用Timer
运算符创建一个 IObservable<T>
。该可观察对象会在需要出发的时刻触发通知(具体而言,这段代码生成的是 IObservable<TrafficInfo>
,其中包含建议的行车路线和预计行程时间)。然而,有两种动态变化因素会导致初始预测的行程时间失效:
第一种是,交通状况可能发生变化。当用户创建预约时,我们只能基于该时段通常的交通流量来推测行程时间。但若实际当天出现严重交通拥堵,就需要上调预估时间并提前通知用户。
The other thing that can change is the user's location. This will also obviously affect the predicted journey time.
另一个可能发生变化的是用户的位置。这显然也会影响预计的行程时间。
To handle this, the system will need observable sources that can report changes in the user's location, and changes in traffic conditions affecting the proposed journey. Every time either of these reports a change, we will need to produce a new estimated journey time, and a new IObservable<TrafficInfo>
that will produce a notification when it's time to leave.
为了处理这个问题,系统需要接入可观察的数据源来监测以下变化:用户的位置变化,以及可能影响行程的交通状况变化。每当任一数据源检测到变化时,系统都将重新计算预估行程时间,并生成新的 IObservable<TrafficInfo>
实例 —— 该可观察对象会在最新的"出发时间"触发通知。
Every time we revise our estimate, we want to abandon the previously created IObservable<TrafficInfo>
. (Otherwise, the user will receive a bewildering number of notifications telling them to leave, one for every time we recalculated the journey time.) We just want to use the latest one. And that's exactly what Switch
does.
每次重新估算行程时间时,我们都需要弃用先前生成的 IObservable<TrafficInfo>
实例(否则用户会收到大量令人困惑的出发提示通知 —— 每次重新计算行程时间都会生成一个独立通知)。我们的核心需求是始终采用最新生成的实例。而这正是 Switch
运算符的职责所在—— 它能自动切换至最新的可观察数据流,确保只响应最新的行程计算结果。
You can see the example for that scenario in the Reaqtor repo. Here, I'm going to present a different, simpler scenario: live searches. As you type, the text is sent to a search service and the results are returned to you as an observable sequence. Most implementations have a slight delay before sending the request so that unnecessary work does not happen. Imagine I want to search for "Intro to Rx". I quickly type in "Into to" and realize I have missed the letter 'r'. I stop briefly and change the text to "Intro ". By now, two searches have been sent to the server. The first search will return results that I do not want. Furthermore, if I were to receive data for the first search merged together with results for the second search, it would be a very odd experience for the user. I really only want results corresponding to the latest search text. This scenario fits perfectly with the Switch
method.
你可以在Reaqtor代码库中找到该场景的示例。在此,我将展示另一个更简单的场景:实时搜索。当用户输入文字时,输入的文本会被实时发送至搜索服务,搜索结果以可观察序列的形式返回。大多数实现会在发送请求前设置短暂延迟,以避免不必要的请求。例如,假设我想搜索 "Intro to Rx",快速输入了"Into to"后,突然意识到漏掉了字母'r'。于是暂停片刻,将文本修改为"Intro "。此时,系统已向服务器发送了两次搜索请求。第一次搜索返回的结果显然不符合需求。更糟糕的是,如果用户界面同时展示第一次和第二次搜索的结果,将会造成混乱的体验。我们真正需要的,是始终仅展示与最新搜索文本对应的结果 —— 该场景正是 Switch
方法的完美应用场景。
In this example, there is an IObservable<string>
source that represents the search text—each new value the user types emerges from this source sequence. We also have a search function that produces a single search result for a given search term:
在这个例子中,有一个IObservable<string>源,它表示搜索文本——用户输入的每个新值都从这个源序列中产生。我们还有一个搜索函数,可为给定搜索词生成单个搜索结果:
private IObservable<string> SearchResults(string query) { ... }
This returns just a single value, but we model it as an IObservable<string> partly to deal with the fact that it might take some time to perform the search, and also to be able to use it with Rx. We can take our source of search terms, and then use Select to pass each new search value to this SearchResults function. This creates our resulting nested sequence, IObservable<IObservable<string>>.
虽然这里返回的只是一个单独的值,但我们将其建模为IObservable<string>
,部分原因是为了处理执行搜索可能需要一些时间的实际情况,同时也是为了能够与Rx配合使用。我们可以接收搜索关键词的输入源,然后通过 Select
运算符将每个新的搜索值传递给这个 SearchResults
函数。这样就创建出了一个嵌套的序列结构——IObservable<IObservable<string>>
。
Suppose we were to then use Merge
to process the results:
假设我们接下来使用 Merge
来处理这些结果:
IObservable<string> searchValues = ....;
IObservable<IObservable<string>> search = searchValues.Select(searchText => SearchResults(searchText));

var subscription = search
    .Merge()
    .Subscribe(Console.WriteLine);
If we were lucky and each search completed before the next element from searchValues was produced, the output would look sensible. However, it is much more likely that multiple searches will result in overlapped search results. This marble diagram shows what the Merge function could do in such a situation.
如果我们足够幸运,并且每次搜索都能在下一个searchValues
元素产生前完成,那么输出结果看起来会很合理。然而,更有可能的情况是,多个搜索会导致返回的搜索结果出现重叠。这张弹珠图展示了Merge
函数在此类场景中可能的效果。
Note how the values from the search results are all mixed together. The fact that some search terms took longer to get a search result than others has also meant that they have come out in the wrong order. This is not what we want. If we use the Switch
extension method we will get much better results. Switch
will subscribe to the outer sequence and as each inner sequence is yielded it will subscribe to the new inner sequence and dispose of the subscription to the previous inner sequence. This will result in the following marble diagram:
注意,来自不同搜索的结果值会被全部混杂在一起。某些搜索词获取结果的时间比其他词更长,这也导致了它们的返回顺序出现了错乱。这显然不符合我们的预期。如果我们改用Switch
扩展方法,结果将得到显著改善。Switch
会订阅外层序列,每当新的内层序列产生时,它会立即订阅这个新序列,并取消对前一个内层序列的订阅。这将产生如下弹珠图所示的效果:
Now, each time a new search term arrives, causing a new search to be kicked off, a corresponding new IObservable<string>
for that search's results appears, causing Switch
to unsubscribe from the previous results. This means that any results that arrive too late (i.e., when the result is for a search term that is no longer the one in the search box) will be dropped. As it happens, in this particular example, this means that we only see the result for the final search term. All the intermediate values that we saw as the user was typing didn't hang around for long, because the user kept on pressing the next key before we'd received the previous value's results. Only at the end, when the user stopped typing for long enough that the search results came back before they became out of date, do we finally see a value from Switch
. The net effect is that we've eliminated confusing results that are out of date.
每当有新的搜索词到达并触发新的搜索时,就会生成对应的新IObservable<string>
结果流。此时Switch
会自动取消订阅前一个结果流。这意味着任何延迟到达的结果(即对应搜索框中已被替换的旧搜索词)都将被丢弃。在这个具体案例中,最终我们只会看到最后一个搜索词的结果。用户持续输入期间产生的中间值之所以不会残留,是因为每次按键操作都赶在前一次搜索结果返回之前发生。只有当用户最终停止输入,且搜索结果能在过时之前返回时,Switch
才会输出有效结果。这种机制从根本上消除了过期结果造成的混乱。
This is another diagram where the ambiguity of marble diagrams causes a slight issue. I've shown each of the single-value observables produced by each of the calls to SearchResults
, but in practice Switch
unsubscribes from all but the last of these before they've had a chance to produce a value. So this diagram is showing the values those sources could potentially produce, and not the values that they actually delivered as part of the subscription, because the subscriptions were cut short.
这个弹珠图示例还体现了另一方面:其固有的表达局限性会带来些许理解上的困惑。图中虽然展示了每次调用SearchResults
生成的单值可观察序列(IObservable<string>
),但实际情况是:Switch
会在这些序列尚未产生值之前就取消对所有旧序列的订阅(仅保留最新一个)。因此,图中显示的是这些源序列理论上可能产生的值,而非它们在实际订阅周期内真正传递的值——因为这些订阅被提前取消了。
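The way Switch drops late results can be sketched with subjects standing in for the search pipeline. This is only an illustration under stated assumptions: it needs the System.Reactive NuGet package, and SwitchDemo, searches, firstSearch, and secondSearch are hypothetical names (the real example would use searchValues.Select(SearchResults) as the outer sequence).

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

// Hypothetical demo class: each inner subject plays the role of the
// IObservable<string> returned by one call to SearchResults.
public static class SwitchDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        var searches = new Subject<IObservable<string>>();
        var firstSearch = new Subject<string>();
        var secondSearch = new Subject<string>();

        using var subscription = searches
            .Switch()
            .Subscribe(x => log.Add(x), () => log.Add("completed"));

        searches.OnNext(firstSearch);          // first search starts
        searches.OnNext(secondSearch);         // Switch unsubscribes from the first search
        firstSearch.OnNext("results for 'Into to'");  // too late, dropped
        secondSearch.OnNext("results for 'Intro '");  // latest search, delivered

        searches.OnCompleted();     // outer is done, but the latest inner is still active
        secondSearch.OnCompleted(); // now the switched sequence completes

        return log;
    }
}
```

The log should contain only the result for the second search followed by completed. Swapping Switch() for Merge() in this sketch would let the stale first result through as well, which is the interleaving problem described above.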
Pairing sequences 配对序列
The previous methods allowed us to flatten multiple sequences sharing a common type into a result sequence of the same type (with various strategies for deciding what to include and what to discard). The operators in this section still take multiple sequences as an input, but attempt to pair values from each sequence to produce a single value for the output sequence. In some cases, they also allow you to provide sequences of different types.
之前的方法允许我们将多个同类型序列"扁平化"为同类型的结果序列(采用不同策略决定包含或丢弃哪些元素)。而本节的运算符虽然同样接收多个序列作为输入,但其核心机制是尝试将各个序列中的值进行配对,以生成输出序列中的单个值。某些运算符还支持处理不同类型的输入序列。
Zip
Zip
combines pairs of items from two sequences. So its first output is created by combining the first item from one input with the first item from the other. The second output combines the second item from each input. And so on. The name is meant to evoke a zipper on clothing or a bag, which brings the teeth on each half of the zipper together one pair at a time.
Zip
运算符将两个序列中的元素按顺序逐一进行组合。其第一个输出值由第一个输入序列的首项与第二个输入序列的首项组合而成,第二个输出则由两者的第二项组合产生,依此类推。该运算符的命名灵感来源于衣物或包袋上的拉链——将两边的齿牙逐一啮合,每次只处理一对齿牙。
Since Zip combines pairs of items in strict order, it will complete when the first of the sequences completes. If one of the sequences has reached its end, then even if the other continues to emit values, there will be nothing to pair any of these values with, so Zip just unsubscribes at this point, discards the unpairable values, and reports completion.
Zip
运算符由于需要严格按顺序配对元素,当任一输入序列率先完成时,整个Zip
流就会完成。此时,即使另一个序列仍在持续产生新值,这些值也会因失去配对对象而被丢弃,Zip
会立即取消订阅所有输入流,并向上游发出完成信号。
If either of the sequences produces an error, the sequence returned by Zip
will report that same error.
若任一输入序列产生错误,Zip
返回的序列将立即报告同样的错误。
If one of the source sequences publishes values faster than the other sequence, the rate of publishing will be dictated by the slower of the two sequences, because it can only emit an item when it has one from each source.
当其中一个源序列发布值的速度快于另一个时,Zip
运算符的整体发布速率将由较慢的序列决定。因为它必须严格等待每个源都提供对应的值后,才能生成并发射一个组合项。
Here's an example:
这里是一个例子:
// Generate values 0,1,2
var nums = Observable.Interval(TimeSpan.FromMilliseconds(250))
    .Take(3);

// Generate values a,b,c,d,e,f
var chars = Observable.Interval(TimeSpan.FromMilliseconds(150))
    .Take(6)
    .Select(i => Char.ConvertFromUtf32((int)i + 97));

// Zip values together, pairing them in an anonymous type
nums.Zip(chars, (lhs, rhs) => new { Left = lhs, Right = rhs })
    .Dump("Zip");
The effect can be seen in the marble diagram below:
下面的弹珠图展示了这一效果:
Here's the actual output of the code:
这里是代码的实际输出:
{ Left = 0, Right = a }
{ Left = 1, Right = b }
{ Left = 2, Right = c }
Note that the nums sequence only produced three values before completing, while the chars sequence produced six values. The result sequence produced three values, because this was as many pairs as it could make.
注意,nums
序列在完成前只产生了3个值,而chars
序列则产生了6个值。结果序列最终生成了3个值——这是两个序列能够配对的最大数量。
It is also worth noting that Zip
has a second overload that takes an IEnumerable<T>
as the second input sequence.
Zip
运算符还存在一个重载版本,它接受一个 IEnumerable<T>
作为第二个输入序列。
// Merges an observable sequence and an enumerable sequence into one observable sequence
// containing the result of pair-wise combining the elements by using the selector function.
public static IObservable<TResult> Zip<TFirst, TSecond, TResult>(
    this IObservable<TFirst> first,
    IEnumerable<TSecond> second,
    Func<TFirst, TSecond, TResult> resultSelector)
{...}
This allows us to zip sequences from both IEnumerable<T>
and IObservable<T>
paradigms!
这使我们能够将IEnumerable<T>
(拉取模型)与IObservable<T>
(推送模型)这两种不同范式的序列进行跨模型配对!
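As a rough sketch of that overload in use, the following pairs a pushed sequence of letters with a pulled sequence of numbers. It assumes the System.Reactive NuGet package; ZipEnumerableDemo, letters, and numbers are hypothetical names chosen for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

// Hypothetical demo class: zips an IObservable<string> with an IEnumerable<int>.
public static class ZipEnumerableDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        var letters = new Subject<string>();          // push-based source
        IEnumerable<int> numbers = new[] { 1, 2, 3 }; // pull-based source

        using var subscription = letters
            .Zip(numbers, (letter, number) => $"{letter}{number}")
            .Subscribe(x => log.Add(x), () => log.Add("completed"));

        letters.OnNext("a"); // paired with 1
        letters.OnNext("b"); // paired with 2
        letters.OnCompleted(); // the observable side ends first; 3 is never used

        return log;
    }
}
```

The log should read a1, b2, completed. Each value pushed by the observable pulls one item from the enumerable, so the enumerable is only consumed as fast as the observable produces values.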
SequenceEqual
There's another operator that processes pairs of items from two sources: SequenceEqual
. But instead of producing an output for each pair of inputs, this compares each pair, and ultimately produces a single value indicating whether every pair of inputs was equal or not.
有一个处理来自两个源的项目对的运算符: SequenceEqual
。但与为每一对输入生成输出不同,它会比较每一对输入,并最终生成一个单一的值,指示所有输入对是否都相等。
In the case where the sources produce different values, SequenceEqual produces a single false value as soon as it detects this. But if the sources are equal, it can only report this when both have completed because until that happens, it doesn't yet know if there might be a difference coming later. Here's an example illustrating its behaviour:
当两个数据源产生的值不同时, SequenceEqual
一旦检测到差异就会立即返回一个 false
值。但如果数据源内容相同,它只能在两者都完成后才能确认这一点——因为在两者完成之前,它无法确定后续是否会出现差异。以下是一个演示其行为的示例:
var subject1 = new Subject<int>();

subject1.Subscribe(
    i => Console.WriteLine($"subject1.OnNext({i})"),
    () => Console.WriteLine("subject1 completed"));

var subject2 = new Subject<int>();

subject2.Subscribe(
    i => Console.WriteLine($"subject2.OnNext({i})"),
    () => Console.WriteLine("subject2 completed"));

var areEqual = subject1.SequenceEqual(subject2);

areEqual.Subscribe(
    i => Console.WriteLine($"areEqual.OnNext({i})"),
    () => Console.WriteLine("areEqual completed"));

subject1.OnNext(1);
subject1.OnNext(2);

subject2.OnNext(1);
subject2.OnNext(2);
subject2.OnNext(3);

subject1.OnNext(3);

subject1.OnCompleted();
subject2.OnCompleted();
Output:
输出:
subject1.OnNext(1)
subject1.OnNext(2)
subject2.OnNext(1)
subject2.OnNext(2)
subject2.OnNext(3)
subject1.OnNext(3)
subject1 completed
subject2 completed
areEqual.OnNext(True)
areEqual completed
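The example above only exercises the case where the sources are equal. A complementary sketch of the mismatch case follows, assuming the System.Reactive NuGet package; SequenceEqualMismatchDemo, left, and right are hypothetical names used for this illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

// Hypothetical demo class: shows that a difference is reported immediately,
// without waiting for either source to complete.
public static class SequenceEqualMismatchDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        var left = new Subject<int>();
        var right = new Subject<int>();

        using var subscription = left
            .SequenceEqual(right)
            .Subscribe(
                equal => log.Add($"areEqual.OnNext({equal})"),
                () => log.Add("areEqual completed"));

        left.OnNext(1);
        right.OnNext(1);  // first pair matches, nothing is reported yet
        left.OnNext(2);
        right.OnNext(99); // second pair differs, so the verdict arrives here

        log.Add("sources still running"); // neither subject has completed

        return log;
    }
}
```

The log should read areEqual.OnNext(False), areEqual completed, sources still running, which confirms that the false result does not wait for the sources to finish.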
CombineLatest
The CombineLatest
operator is similar to Zip
in that it combines pairs of items from its sources. However, instead of pairing the first items, then the second, and so on, CombineLatest
produces an output any time either of its inputs produces a new value. For each new value to emerge from an input, CombineLatest
uses that along with the most recently seen value from the other input. (To be precise, it doesn't produce anything until each input has produced at least one value, so if one input takes longer to get started than the other, there will be a period in which CombineLatest
doesn't in fact produce an output each time one of its inputs does, because it's waiting for the other to produce its first value.) The signature is as follows.
CombineLatest
操作符与 Zip
类似,都能合并来自两个数据源的项对。但与 Zip
按顺序(第一项配对、第二项配对等)组合不同, CombineLatest
会在任一输入源产生新值时立即生成输出。每当一个输入源产生新值时, CombineLatest
会将该值与另一输入源最近接收到的值组合。(准确地说,它会在每个输入源至少产生一个值后才开始输出。因此,如果一个输入源的启动时间比另一个长,在等待另一输入源产生首个值时,即使其中一个输入源持续产生值, CombineLatest
也会暂时处于静默状态。)其定义如下:
// Composes two observable sequences into one observable sequence by using the selector
// function whenever one of the observable sequences produces an element.
public static IObservable<TResult> CombineLatest<TFirst, TSecond, TResult>(
    this IObservable<TFirst> first,
    IObservable<TSecond> second,
    Func<TFirst, TSecond, TResult> resultSelector)
{...}
The marble diagram below shows off usage of CombineLatest with one sequence that produces numbers (s1), and the other letters (s2). If the resultSelector function just joins the number and letter together as a pair, this would produce the result shown on the bottom line. I've colour coded each output to indicate which of the two sources caused it to emit that particular result, but as you can see, each output includes a value from each source.
下方的弹珠图展示了 CombineLatest
运算符的用法:一个数据流 (s1
)生成数字,另一个数据流 (s2
)生成字母。若 resultSelector
简单地将数字与字母组合成对,则会生成底部线条所示的结果。我通过颜色标记了每个输出项,以表明是哪个数据源触发了该次结果生成。但如你所见,每个输出项始终包含来自两个数据源的值。
If we slowly walk through the above marble diagram, we first see that s2
produces the letter 'a'. s1
has not produced any value yet so there is nothing to pair, meaning that no value is produced for the result. Next, s1
produces the number '1' so the result sequence can now produce a pair '1,a'. We then receive the number '2' from s1
. The last letter is still 'a' so the next pair is '2,a'. The letter 'b' is then produced creating the pair '2,b', followed by 'c' giving '2,c'. Finally the number 3 is produced and we get the pair '3,c'.
让我们逐步解析上述弹珠图的过程:首先, s2
生成了字母 'a',此时 s1
尚未生成任何值,因此没有可配对的内容,结果序列暂不输出。接着, s1
生成了数字 '1',此时两个数据源均有值,结果序列生成配对 '1,a'。随后, s1
又生成数字 '2',由于 s2
的最新值仍为 'a',因此生成配对 '2,a'。当 s2
生成字母 'b' 时,与 s1
的最新值 '2' 组合,得到配对 '2,b';同理, s2
生成 'c' 时生成配对 '2,c'。最后, s1
生成数字 '3',与 s2
的最新值 'c' 结合,最终输出配对 '3,c'。
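The walk-through above can be replayed in code. This is a minimal sketch assuming the System.Reactive NuGet package; the class name CombineLatestDemo is hypothetical, while s1 and s2 mirror the names used in the marble diagram description.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

// Hypothetical demo class: replays the sequence of events from the walk-through.
public static class CombineLatestDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        var s1 = new Subject<int>();  // numbers
        var s2 = new Subject<char>(); // letters

        using var subscription = s1
            .CombineLatest(s2, (number, letter) => $"{number},{letter}")
            .Subscribe(pair => log.Add(pair));

        s2.OnNext('a'); // nothing yet: s1 has not produced a value
        s1.OnNext(1);   // 1,a
        s1.OnNext(2);   // 2,a
        s2.OnNext('b'); // 2,b
        s2.OnNext('c'); // 2,c
        s1.OnNext(3);   // 3,c

        return log;
    }
}
```

The returned log should match the pairs listed in the walk-through: 1,a then 2,a then 2,b then 2,c then 3,c.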
This is great in case you need to evaluate some combination of state which needs to be kept up-to-date when any single component of that state changes. A simple example would be a monitoring system. Each service is represented by a sequence that returns a Boolean indicating the availability of said service. The monitoring status is green if all services are available; we can achieve this by having the result selector perform a logical AND. Here is an example.
当需要实时评估某个组合状态(且该状态需在任何单个组件发生变化时保持更新)时, CombineLatest
非常适用。一个典型场景是监控系统:每个服务对应一个返回布尔值的序列(表示该服务是否可用)。如果所有服务均可用,则监控状态显示为绿色——这可以通过 resultSelector
执行逻辑与(AND)操作来实现。示例如下:
IObservable<bool> webServerStatus = GetWebStatus();
IObservable<bool> databaseStatus = GetDBStatus();

// Yields true when both systems are up.
var systemStatus = webServerStatus
    .CombineLatest(
        databaseStatus,
        (webStatus, dbStatus) => webStatus && dbStatus);
You may have noticed that this method could produce a lot of duplicate values. For example, if the web server goes down the result sequence will yield 'false
'. If the database then goes down, another (unnecessary) 'false
' value will be yielded. This would be an appropriate time to use the DistinctUntilChanged
extension method. The corrected code would look like the example below.
你可能会注意到,这种方法可能产生大量重复值。例如,如果 Web 服务器宕机,结果序列将产生 'false
';如果数据库随后也宕机,又会生成另一个(冗余的) 'false
'值。此时正是使用 DistinctUntilChanged
扩展方法的合适场景。修正后的代码示例如下:
// Yields true when both systems are up, and only on change of status
var systemStatus = webServerStatus
    .CombineLatest(
        databaseStatus,
        (webStatus, dbStatus) => webStatus && dbStatus)
    .DistinctUntilChanged();
Join
The Join
operator allows you to logically join two sequences. Whereas the Zip operator would pair values from the two sequences based on their position within the sequence, the Join operator allows you to join sequences based on when elements are emitted.
Join
运算符允许您在逻辑上连接两个序列。与 Zip
运算符(根据元素在序列中的位置进行配对)不同, Join
运算符基于元素的发射时机进行序列连接。
Since the production of a value by an observable source is logically an instantaneous event, joins use a model of intersecting windows. Recall that with the Window operator, you can define the duration of each window using an observable sequence. The Join operator uses a similar concept: for each source, we can define a time window over which each element is considered to be 'current' and two elements from different sources will be joined if their time windows overlap. As with the Zip operator, we also need to provide a selector function to produce the result item from each pair of values. Here's the Join operator:
由于从逻辑上讲,可观察源生成值是一个瞬时事件,因此 Join
运算符采用了一种窗口相交模型。回顾 Window
运算符,您可以通过一个可观察序列来定义每个窗口的持续时间。 Join
运算符采用了类似的概念:对于每个数据源中的元素,我们可以定义一个时间窗口,在此期间该元素被视为"当前有效"。当两个不同源中的元素时间窗口存在重叠时,它们将被连接。与 Zip
运算符类似,我们仍需提供一个选择器函数( resultSelector
),用于根据每对值生成结果项。以下是 Join
运算符的定义:
public static IObservable<TResult> Join<TLeft, TRight, TLeftDuration, TRightDuration, TResult>
(
    this IObservable<TLeft> left,
    IObservable<TRight> right,
    Func<TLeft, IObservable<TLeftDuration>> leftDurationSelector,
    Func<TRight, IObservable<TRightDuration>> rightDurationSelector,
    Func<TLeft, TRight, TResult> resultSelector
)
This is a complex signature to try and understand in one go, so let's take it one parameter at a time.
这个方法的签名较为复杂,很难一次就理解清楚,因此我们不妨逐个参数拆解分析。
IObservable<TLeft> left
is the first source sequence. IObservable<TRight> right
is the second source sequence. Join
is looking to produce pairs of items, with each pair containing one element from left
and one element from right
.
IObservable<TLeft> left
是第一个源序列, IObservable<TRight> right
是第二个源序列。 Join
运算符旨在生成由成对项组成的结果,其中每一对都包含来自 left
的一个元素和 right
的一个元素。
The leftDurationSelector
argument enables us to define the time window for each item from left
. A source item's time window begins when the source emits the item. To determine when the window for an item from left
should close, Join
will invoke the leftDurationSelector
, passing in the value just produced by left
. This selector must return an observable source. (It doesn't matter at all what the element type of this source is, because Join
is only interested in when it does things.) The item's time window ends as soon as the source returned for that item by leftDurationSelector
either produces a value or completes.
leftDurationSelector
参数使我们能够定义来自左序列( left
)的每个项的时间窗口。一个源项的时间窗口从该源( left
)发射该项时开始。为了确定来自左序列的某个项的时间窗口何时关闭, Join
会调用 leftDurationSelector
函数,并传入 left
刚产生的项的值。此选择器必须返回一个可观察源。(该源的元素类型无关紧要,因为 Join
只关心它的行为时机。)该时间窗口将在 leftDurationSelector
为此项返回的源产生一个值或完成时立即结束。
The rightDurationSelector
argument defines the time window for each item from right
. It works in exactly the same way as the leftDurationSelector
.
rightDurationSelector
参数用于定义来自右序列( right
)的每个项的时间窗口。其工作方式与 leftDurationSelector
完全相同。
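As a minimal sketch of how a duration selector controls a window, the following uses `Subject<T>` sources so that the timing is fully under our control. The `closeLeftWindows` subject and the `DurationSelectorDemo` class are hypothetical names introduced only for this illustration: every left item gets the same duration source, so a single emission from it closes all left windows that are open at that moment.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive;
using System.Reactive.Linq;
using System.Reactive.Subjects;

public static class DurationSelectorDemo
{
    public static List<string> Run()
    {
        var left = new Subject<int>();
        var right = new Subject<string>();
        // Hypothetical helper: an emission here ends every left window that is currently open.
        var closeLeftWindows = new Subject<Unit>();
        var results = new List<string>();

        using var subscription = left.Join(
                right,
                _ => closeLeftWindows.AsObservable(), // left window ends when this emits
                _ => Observable.Empty<Unit>(),        // right window ends immediately
                (l, r) => $"{l},{r}")
            .Subscribe(results.Add);

        left.OnNext(0);
        right.OnNext("A");                      // 0's window is open, so they pair up
        closeLeftWindows.OnNext(Unit.Default);  // ends the window for 0
        right.OnNext("B");                      // no left window is open: no pair
        left.OnNext(1);
        right.OnNext("C");                      // pairs with 1 only

        return results;
    }

    public static void Main()
    {
        foreach (var result in Run()) Console.WriteLine(result);
        // Prints:
        // 0,A
        // 1,C
    }
}
```

In real code the duration source is more likely to be something time-based, such as `Observable.Timer`, but a subject makes the open and close moments explicit.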
Initially, there are no current items. But as left
and right
produce items, these items' windows will start, so Join
might have multiple items all with their windows currently open. Each time left
produces a new item, Join
looks to see if any items from right
still have their windows open. If they do, left
is now paired with each of them. (So a single item from one source might be joined with multiple items from the other source.) Join
calls the resultSelector
for each such pairing. Likewise, each time right
produces an item, then if there are any currently open windows for items from left
, that new item from right
will be paired with each of these, and again, resultSelector
will be called for each such pairing.
最初,没有处于窗口期的当前项。但随着左序列 left
和右序列 right
不断产生项,这些项的窗口期开始生效,因此Join
运算符可能同时管理多个处于开放窗口期的项。每当左序列 left
产生一个新项时,Join
会检查右序列 right
中是否存在仍处于窗口开放期的项。如果存在,左序列 left
的新项将与右序列 right
中所有处于开放窗口期的项逐一配对。(因此,来自一个源的单个项可能与另一源的多个项进行连接。)对于每对匹配项,Join
都会调用 resultSelector
函数生成结果。同理,当右序列 right
产生新项时,如果左序列 left
中存在处于窗口开放期的项,右序列 right
的新项也会与这些项逐一配对,并触发 resultSelector
的调用。
The observable returned by Join
produces the result of each call to resultSelector
.
由 Join
运算符返回的可观察对象会产生每次调用 resultSelector
函数所产生的结果。
Let us now imagine a scenario where the left sequence produces values twice as fast as the right sequence. Imagine that in addition we never close the left windows; we could do this by always returning Observable.Never<Unit>()
from the leftDurationSelector
function. And imagine that we make the right windows close as soon as they possibly can, which we can achieve by making rightDurationSelector
return Observable.Empty<Unit>()
. The following marble diagram illustrates this:
假设现在有一个场景:左序列(left
)生成值的速度是右序列(right
)的两倍。同时,我们设定左窗口永不关闭——这可以通过让 leftDurationSelector
始终返回 Observable.Never<Unit>()
来实现。而对于右窗口,我们让其立即关闭,这可以通过让 rightDurationSelector
返回 Observable.Empty<Unit>()
实现。下面的弹珠图展示了此场景的运行过程:
Each time a left duration window intersects with a right duration window, we get an output. The right duration windows are all effectively of zero length, but this doesn't stop them from intersecting with the left duration windows, because those all never end. So the first item from right has a (zero-length) window that falls inside two of the windows for the left
items, and so Join
produces two results. I've stacked these vertically on the diagram to show that they happen at virtually the same time. Of course, the rules of IObserver<T>
mean that they can't actually happen at the same time: Join
has to wait until the consumer's OnNext
has finished processing 0,A
before it can go on to produce 1,A
. But it will produce all the pairs as quickly as possible any time a single event from one source overlaps with multiple windows for the other.
每当 left
持续时间窗口与 right
持续时间窗口相交时,就会生成一个输出。虽然 right
持续时间窗口的有效长度为零,但这并不妨碍它们与 left
持续时间窗口相交(因为 left
窗口永不关闭)。因此, right
序列的第一个项(如 A
)的(零长度)窗口会与 left
序列的两个项(如 0 和 1)的窗口相交,导致 Join
生成两个结果( 0,A
和 1,A
)。图中我将这两个结果垂直堆叠,表示它们几乎同时发生。当然,根据 IObserver<T>
的规则,它们实际上无法真正同时触发: Join
必须等待消费者处理完 0,A
的 OnNext
后,才能继续生成 1,A
。但每当一个源的事件与另一个源的多个窗口相交时, Join
会尽快生成所有配对的输出。
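The scenario described above can be sketched with two `Subject<T>` sources standing in for `left` and `right`. The class name and the particular values are assumptions made for illustration; the selectors are the `Observable.Never<Unit>()` and `Observable.Empty<Unit>()` ones from the text.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive;
using System.Reactive.Linq;
using System.Reactive.Subjects;

public static class JoinWindowsDemo
{
    public static List<string> Run()
    {
        var left = new Subject<int>();
        var right = new Subject<string>();
        var results = new List<string>();

        using var subscription = left.Join(
                right,
                _ => Observable.Never<Unit>(),  // left windows never close
                _ => Observable.Empty<Unit>(),  // right windows close immediately
                (l, r) => $"{l},{r}")
            .Subscribe(results.Add);

        // left produces values twice as fast as right
        left.OnNext(0);
        left.OnNext(1);
        right.OnNext("A");  // intersects the windows for 0 and 1
        left.OnNext(2);
        left.OnNext(3);
        right.OnNext("B");  // intersects the windows for 0, 1, 2 and 3

        return results;
    }

    public static void Main()
    {
        foreach (var result in Run()) Console.WriteLine(result);
        // Prints:
        // 0,A
        // 1,A
        // 0,B
        // 1,B
        // 2,B
        // 3,B
    }
}
```

Note that `2` and `3` never pair with `A`: by the time they arrive, the zero-length window for `A` has already closed.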
If I also immediately closed the left window by returning Observable.Empty<Unit>()
, or perhaps Observable.Return(0)
, the windows would never overlap, so no pairs would ever get produced. (In theory if both left and right produce items at exactly the same time, then perhaps we might get a pair, but since the timing of events is never absolutely precise, it would be a bad idea to design a system that depended on this.)
若我们通过返回 Observable.Empty<Unit>
或 Observable.Return(0)
使左窗口也立即关闭,那么左、右窗口将永远无法重叠,因此不会生成任何配对结果。(理论上,如果 left
、 right
序列的项完全同时生成,则可能产生配对;但由于事件的时间精度无法绝对保证,依赖这种巧合设计系统是极不可取的。)
What if I wanted to ensure that items from right
only ever intersected with a single value from left
? In that case, I'd need to ensure that the left durations did not overlap. One way to do that would be to have my leftDurationSelector
always return the same sequence that I passed as the left
sequence. This will result in Join
making multiple subscriptions to the same source, and for some kinds of sources that might introduce unwanted side effects, but the Publish
and RefCount
operators provide a way to deal with that, so this is in fact a reasonable strategy. If we do that, the results look more like this.
若需要确保 right
序列的每个项仅与 left
序列的一个值相交,则需保证 left
序列的各个窗口互不重叠。一种方法是让 leftDurationSelector
始终返回 left
序列本身。尽管这会导致 Join
对同一源进行多次订阅(可能引发副作用),但结合使用 Publish
和 RefCount
运算符可以解决此问题。如果我们这样做,结果看起来更像这样。
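Here is a small sketch of that non-overlapping arrangement, where each left window is closed by the next value from `left`. It uses `Subject<T>` sources, which are already hot, so `Publish` and `RefCount` are not needed here; with a cold source you would want them. The class name and values are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive;
using System.Reactive.Linq;
using System.Reactive.Subjects;

public static class NonOverlappingLeftDemo
{
    public static List<string> Run()
    {
        var left = new Subject<int>();
        var right = new Subject<string>();
        IObservable<int> leftSource = left;
        var results = new List<string>();

        using var subscription = leftSource.Join(
                right,
                _ => leftSource,                // a left window ends when left emits again
                _ => Observable.Empty<Unit>(),  // right windows close immediately
                (l, r) => $"{l},{r}")
            .Subscribe(results.Add);

        left.OnNext(0);
        left.OnNext(1);     // ends the window for 0
        right.OnNext("A");  // only 1 is current
        left.OnNext(2);
        left.OnNext(3);
        right.OnNext("B");  // only 3 is current

        return results;
    }

    public static void Main()
    {
        foreach (var result in Run()) Console.WriteLine(result);
        // Prints:
        // 1,A
        // 3,B
    }
}
```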
The last example is very similar to CombineLatest
, except that it is only producing a pair when the right sequence changes. We can easily make it work the same way by changing the right durations to work in the same way as the left durations. This code shows how (including the use of Publish
and RefCount
to ensure that we only get a single subscription to the underlying left
and right
sources despite providing them to Join
many times over).
最后一个示例与 CombineLatest
非常相似,区别在于它仅在 right
序列变化时生成配对。若要让其行为与 CombineLatest
完全一致,只需让 right
序列的持续时间窗口采用与 left
序列相同的设置。以下代码展示了如何实现(包括使用 Publish
和 RefCount
确保即使多次向 Join
提供 left
和 right
序列,也只会对底层数据源进行一次订阅):
public static IObservable<TResult> MyCombineLatest<TLeft, TRight, TResult>
(
    IObservable<TLeft> left,
    IObservable<TRight> right,
    Func<TLeft, TRight, TResult> resultSelector
)
{
    var refcountedLeft = left.Publish().RefCount();
    var refcountedRight = right.Publish().RefCount();
    return Observable.Join(
        refcountedLeft,
        refcountedRight,
        value => refcountedLeft,
        value => refcountedRight,
        resultSelector);
}
Obviously there's no need to write this—you can just use the built-in CombineLatest
. (And that will be slightly more efficient because it has a specialized implementation.) But it shows that Join
is a powerful operator.
显然,我们无需实际编写这样的代码——直接使用内置的 CombineLatest
即可。(由于 CombineLatest
有专门的优化实现,其效率也会略高。)但这个例子展示了 Join
运算符的强大灵活性。
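One way to convince yourself of the equivalence is to feed the same inputs to both and compare what comes out. The sketch below repeats the `MyCombineLatest` method so that it is self-contained; the `CombineLatestViaJoinDemo` class and the test values are assumptions for illustration, and it only checks this one simple interleaving rather than proving equivalence in general.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

public static class CombineLatestViaJoinDemo
{
    // Same shape as the MyCombineLatest method shown above.
    public static IObservable<TResult> MyCombineLatest<TLeft, TRight, TResult>(
        IObservable<TLeft> left,
        IObservable<TRight> right,
        Func<TLeft, TRight, TResult> resultSelector)
    {
        var refcountedLeft = left.Publish().RefCount();
        var refcountedRight = right.Publish().RefCount();
        return Observable.Join(
            refcountedLeft,
            refcountedRight,
            value => refcountedLeft,
            value => refcountedRight,
            resultSelector);
    }

    public static (List<string> ViaJoin, List<string> BuiltIn) Run()
    {
        var left = new Subject<int>();
        var right = new Subject<string>();
        var viaJoin = new List<string>();
        var builtIn = new List<string>();

        using var joinSubscription =
            MyCombineLatest(left, right, (l, r) => $"{l},{r}").Subscribe(viaJoin.Add);
        using var builtInSubscription =
            left.CombineLatest(right, (l, r) => $"{l},{r}").Subscribe(builtIn.Add);

        left.OnNext(0);
        right.OnNext("A");
        left.OnNext(1);
        right.OnNext("B");
        left.OnNext(2);

        return (viaJoin, builtIn);
    }

    public static void Main()
    {
        var (viaJoin, builtIn) = Run();
        Console.WriteLine(string.Join(" | ", viaJoin));
        Console.WriteLine(string.Join(" | ", builtIn));
        // Both lines print: 0,A | 1,A | 1,B | 2,B
    }
}
```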
GroupJoin
When the Join
operator pairs up values whose windows overlap, it will pass the scalar values left and right to the resultSelector
. The GroupJoin
operator is based on the same concept of overlapping windows, but its selector works slightly differently: GroupJoin
still passes a single (scalar) value from the left source, but it passes an IObservable<TRight>
as the second argument. This argument represents all of the values from the right sequence that occur within the window for the particular left value for which it was invoked.
GroupJoin
运算符基于相同的窗口重叠概念,但其选择器( resultSelector
)的工作方式略有不同: GroupJoin
仍会传递来自 left
序列的单个标量值,但第二个参数是一个 IObservable<TRight>
类型的可观察序列。此参数表示在调用该选择器时,特定 left
值的时间窗口期内, right
序列中出现的所有值。
So this lacks the symmetry of Join
, because the left and right sources are handled differently. GroupJoin
will call the resultSelector
exactly once for each item produced by the left
source. When a left value's window overlaps with the windows of multiple right values, Join
would deal with that by calling the selector once for each such pairing, but GroupJoin
deals with this by having the observable passed as the second argument to resultSelector
emit each of the right items that overlap with that left item. (If a left item overlaps with nothing from the right, resultSelector
will still be called with that item; it'll just be passed an IObservable<TRight>
that doesn't produce any items.)
因此,这与 Join 的对称性不同,因为 left
和 right
数据源的处理方式不同。 GroupJoin
会为 left
源生成的每个项只调用一次resultSelector。当 left
值的时间窗口与多个 right
值的时间窗口重叠时:
- Join 运算符会通过为每个这样的配对组合调用一次选择器来处理;
- 而 GroupJoin 的处理方式是:通过传递给 resultSelector 第二个参数的可观察对象(IObservable<TRight>),发射所有与该 left 项重叠的 right 项。
(如果左项没有与任何右项重叠,resultSelector仍会被调用,只是传递的IObservable<TRight>不会产生任何项。)
The GroupJoin
signature is very similar to Join
, but note the difference in the resultSelector
parameter.
GroupJoin
的方法签名与 Join
非常相似,但需注意两者的 resultSelector
参数存在差异。
public static IObservable<TResult> GroupJoin<TLeft, TRight, TLeftDuration, TRightDuration, TResult>
(
    this IObservable<TLeft> left,
    IObservable<TRight> right,
    Func<TLeft, IObservable<TLeftDuration>> leftDurationSelector,
    Func<TRight, IObservable<TRightDuration>> rightDurationSelector,
    Func<TLeft, IObservable<TRight>, TResult> resultSelector
)
Let's go back to our first Join example, where we had:
如果我们回到之前第一个 Join
示例(即我们曾演示过...)
- the left producing values twice as fast as the right,
  left 数据源产生值的速度是 right 数据源的两倍,
- the left never expiring
  left 数据源的值永远不会过期,
- the right immediately expiring
  right 数据源的值立即过期
This diagram shows those same inputs again, and also shows the observables GroupJoin
would pass to the resultSelector
for each of the items produced by left
:
该图表再次展示了这些相同的输入,并显示了 GroupJoin
针对 left
源生成的每个项会传递给 resultSelector
的可观察对象。
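To make the shape of the `GroupJoin` output more concrete, here is a minimal sketch that uses the same window settings (left never expiring, right immediately expiring) with `Subject<T>` sources. The class name, the tuple shape, and the log format are assumptions chosen for illustration. The nested observables are hot, so the sketch subscribes to each one as soon as it is handed out.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive;
using System.Reactive.Linq;
using System.Reactive.Subjects;

public static class GroupJoinDemo
{
    public static List<string> Run()
    {
        var left = new Subject<int>();
        var right = new Subject<string>();
        var results = new List<string>();

        using var subscription = left.GroupJoin(
                right,
                _ => Observable.Never<Unit>(),  // left windows never close
                _ => Observable.Empty<Unit>(),  // right windows close immediately
                (l, rights) => (Left: l, Rights: rights))
            .Subscribe(group =>
            {
                // Subscribe straight away so that no right values are missed.
                // (A real application would also keep track of this inner subscription.)
                group.Rights.Subscribe(r => results.Add($"left {group.Left} saw {r}"));
            });

        left.OnNext(0);
        left.OnNext(1);
        right.OnNext("A");  // delivered to the observables for 0 and 1
        left.OnNext(2);
        left.OnNext(3);
        right.OnNext("B");  // delivered to the observables for 0, 1, 2 and 3

        return results;
    }

    public static void Main()
    {
        foreach (var result in Run()) Console.WriteLine(result);
    }
}
```

The resultSelector runs once per left item (four times here), while the six pairings that `Join` would have reported arrive through the nested observables instead.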
This produces events corresponding to all of the same events that Join
produced; they're just distributed across six different IObservable<TRight>
sources. It may have occurred to you that with GroupJoin
you could effectively re-create your own Join
method by doing something like this:
这将生成与Join所产生的事件相对应的事件,只是这些事件被分布到了六个不同的 IObservable<TRight>
源中。您可能已经想到,通过使用 GroupJoin
,实际上可以通过类似以下方式有效地重新实现自己的 Join
方法:
public IObservable<TResult> MyJoin<TLeft, TRight, TLeftDuration, TRightDuration, TResult>(
    IObservable<TLeft> left,
    IObservable<TRight> right,
    Func<TLeft, IObservable<TLeftDuration>> leftDurationSelector,
    Func<TRight, IObservable<TRightDuration>> rightDurationSelector,
    Func<TLeft, TRight, TResult> resultSelector)
{
    return Observable.GroupJoin
    (
        left,
        right,
        leftDurationSelector,
        rightDurationSelector,
        (leftValue, rightValues) =>
            rightValues.Select(rightValue => resultSelector(leftValue, rightValue))
    )
    .Merge();
}
You could even create a crude version of Window
with this code:
您甚至可以通过以下代码创建一个简化的 Window
方法版本:
public IObservable<IObservable<T>> MyWindow<T>(IObservable<T> source, TimeSpan windowPeriod)
{
    return Observable.Create<IObservable<T>>(o =>
    {
        var sharedSource = source
            .Publish()
            .RefCount();

        var intervals = Observable.Return(0L)
            .Concat(Observable.Interval(windowPeriod))
            .TakeUntil(sharedSource.TakeLast(1))
            .Publish()
            .RefCount();

        return intervals.GroupJoin(
                sharedSource,
                _ => intervals,
                _ => Observable.Empty<Unit>(),
                (left, sourceValues) => sourceValues)
            .Subscribe(o);
    });
}
Rx delivers yet another way to query data in motion by allowing you to interrogate sequences of coincidence. This enables you to solve the intrinsically complex problem of managing state and concurrency while performing matching from multiple sources. By encapsulating these low level operations, you are able to leverage Rx to design your software in an expressive and testable fashion. Using the Rx operators as building blocks, your code effectively becomes a composition of many simple operators. This allows the complexity of the domain code to be the focus, not the otherwise incidental supporting code.
Rx通过允许您查询关联事件序列,提供了另一种处理动态数据的方式。这使您能够解决管理状态与并发这一本质复杂的难题,同时实现对多源事件的匹配。通过封装这些底层操作,您可利用Rx以声明式且易于测试的方式设计软件。将Rx运算符作为构建块,您的代码实质上成为多个简单运算符的组合。这使得领域逻辑的复杂性成为关注焦点,而非那些原本繁琐的辅助代码。
And-Then-When
Zip
can take only two sequences as an input. If that is a problem, then you can use a combination of the three And
/Then
/When
methods. These methods are used slightly differently from most of the other Rx methods. Out of these three, And
is the only extension method to IObservable<T>
. Unlike most Rx operators, it does not return a sequence; instead, it returns the mysterious type Pattern<T1, T2>
. The Pattern<T1, T2>
type is public (obviously), but all of its properties are internal. The only two (useful) things you can do with a Pattern<T1, T2>
are invoking its And
or Then
methods. The And
method called on the Pattern<T1, T2>
returns a Pattern<T1, T2, T3>
. On that type, you will also find the And
and Then
methods. The generic Pattern
types are there to allow you to chain multiple And
methods together, each one extending the generic type parameter list by one. You then bring them all together with the Then
method overloads. The Then
methods return you a Plan
type. Finally, you pass this Plan
to the Observable.When
method in order to create your sequence.
Zip
方法只能接受两个序列作为输入。如果这成为一个问题,你可以使用 And
/Then
/When
这三个方法的组合。这些方法的使用方式与大多数其他 Rx 方法略有不同。在这三个方法中, And
是唯一一个对 IObservable<T>
的扩展方法。与大多数 Rx 运算符不同,它不返回序列,而是返回一个神秘的类型 Pattern<T1, T2>
。 Pattern<T1, T2>
类型是公开的(显然),但其所有属性都是内部的。对于 Pattern<T1, T2>
,你唯一能做的两件(有用)事情是调用它的 And
或 Then
方法。
在 Pattern<T1, T2>
上调用 And
方法会返回一个 Pattern<T1, T2, T3>
。在该类型上,你同样会找到 And
和 Then
方法。这些泛型 Pattern
类型的存在允许你将多个 And
方法链式调用,每个 And
调用会将泛型类型参数列表扩展一个类型参数。然后你可以通过 Then
方法的重载将它们全部组合起来。 Then
方法会返回一个 Plan
类型。最后,你需要将这个 Plan
传递给 Observable.When
方法以创建你的序列。
It may sound very complex, but comparing some code samples should make it easier to understand. It will also allow you to see which style you prefer to use.
这听起来可能很复杂,但是通过比较一些代码示例应该会让它更容易理解。同时,这也将让你看到自己更喜欢使用哪种风格。
To Zip
three sequences together, you can either use Zip
methods chained together like this:
要将三个序列合并在一起,你可以使用链式调用的 Zip
方法,如下所示:
IObservable<long> one = Observable.Interval(TimeSpan.FromSeconds(1)).Take(5);
IObservable<long> two = Observable.Interval(TimeSpan.FromMilliseconds(250)).Take(10);
IObservable<long> three = Observable.Interval(TimeSpan.FromMilliseconds(150)).Take(14);

// lhs represents 'Left Hand Side'
// rhs represents 'Right Hand Side'
IObservable<(long One, long Two, long Three)> zippedSequence = one
    .Zip(two, (lhs, rhs) => (One: lhs, Two: rhs))
    .Zip(three, (lhs, rhs) => (lhs.One, lhs.Two, Three: rhs));

zippedSequence.Subscribe(
    v => Console.WriteLine($"One: {v.One}, Two: {v.Two}, Three: {v.Three}"),
    () => Console.WriteLine("Completed"));
Or perhaps use the nicer syntax of the And
/Then
/When
:
或者,你也可以使用更优雅的 And
/Then
/When
语法:
Pattern<long, long, long> pattern = one.And(two).And(three);
Plan<(long One, long Two, long Three)> plan =
    pattern.Then((first, second, third) => (One: first, Two: second, Three: third));
IObservable<(long One, long Two, long Three)> zippedSequence = Observable.When(plan);

zippedSequence.Subscribe(
    v => Console.WriteLine($"One: {v.One}, Two: {v.Two}, Three: {v.Three}"),
    () => Console.WriteLine("Completed"));
This can be further reduced, if you prefer, to:
如果你愿意,这还可以进一步简化为:
IObservable<(long One, long Two, long Three)> zippedSequence = Observable.When(
    one.And(two).And(three)
       .Then((first, second, third) => (One: first, Two: second, Three: third))
);

zippedSequence.Subscribe(
    v => Console.WriteLine($"One: {v.One}, Two: {v.Two}, Three: {v.Three}"),
    () => Console.WriteLine("Completed"));
The And
/Then
/When
trio has more overloads that enable you to group an even greater number of sequences. They also allow you to provide more than one 'plan' (the output of the Then
method). This gives you the Merge
feature but on the collection of 'plans'. I would suggest playing around with them if this functionality is of interest to you. The verbosity of enumerating all of the combinations of these methods would be of low value. You will get far more value out of using them and discovering for yourself.
And
/Then
/When
这组方法提供了更多重载选项,使您能够对更多数量的序列进行组合。它们还允许您定义多个"plan"(即 Then
方法的输出结果)。这相当于为您提供了对多个"plans"集合进行合并( Merge
)的功能。如果对此功能感兴趣,建议您在实际使用中多加尝试。详细列举这些方法的所有组合形式意义不大——与其纸上谈兵,不如在实践中亲自探索,您将从中获得更大的收获。
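As a starting point for that kind of experimentation, here is a minimal sketch of passing more than one plan to `Observable.When`. The four subjects, the class name, and the output strings are all assumptions made for illustration. The two plans use separate sources, which keeps the behaviour easy to follow; plans that share a source are worth exploring separately.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

public static class MultiplePlansDemo
{
    public static List<string> Run()
    {
        var a = new Subject<int>();
        var b = new Subject<int>();
        var c = new Subject<int>();
        var d = new Subject<int>();
        var results = new List<string>();

        // Each plan fires when every source in its pattern has a value waiting.
        var planAB = a.And(b).Then((x, y) => $"a+b: {x},{y}");
        var planCD = c.And(d).Then((x, y) => $"c+d: {x},{y}");

        // The outputs of both plans are merged into a single sequence.
        using var subscription = Observable.When(planAB, planCD).Subscribe(results.Add);

        a.OnNext(1);
        c.OnNext(10);
        d.OnNext(20);  // c and d each have a value: planCD fires
        b.OnNext(2);   // a and b each have a value: planAB fires

        return results;
    }

    public static void Main()
    {
        foreach (var result in Run()) Console.WriteLine(result);
        // Prints:
        // c+d: 10,20
        // a+b: 1,2
    }
}
```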
Summary 总结
This chapter covered a set of methods that allow us to combine observable sequences. This brings us to a close on Part 2. We've looked at the operators that are mostly concerned with defining the computations we want to perform on the data. In Part 3 we will move on to practical concerns such as managing scheduling, side effects, and error handling.
本章我们探讨了一系列用于组合可观察序列的方法。至此,我们完成了第二部分的全部内容。我们研究了主要关注定义数据处理逻辑的运算符。在第三部分中,我们将转向实际应用问题,例如调度管理、副作用处理及错误处理机制。
