.NET 并发指南(全)
原文:Concurrency in .NET
译者:飞龙
第一部分
适用于并发程序的函数式编程的好处
函数式编程是一种关注抽象和组合的编程范式。在这前三章中,你将学习如何将计算视为表达式的评估,以避免数据的突变。为了增强并发编程,函数式范式提供了工具和技术来编写确定性程序。输出仅取决于输入,而不是程序在执行时的状态。函数式范式还通过强调纯函数方面之间的关注点分离、隔离副作用和控制不良行为,来促进编写更少错误的代码。
这一部分介绍了适用于并发程序的函数式编程的主要概念和好处。讨论的概念包括使用纯函数编程、不可变性、惰性和组合。
1
函数式并发基础
本章涵盖
-
为什么你需要并发
-
并发、并行和多线程之间的区别
-
避免编写并发应用程序时的常见陷阱
-
在线程之间共享变量
-
使用函数式范式开发并发程序
在过去,软件开发者有信心,随着时间的推移,他们的程序会运行得比以往任何时候都快。由于每一代硬件的改进,这一预测在多年中得到了证实。
在过去的 50 年里,硬件行业经历了不间断的改进。在 2005 年之前,处理器的发展持续带来更快的单核 CPU,直到最终达到戈登·摩尔预测的 CPU 速度极限。摩尔是一位计算机科学家,他在 1965 年预测,在达到技术无法再进步的最大速度之前,晶体管的密度和速度每 18 个月会翻倍。对 CPU 速度增加的原始预测假设了 10 年的速度翻倍趋势。摩尔预测,即摩尔定律,是正确的——除了进步持续了近 50 年(远远超过了他的估计)。
今天,单核 CPU 的速度几乎达到了光速,同时由于能量耗散产生了巨大的热量;这种热量是进一步改进的限制因素。
摩尔关于晶体管速度的预测已经实现(晶体管无法运行得更快)但它并没有死亡(现代晶体管的密度正在增加,在最高速度的范围内提供了并行性的机会)。多核架构和并行编程模型的结合使摩尔定律得以延续!随着 CPU 单核性能的改进停滞,开发者通过转向多核架构并开发支持并集成并发的软件来适应。
处理器革命已经开始了。多核处理器设计的新趋势将并行编程带入了主流。多核处理器架构提供了更高效计算的可能性,但所有这些力量都需要开发者进行额外的工作。如果程序员想要在代码中获得更多性能,他们必须适应新的设计模式以最大化硬件利用率,通过并行性和并发性利用多个核心。
在本章中,我们将通过检查其一些好处和编写传统并发程序所面临的挑战来探讨并发的一般信息。接下来,我们将介绍函数式范式概念,这些概念通过使用简单且易于维护的代码来克服传统限制。到本章结束时,您将理解为什么并发是一个值得的程序模型,以及为什么函数式范式是编写正确并发程序的正确工具。
1.1 您将从本书中学到的内容
在这本书中,我将探讨在传统编程范式下编写并发多线程应用程序时的考虑和挑战,以及如何通过函数式范式成功应对这些挑战并避开并发陷阱。接下来,我将介绍使用函数式编程中的抽象来创建声明式、易于实现且高度并发程序的好处。在本书的整个过程中,我们将考察复杂的并发问题,提供在 .NET 中使用函数式范式构建并发、可扩展程序的最佳实践见解。您将了解函数式编程如何通过鼓励不可变数据结构来帮助开发者支持并发:这些数据结构可以在线程之间传递而无需担心共享状态,同时避免副作用。到本书结束时,您将掌握如何在 C# 和 F# 中编写更模块化、更易读、更易维护的代码,并能更高效、更熟练地编写以更少代码行数达到峰值性能的程序。最终,凭借这些新获得的能力,您将拥有成为交付成功高性能解决方案的专家所需的知识。
您将学习以下内容:
-
如何将异步操作与任务并行库结合使用
-
如何避免常见问题并调试您的多线程和异步应用程序
-
了解采用函数式范式的并发编程模型(函数式、异步、事件驱动,以及通过代理(agent)和 actor 的消息传递)
-
如何使用函数式范式构建高性能、并发的系统
-
如何以声明式风格表达和组合异步计算
-
如何通过使用数据并行编程以纯方式无缝加速顺序程序
-
如何使用 Rx 风格的流式事件声明式地实现反应式和基于事件的程序
-
如何使用函数式并发集合构建无锁的线程安全程序
-
如何编写可扩展、性能良好且健壮的服务器端应用程序
-
如何使用并发编程模式解决问题,例如 Fork/Join、并行聚合和分而治之技术
-
如何使用并行流和并行 Map/Reduce 实现来处理大量数据集
本书假设您已具备通用编程知识,但不是函数式编程。要在您的编码中应用函数式并发,您只需要函数式编程概念的一个子集,我将在过程中解释您需要了解的内容。这样,您将在更短的学习曲线上获得函数式并发的许多好处,专注于您在日常编码体验中可以立即使用的内容。
1.2 让我们从术语开始
本节定义了与本书主题相关的术语,让我们从共同基础开始。在计算机编程中,一些术语(如并发、并行性和多线程)常在相同语境中使用,但含义并不相同。由于它们的相似性,人们常倾向于把这些术语视为同一事物,但这是不正确的。当需要推理程序的行为时,区分这些术语至关重要。例如,根据定义,多线程是并发的,但并发不一定是多线程的。您可以轻松地把多核 CPU 当作单核 CPU 那样使用,但反之则不然。
本节旨在就本书主题相关的定义和术语建立共同基础。在本节结束时,您将了解这些术语的含义:
-
顺序编程
-
并发编程
-
并行编程
-
多任务
-
多线程
1.2.1 顺序编程一次执行一个任务
顺序编程是按步骤完成事情的行为。让我们考虑一个简单的例子,比如在当地咖啡馆点一杯卡布奇诺。您首先排队,向唯一的咖啡师下单。咖啡师负责接单和提供饮料,而且一次只能制作一杯饮料,因此您必须在队列中(或耐心或不耐心地)等待,然后才能下单。制作卡布奇诺涉及磨咖啡、煮咖啡、加热牛奶、打奶泡,以及将咖啡和牛奶混合,因此在您得到卡布奇诺之前还要花费更多时间。图 1.1 展示了这一过程。

图 1.1 对于排队的每个人,咖啡师会依次重复相同的指令集(磨咖啡,煮咖啡,加热牛奶,打奶泡,将咖啡和牛奶混合制成卡布奇诺)。
图 1.1 是顺序工作的一个例子,其中必须完成一个任务才能开始下一个任务。这是一个方便的方法,有一套明确的(逐步)系统指令,说明何时做什么。在这个例子中,咖啡师在准备卡布奇诺时不太可能感到困惑或出错,因为步骤清晰且有序。逐步准备卡布奇诺的缺点是咖啡师在过程中的某些环节必须等待:在等待咖啡磨碎或牛奶打泡时,咖啡师实际上处于空闲(阻塞)状态。同样的概念也适用于顺序和并发编程模型。如图 1.2 所示,顺序编程涉及按顺序、逐步有序执行的过程,每次只以线性方式执行一条指令。

图 1.2 典型的顺序编码,涉及按顺序、逐步有序执行的过程
在命令式和面向对象编程(OOP)中,我们倾向于编写按顺序执行代码,所有注意力和资源都集中在当前运行的任务上。我们通过执行一系列有序的语句来模拟和执行程序,一个接一个。
1.2.2 并发编程同时运行多个任务
假设咖啡师更喜欢同时启动多个步骤并并发执行它们,会怎样?这将使顾客队伍移动得更快(相应地,也会增加赚到的小费)。例如,一旦咖啡磨好,咖啡师就可以开始冲泡浓缩咖啡。在冲泡过程中,咖啡师可以接受新的订单,或开始蒸汽加热和打发牛奶。在这种情况下,咖啡师给人一种同时进行多个操作(多任务处理)的印象,但这只是一种错觉。关于多任务处理的更多细节将在 1.2.4 节中介绍。实际上,因为咖啡师只有一台浓缩咖啡机,他们必须停止一个任务才能开始或继续另一个任务,这意味着咖啡师一次只能执行一个任务,如图 1.3 所示。在现代多核计算机中,这是一种资源的浪费。

图 1.3 咖啡师在准备咖啡(磨豆和冲泡)和准备牛奶(蒸汽和打泡)的操作之间切换(多任务处理)。因此,咖啡师以交错的方式执行多个任务的片段,给人一种多任务处理的错觉。但由于共享公共资源,每次只能执行一个操作。
并发描述了同时运行多个程序或程序多个部分的能力。在计算机编程中,在应用程序中使用并发提供了实际的多任务处理,将应用程序划分为多个独立的过程,这些过程可以在不同的线程中同时(并发)运行。这可以在单个 CPU 核心中发生,也可以在多个 CPU 核心可用时并行发生。通过异步或并行执行任务,可以提高程序的吞吐量(CPU 处理计算的速率)和响应性。例如,流式传输视频内容的应用程序是并发的,因为它同时从网络读取数字数据,解压缩它,并更新屏幕上的展示。
并发给人一种这些线程正在并行运行的印象,不同的程序部分可以同时运行。但在单核环境中,一个线程的执行会暂时暂停并切换到另一个线程,就像图 1.3 中的咖啡师一样。如果咖啡师希望通过同时执行多个任务来加快生产,那么必须增加可用资源。在计算机编程中,这个过程被称为并行化。
1.2.3 并行编程同时执行多个任务
从开发者的角度来看,当我们考虑“我的程序如何同时执行多个任务?”或“我的程序如何更快地解决问题?”这样的问题时,我们会想到并行化。并行化是指同时执行多个任务的概念,字面上是在不同的核心上同时执行,以提高应用程序的速度。尽管所有并行程序都是并发的,但我们已经看到并非所有并发都是并行。这是因为并行化依赖于实际的运行时环境,并且需要硬件支持(多个核心)。只有在多核设备上才能实现并行化(图 1.4),这是提高程序性能和吞吐量的手段。

图 1.4 只有多核机器允许并行化,以同时执行不同的任务。在这个图中,每个核心都在执行一个独立任务。
回到咖啡店示例,假设你是经理,希望通过加快饮品制作速度来减少顾客的等待时间。一个直观的解决方案是雇佣第二名咖啡师并设置第二个咖啡站。当两名咖啡师同时工作时,顾客的队伍可以独立且并行地处理,卡布奇诺的制作(图 1.5)也会加快。

图 1.5 由于有两名咖啡师可以在两个咖啡站上并行工作,卡布奇诺的制作速度更快。
生产过程中没有中断可以带来性能上的好处。并行性的目标是最大化使用所有可用的计算资源;在这种情况下,两个咖啡师在各自的站台上并行工作(多核处理)。
当一个任务被分割成多个独立的子任务时,可以通过使用所有可用的核心来实现并行性。在图 1.5 中,一个多核机器(两个咖啡站)允许并行执行不同的任务(两个忙碌的咖啡师)而不会中断。
时间概念对于并行执行操作是基本的。在这样的程序中,如果操作可以并行执行,则这些操作是并发的;如果执行在时间上重叠(参见图 1.6),则这些操作是并行的。

图 1.6 并行计算是一种计算类型,其中许多计算是同时进行的,其原理是大型问题通常可以被分解成更小的部分,这些部分随后同时解决。
并行性和并发性是相关的编程模型。并行程序也是并发的,但并发程序并不总是并行的,因为并行编程是并发编程的一个子集。虽然并发性指的是系统的设计,但并行性则与执行相关。并发和并行编程模型直接与它们执行时的本地硬件环境相关联。
1.2.4 多任务在时间上同时执行多个任务
多任务处理是指在一段时间内通过并发执行多个任务的概念。我们熟悉这个想法,因为我们每天都在日常生活中进行多任务处理。例如,当我们等待咖啡师准备我们的卡布奇诺时,我们使用智能手机检查电子邮件或浏览新闻故事。我们一次做两件事:等待和使用智能手机。
计算机多任务处理是在计算机只有一个 CPU 的时代设计的,目的是在共享同一计算资源的情况下并发执行多个任务。由于 CPU 在任一时刻只能真正执行一个任务,多任务是通过 CPU 的时间切片来实现的。(时间切片指的是一种复杂的调度逻辑,它协调多个线程之间的执行。)调度器允许一个线程在调度另一个线程之前运行的时长被称为线程量子。CPU 通过时间切片,使得每个线程在执行上下文切换到另一个线程之前都能执行一个操作。上下文切换由操作系统的多任务处理程序负责,以优化性能(图 1.7)。但在单核计算机中,多任务处理可能会因为引入线程间上下文切换的额外开销而降低程序的性能。

图 1.7 每个任务都有不同的阴影,表示在单核机器上的上下文切换给人一种多个任务并行运行的错觉,但实际上每次只处理一个任务。
有两种类型的多任务操作系统:
- 协作式多任务系统,其中调度器允许每个任务运行到完成,或由任务显式地把执行控制权交回给调度器
- 抢占式多任务系统(如 Microsoft Windows),其中调度器按优先级调度任务,并在时间片用完后把控制权交给其他任务,从而切换执行顺序
过去十年中设计的多数操作系统都提供了抢占式多任务。多任务对于 UI 响应性很有用,有助于在长时间操作期间避免 UI 冻结。
1.2.5 多线程用于性能提升
多线程是多任务概念的扩展,旨在通过最大限度地利用计算机资源来提高程序性能。多线程是使用多个执行线程的并发形式。多线程意味着并发,但并发不一定意味着多线程。多线程使应用程序能够显式地将特定任务细分到单个线程中,这些线程在同一个进程中并行运行。
线程 是一个计算单元(一组旨在实现特定结果的独立编程指令),操作系统调度器独立执行和管理。多线程与多任务不同:与多任务不同,多线程中的线程共享资源。但这种“资源共享”的设计比多任务更具编程挑战性。我们将在本章的 1.4.1 节中讨论线程间共享变量的问题。
并行编程和多线程编程的概念密切相关。但与并行性相比,多线程是硬件无关的,这意味着它可以在不考虑核心数量的情况下执行。并行编程是多线程的超集。你可以通过在同一个进程中共享资源来使用多线程并行化一个程序,例如,但你也可以通过在多个进程甚至在不同的计算机上执行计算来并行化一个程序。图 1.8 展示了这些术语之间的关系。

图 1.8 单核和多核设备中并发、并行、多线程和多任务之间的关系
总结如下:
-
顺序编程 指的是在单个 CPU 上依次执行的一组有序指令。
-
并发编程 同时处理多个操作,不需要硬件支持(使用一个或多个核心)。
-
并行编程在多个 CPU 上同时执行多个操作。所有并行程序都是并发的,同时运行,但并非所有并发都是并行。原因是并行性只能在多核设备上实现。
-
多任务处理同时执行来自不同进程的多个线程。多任务处理并不一定意味着并行执行,只有在使用多个 CPU 时才能实现并行执行。
-
多线程扩展了多任务处理的概念;它是一种使用来自同一进程的多个独立执行线程的并发形式。每个线程可以并发或并行运行,这取决于硬件支持。
1.3 为什么需要并发?
并发是生活中自然存在的属性:作为人类,我们习惯于多任务处理。我们可以在喝咖啡的同时阅读电子邮件,或者一边听最喜欢的歌曲一边打字。在应用程序中使用并发的最主要原因是提高性能和响应速度,以及实现低延迟。常识告诉我们,如果一个人一个接一个地完成两个任务,所需时间会比两个人同时完成这两个任务要长。
应用程序也是如此。问题是大多数应用程序并没有编写成平均分配所需任务到可用的 CPU 上。计算机被用于许多不同的领域,如分析、金融、科学和医疗保健。每年分析的数据量都在增加。两个很好的例子是谷歌和皮克斯。
2012 年,谷歌每分钟接收超过 200 万个搜索查询;到 2014 年,这个数字翻了一番多。1995 年,皮克斯制作了第一部完全由计算机生成的电影《玩具总动员》。在计算机动画中,每个图像都需要渲染无数细节和信息,例如阴影和照明,而所有这些信息都以每秒 24 帧的速度变化。在 3D 电影中,变化的信息量更是呈指数级增长。
《玩具总动员》的制作者使用了 100 台连接的双处理器机器来制作他们的电影,并行计算的使用是不可或缺的。皮克斯的工具在《玩具总动员 2》中得到了进化;该公司使用了 1400 个计算机处理器进行数字电影编辑,从而大大提高了数字质量和编辑时间。2000 年初,皮克斯的计算机能力进一步增加,达到 3500 个处理器。十六年后,用于处理全动画电影的计算机能力达到了荒谬的 24000 个核心。对并行计算的需求继续呈指数级增长。
让我们考虑一个有 N(任意数目)个核心的处理器。在单线程应用程序中,只有一个核心在工作。同一个应用程序使用多个线程执行会更快;随着对性能需求的增长,对 N 的需求也会增长,使得并行程序成为未来标准编程模型的选择。
如果在一个未考虑并发设计的多核机器上运行应用程序,您正在浪费计算机的生产力,因为应用程序在按顺序通过进程时只会使用部分可用的计算机功率。在这种情况下,如果您打开任务管理器或任何 CPU 性能计数器,您会注意到只有一个核心在高负荷运行,可能达到 100%,而所有其他核心都未充分利用或空闲。在一个具有八个核心的机器上运行非并发程序意味着资源的使用率可能低至 15%(图 1.9)。

图 1.9 Windows 任务管理器显示一个程序未能充分利用 CPU 资源。
这种计算能力的浪费明确表明,顺序代码不是多核处理器的正确编程模型。为了最大限度地利用可用的计算资源,微软的.NET 平台通过多线程提供代码的并行执行。通过使用并行性,程序可以充分利用可用的资源,如图 1.10 中的 CPU 性能计数器所示,您会注意到所有处理器核心都在高负荷运行,可能达到 100%。当前的硬件趋势预测将会有更多的核心而不是更快的时钟速度;因此,开发者别无选择,只能拥抱这一演变,成为并行程序员。

图 1.10 考虑到并发编写的程序可以最大化 CPU 资源,可能高达 100%。
1.3.1 并发编程的现在与未来
掌握并发编程以交付可扩展程序已成为一项必备技能。公司对招聘和投资那些对编写并发代码有深厚知识的工程师感兴趣。实际上,编写正确的并行计算可以节省时间和金钱。使用较少的服务器即可构建可扩展程序,并利用可用的计算资源,这比不断购买和添加未充分利用的昂贵硬件以达到相同性能水平要便宜得多。此外,更多的硬件需要更多的维护和电力来运行。
这是一个学习编写多线程代码的激动人心的时刻,使用函数式编程(FP)方法提高程序性能是令人满意的。函数式编程是一种将计算视为表达式的评估,并避免改变状态和可变数据的编程风格。由于不可变性是默认的,加上出色的组合和声明式编程风格,FP 使得编写并发程序变得轻而易举。更多细节将在第 1.5 节中介绍。
虽然在新的范式下思考可能会让人有些不安,但学习并行编程的初始挑战很快就会减少,坚持不懈的回报是无限的。你会发现打开 Windows 任务管理器并自豪地注意到在代码更改后 CPU 使用率激增是一种神奇而壮观的事情。一旦你熟悉并习惯于使用函数式范式编写高度可扩展的系统,就很难回到缓慢的顺序代码风格。
并发是计算机行业即将到来的下一个创新,它将改变开发者编写软件的方式。随着行业对软件需求的演变和对通过非阻塞 UI 提供卓越用户体验的高性能软件的需求,对并发的需求将持续增加。与硬件发展方向一致,显然并发和并行性是编程的未来。
1.4 并发编程的陷阱
并发和并行编程无疑对快速响应和给定计算的快速执行有益。但这种性能和反应性体验的收益是有代价的。使用顺序程序,代码的执行遵循可预测性和确定性的快乐路径。相反,多线程编程需要承诺和努力才能实现正确性。此外,由于我们习惯于按顺序思考,因此对同时运行的多个执行进行推理是困难的。
开发并行程序的过程不仅仅是创建和生成多个线程。编写并行执行的程序要求严格,需要深思熟虑的设计。你应该在设计时考虑以下问题:
-
如何使用并发和并行性达到令人难以置信的计算性能和高度响应的应用?
-
这样的程序如何充分利用多核计算机提供的强大功能?
-
如何在确保线程安全的同时协调线程之间的通信和对同一内存位置的访问?(当两个或多个线程同时尝试访问和修改数据或状态时,如果数据或状态没有损坏,则称该方法为线程安全。)
-
程序如何确保确定性执行?
-
如何在不危及最终结果质量的情况下并行化程序的执行?
这些问题并不容易回答。但某些模式和技巧可以帮助。例如,在存在副作用的情况下,^(1) 计算的可确定性会丧失,因为并发任务执行的顺序变得可变。明显的解决方案是避免副作用,转而使用纯函数。你将在本书的学习过程中了解这些技术和实践。
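为了更直观地说明这一点,下面给出一个最小的 C# 草图(非原书代码,函数名为本示例虚构),对比纯函数与带副作用的函数:输出仅取决于输入的函数,其求值顺序可以任意安排,这正是并发场景需要的性质。
using System;

class PurityDemo
{
    static int total = 0;   // 共享可变状态

    // 带副作用:结果依赖并修改外部状态,执行顺序不同,结果不同
    static int AddWithSideEffect(int x) => total += x;

    // 纯函数:输出只取决于输入,任何顺序(甚至并行)求值结果都一样
    static int AddPure(int a, int b) => a + b;

    static void Main()
    {
        Console.WriteLine(AddPure(1, 2));        // 永远是 3
        Console.WriteLine(AddWithSideEffect(1)); // 1:取决于 total 的历史
        Console.WriteLine(AddWithSideEffect(1)); // 2:同样的参数,不同的结果
    }
}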
1.4.1 并发风险
编写并发程序并不容易,在程序设计过程中必须考虑许多复杂元素。创建新线程或在线程池中排队多个作业相对简单,但如何确保程序的正确性呢?当许多线程持续访问共享数据时,你必须考虑如何保护数据结构以确保其完整性。一个线程应该原子性地写入和修改内存位置,^(2) 而不受其他线程的干扰。现实是,使用命令式编程语言或具有可变值的变量(可变变量)编写的程序,无论使用何种级别的内存同步或并发库,都始终容易受到数据竞争的影响。
考虑两个线程(线程 1 和线程 2)并行运行的情况,它们都在尝试访问和修改共享值x,如图 1.11 所示。对于线程 1 来说,修改一个变量需要多个 CPU 指令:值必须从内存中读取,然后修改,最后写回内存。如果线程 2 在线程 1 写回更新值时尝试从同一内存位置读取,则x的值会改变。更准确地说,线程 1 和线程 2 可能同时读取x的值,然后线程 1 修改x的值并将其写回内存,而线程 2 也修改x的值。结果是数据损坏。这种现象称为竞争条件。

图 1.11 两个线程(线程 1 和线程 2)并行运行,都试图访问和修改共享值x。如果线程 2 在线程 1 写回更新值时尝试从同一内存位置读取,则x的值会改变。这种结果会导致数据损坏或竞争条件。
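下面的 C# 草图(示意代码,非原书实现)重现了这种读-改-写交错:两个任务各对共享计数器递增 100 000 次,由于 counter++ 并非原子操作,最终结果通常小于 200 000。
using System;
using System.Threading.Tasks;

class RaceDemo
{
    static int counter = 0;   // 共享可变状态

    static void Main()
    {
        // counter++ 实际包含读取、修改、写回三步,线程之间会互相覆盖
        Task t1 = Task.Run(() => { for (int i = 0; i < 100_000; i++) counter++; });
        Task t2 = Task.Run(() => { for (int i = 0; i < 100_000; i++) counter++; });
        Task.WaitAll(t1, t2);
        Console.WriteLine(counter);   // 大概率小于 200000:竞态条件
    }
}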
程序中可变状态和并行性的组合等同于问题。从命令式范式角度来看的解决方案是,通过锁定对多个线程的访问来保护可变状态。这种技术称为互斥,因为一个线程对给定内存位置的访问阻止了其他线程在该时刻的访问。时间概念是核心的,因为多个线程必须同时访问相同的数据才能从这种技术中受益。引入锁以同步多个线程对共享资源的访问,解决了数据损坏的问题,但引入了可能导致死锁的更多复杂性。
考虑图 1.12 中的情况,其中线程 1 和线程 2 正在等待对方完成工作,从而无限期地相互阻塞。线程 1 获取锁 A,紧接着线程 2 获取锁 B;随后线程 1 试图获取锁 B、线程 2 试图获取锁 A。此时,两个线程都在等待一个永远不会释放的锁。这是一个死锁的例子。

图 1.12 在这个场景中,线程 1 获取锁 A,线程 2 获取锁 B。然后,线程 2 试图获取锁 A,而线程 1 试图获取已被线程 2 获取的锁 B,线程 2 正在等待获取锁 A 以释放锁 B。此时,两个线程都在等待永远不会释放的锁。这是一个死锁的例子。
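下面的 C# 草图(示意代码,非原书实现)重现了图 1.12 的场景:两个任务以相反的顺序获取两把锁,几乎必然相互等待而死锁。
using System.Threading;
using System.Threading.Tasks;

class DeadlockDemo
{
    static readonly object lockA = new object();
    static readonly object lockB = new object();

    static void Main()
    {
        var t1 = Task.Run(() =>
        {
            lock (lockA)              // 线程 1 先获取锁 A
            {
                Thread.Sleep(100);    // 留出时间让线程 2 获取锁 B
                lock (lockB) { }      // 再等待锁 B,而它被线程 2 持有
            }
        });
        var t2 = Task.Run(() =>
        {
            lock (lockB)              // 线程 2 先获取锁 B
            {
                Thread.Sleep(100);
                lock (lockA) { }      // 再等待锁 A,而它被线程 1 持有
            }
        });
        Task.WaitAll(t1, t2);         // 永远不会返回:死锁
    }
}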
这里有一份关于并发风险的简要解释列表。稍后,你将更详细地了解每个风险,并特别关注如何避免它们:
-
竞态条件是当共享可变资源(例如文件、图像、变量或集合)被多个线程同时访问和修改时发生的状态。这会导致不一致的状态,随之而来的数据损坏使得程序不可靠且无法使用。
-
当多个线程共享需要同步技术的状态竞争时,性能下降是一个常见问题。互斥锁(或 mutex),正如其名所示,通过迫使多个线程停止工作以进行通信和同步内存访问来防止代码并行运行。锁的获取和释放伴随着性能惩罚,减慢了所有进程。随着核心数量的增加,锁竞争的成本可能会增加。随着更多任务被引入以共享相同的数据,与锁相关的开销可能会对计算产生负面影响。第 1.4.3 节展示了由于引入锁同步而产生的后果和开销成本。
-
死锁是源于使用锁的并发问题。它发生在存在任务循环的情况下,其中每个任务在等待另一个任务继续时被阻塞。由于所有任务都在等待另一个任务做某事,它们被无限期地阻塞。线程之间共享的资源越多,需要的锁就越多以避免竞态条件,死锁的风险就越高。
-
缺乏组合是源于代码中引入锁的设计问题。锁不能组合。组合鼓励通过将复杂问题分解成更小的、更容易解决的问题来解决问题,然后将它们重新粘合在一起。组合是函数式编程(FP)的一个基本原则。
1.4.2 状态演化的共享
实际程序需要任务之间的交互,例如交换信息以协调工作。没有共享所有任务都可以访问的数据,这是无法实现的。处理这种共享状态是大多数与并行编程相关问题的根源,除非共享数据是不可变的或每个任务都有自己的副本。解决方案是保护所有代码免受这些并发问题的影响。没有编译器或工具可以帮助你将原始同步锁定位在代码的正确位置。这完全取决于你作为程序员的技能。
由于这些潜在问题,编程社区已经大声疾呼,作为回应,已经编写并引入了库和框架到主流面向对象语言(如 C#和 Java)中,以提供并发保障,这些保障并非原始语言设计的一部分。这种支持是一种设计修正,体现在命令式和面向对象、通用编程环境中的共享内存的存在。同时,函数式语言不需要保障,因为函数式编程的概念很好地映射到并发编程模型中。
1.4.3 一个简单的现实世界示例:并行快速排序
排序算法通常用于技术计算,可能成为瓶颈。让我们考虑一个快速排序算法,^(3) 这是一个适用于并行化的 CPU 密集型计算,它对数组元素进行排序。这个例子旨在演示将顺序算法转换为并行版本的风险,并指出在做出任何决定之前,在代码中引入并行性需要额外的思考。否则,性能可能会产生与预期相反的结果。
快速排序是一种分而治之算法;它首先将一个大数组分为两个较小的子数组,一个是低元素子数组,另一个是高元素子数组。然后快速排序可以递归地对子数组进行排序,并且易于并行化。它可以在数组上原地操作,执行排序时只需要少量额外的内存。该算法由三个简单的步骤组成,如图 1.13 所示:
-
选择一个基准元素。
-
根据与基准的相对顺序将序列划分为子序列。
-
对子序列进行快速排序。

图 1.13 递归函数进行分而治之。每个块被分成相等的两半,其中基准元素必须是序列的中位数,直到每个代码部分都可以独立执行。当所有单个块都完成后,它们将结果发送回前一个调用者进行聚合。快速排序基于选择一个基准点并将序列划分为小于基准的子序列和大于基准的子序列元素,然后递归地对两个较小的序列进行排序的想法。
递归算法,尤其是基于分而治之形式的算法,是并行化和 CPU 密集型计算的优秀候选者。
微软任务并行库(TPL),在.NET 4.0 发布后推出,使得实现和利用此类算法的并行性变得更加容易。使用 TPL,你可以将算法的每个步骤分割,并递归地并行执行每个任务。这是一个直接且简单的实现,但你必须小心线程创建的深度,以避免添加不必要的任务。
要实现快速排序算法,你将使用 FP 语言 F#。然而,由于其固有的递归性质,这个实现背后的思想也可以应用于 C#,它需要一个具有可变状态的命令式 for 循环方法。C# 不支持 F# 那样的优化尾递归函数,因此在调用栈指针超过栈限制时,可能会引发堆栈溢出异常。在第三章中,我们将详细介绍如何克服这个 C# 限制。
列表 1.1 展示了一个采用分而治之策略的 F# 快速排序函数。对于每次递归迭代,你选择一个枢轴点并使用它来划分整个数组。你使用 List.partition API 在枢轴点周围划分元素,然后递归地对枢轴两侧的列表进行排序。F# 对数据结构操作有很好的内置支持。在这种情况下,你正在使用 List.partition API,它返回一个包含两个列表的元组:一个满足谓词的列表,另一个不满足谓词的列表。
列表 1.1 简单的快速排序算法
let rec quicksortSequential aList =
match aList with
| [] -> []
| firstElement :: restOfList ->
let smaller, larger =
List.partition (fun number -> number < firstElement) restOfList
quicksortSequential smaller @ (firstElement ::
➥ quicksortSequential larger)
在我的系统上(八个逻辑核心;2.2 GHz 时钟速度)运行这个快速排序算法,针对一个包含一百万个随机、未排序整数的数组,平均需要 6.5 秒。但是,当你分析这个算法设计时,并行化的机会是显而易见的。在 quicksortSequential 的末尾,你递归地调用 quicksortSequential,每个分区由 (fun number -> number < firstElement) restOfList 标识。通过使用 TPL 来生成新任务,你可以并行重写这部分代码。
列表 1.2 使用 TPL 的并行快速排序算法
let rec quicksortParallel aList =
match aList with
| [] -> []
| firstElement :: restOfList ->
let smaller, larger =
List.partition (fun number -> number < firstElement) restOfList
let left = Task.Run(fun () -> quicksortParallel smaller) ①
let right = Task.Run(fun () -> quicksortParallel larger) ①
left.Result @ (firstElement :: right.Result) ②
列表 1.2 中的算法正在并行运行,现在通过在所有可用核心之间分配工作,正在使用更多的 CPU 资源。但即使资源利用率有所提高,整体性能结果并没有达到预期。
执行时间反而增加了,而不是减少。并行化的快速排序算法从每次运行平均 6.5 秒增加到大约 12 秒。整体处理时间变慢了。在这种情况下,问题在于算法过度并行化。每次内部数组被划分时,都会生成两个新任务来并行化这个算法。这种设计相对于可用的核心产生了过多的任务,这导致了并行化开销。这在涉及并行化递归函数的分而治之算法中尤其如此。重要的是不要添加比必要的更多任务。令人失望的结果展示了并行化的重要特性:在如何增加额外的线程或处理来帮助特定的算法实现方面存在固有的限制。
为了实现更好的优化,你可以通过在某个点停止递归并行化来重构之前的 quicksortParallel 函数。这样,算法的前几次递归仍然会并行执行,直到最深的递归,然后会回退到串行方法。这种设计保证了充分利用核心。此外,并行化带来的开销也大大减少。
列表 1.3 展示了这种新的设计方法。它考虑了递归函数运行的层级;如果层级低于预定义的阈值,它将停止并行化。函数 quicksortParallelWithDepth 有一个额外的参数 depth,其目的是减少和控制递归函数并行化的次数。depth 参数在每次递归调用时递减,并创建新任务,直到该参数值耗尽。在这种情况下,你为 max depth 传递了由 Math.Log(float System.Environment.ProcessorCount, 2.) + 4. 计算出的值。这确保了递归的每一层都将产生两个子任务,直到所有可用的核心都被征用。
列表 1.3 使用 TPL 的更好的并行 Quicksort 算法
let rec quicksortParallelWithDepth depth aList = ①
match aList with
| [] -> []
| firstElement :: restOfList ->
let smaller, larger =
List.partition (fun number -> number < firstElement) restOfList
if depth < 0 then ②
let left = quicksortParallelWithDepth depth smaller ③
let right = quicksortParallelWithDepth depth larger ③
left @ (firstElement :: right)
else
let left = Task.Run(fun () ->
➥ quicksortParallelWithDepth (depth - 1) smaller) ④
let right = Task.Run(fun () ->
➥ quicksortParallelWithDepth (depth - 1) larger) ④
left.Result @ (firstElement :: right.Result)
选择任务数量时的一个相关因素是预期的任务运行时间有多相近。在 quicksortParallelWithDepth 的情况下,任务的持续时间可能差异很大,因为枢轴点依赖于未排序的数据,它们不一定会产生大小相等的分段。为了补偿任务大小的不均匀,本例中的公式计算出的 depth 参数会产生比核心数更多的任务。该公式将任务数量限制在大约核心数量的 16 倍,因为任务的数量不会超过 2^depth。我们的目标是平衡快速排序的工作负载,同时不启动超出所需的任务:在每次递归中启动 Task,直到达到深度上限,恰好让处理器饱和。
在大多数情况下,快速排序生成不平衡的工作负载,因为产生的分段大小不均。概念公式 depth = log2(ProcessorCount) + 4 计算出 depth 参数,以便在任何情况下限制并适配运行任务的数量。^(4) 如果你把 depth = log2(ProcessorCount) + 4 代入 2^depth 并化简,你会发现任务的数量约是 ProcessorCount 的 16 倍。通过测量递归深度来限制子任务的数量是一种极其重要的技术。^(5)
例如,在四核机器的情况下,深度计算如下:
depth = log2(ProcessorCount) + 4
depth = log2(4) + 4
depth = 2 + 4 = 6
由于每层递归中每个分支都会再启动两个任务,任务数量逐层翻倍,上限为 2^6 = 64,因此结果是大约 32 到 64 个并发任务的范围。这样,分区的工作就能在各个核心之间得到公平、合适的分配。
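把该公式落到代码里,大致相当于下面的 C# 示意(仅演示计算,非原书实现):
using System;

class DepthDemo
{
    static void Main()
    {
        // depth = log2(核心数) + 4;任务总数上限约为 2^depth = 16 × 核心数
        int depth = (int)Math.Log(Environment.ProcessorCount, 2) + 4;
        Console.WriteLine($"cores={Environment.ProcessorCount}, " +
                          $"depth={depth}, maxTasks={1 << depth}");
    }
}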
1.4.4 F# 中的基准测试
您使用了 F# REPL(读取-评估-打印循环)执行了 Quicksort 示例,这是一个方便的工具,可以运行代码的特定部分,因为它跳过了程序的编译步骤。REPL 在原型设计和数据分析开发中非常适合,因为它简化了编程过程。另一个好处是内置的 #time 功能,它可以切换性能信息的显示。当启用时,F# Interactive 测量每个解释和执行的代码段的实时、CPU 时间和垃圾回收信息。
该基准测试对一个 3 GB 的数组进行排序,并启用 64 位环境标志以避免数组大小限制。它在具有八个逻辑核心(四个物理核心加超线程)的计算机上运行。表 1.1 显示了 10 次运行的平均执行时间(以秒为单位)。
表 1.1 Quicksort 排序基准测试
| 串行 | 并行 | 并行 4 线程 | 并行 8 线程 |
|---|---|---|---|
| 6.52 | 12.41 | 4.76 | 3.50 |
需要指出的是,对于小于 100 项的小数组,由于创建和/或生成新线程的开销,并行排序算法比串行版本慢。即使你正确编写了并行程序,并发构造函数引入的开销可能会压倒程序运行时间,从而降低性能,这与预期相反。因此,重要的是将原始串行代码作为基线进行基准测试,然后继续测量每次更改,以验证并行化是否有益。完整的策略应考虑这个因素,并且只有当数组大小大于一个阈值(递归深度)时才采用并行化,这个阈值通常与核心数相匹配,之后它默认回到串行行为。
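“规模小于阈值或深度耗尽时回退到串行”这一策略可以写成如下 C# 草图。为避免重复排序示例,这里用并行求和示意该模式;Threshold 的取值只是演示用的假设:
using System;
using System.Threading.Tasks;

static class ThresholdDemo
{
    const int Threshold = 100;   // 示意值:低于该规模时并行开销得不偿失

    // “先检查规模与深度,再决定并行还是串行”的最小示意
    public static long Sum(int[] data, int start, int length, int depth)
    {
        if (length < Threshold || depth <= 0)
        {
            long total = 0;                        // 回退到串行求和
            for (int i = start; i < start + length; i++) total += data[i];
            return total;
        }
        int half = length / 2;
        long left = 0, right = 0;
        Parallel.Invoke(                           // 规模足够大才继续并行二分
            () => left = Sum(data, start, half, depth - 1),
            () => right = Sum(data, start + half, length - half, depth - 1));
        return left + right;
    }
}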
1.5 为什么选择函数式编程进行并发?
问题在于,本质上所有有趣的并发应用都涉及到对共享状态的故意和受控的修改,例如屏幕空间、文件系统或程序的内部数据结构。因此,正确的解决方案是提供允许安全修改共享状态部分的机制。
—西蒙·佩顿·琼斯(Simon Peyton Jones)、安德鲁·戈登(Andrew Gordon)和西格比约恩·芬内(Sigbjorn Finne)(《并发 Haskell》,第 23 届 ACM 程序设计语言原理研讨会论文集,圣彼得堡海滩,佛罗里达州,1996 年 1 月)
FP 是关于最小化和控制副作用,通常被称为 纯函数式编程。FP 使用转换的概念,其中函数创建一个值 x 的副本,然后修改副本,而原始值 x 保持不变,可以由程序的其它部分自由使用。它鼓励在设计程序时考虑是否需要可变性和副作用。FP 允许可变性和副作用,但以战略性和明确的方式,通过利用封装它们的方法将这一区域与代码的其余部分隔离开来。
采用函数式范式的最主要原因是解决多核时代存在的问题。高度并发的应用程序,如网络服务器和数据分析数据库,遭受了几个架构问题。这些系统必须可扩展以响应大量并发请求,这导致了处理最大资源竞争和高调度频率的设计挑战。此外,竞争条件和死锁很常见,这使得代码的故障排除和调试变得困难。
在本章中,我们讨论了在命令式或面向对象编程中开发并发应用程序的一些常见问题。在这些编程范式中,我们处理对象作为基本构造。相反,在并发方面,当从单线程程序过渡到大规模并行化工作(这是一个具有挑战性和完全不同的场景)时,处理对象有一些需要注意的注意事项。
这些问题的传统解决方案是对资源访问进行同步,以避免线程之间的竞争。但这个解决方案是一把双刃剑,因为使用同步原语,如用于互斥的 lock,可能导致死锁或竞争条件。实际上,变量的状态(正如其名称 variable 所暗示的)可能会发生变化。在面向对象编程(OOP)中,变量通常代表一个随着时间的推移可能会发生变化的对象。正因为如此,你永远不能依赖其状态,因此,你必须检查其当前值以避免不希望的行为 (图 1.14)。

图 1.14 在函数式范式下,由于默认构造为不可变性,并发编程保证了确定性执行,即使在共享状态的情况下也是如此。相反,命令式和面向对象编程使用可变状态,这在多线程环境中难以管理,这导致了非确定性程序。
需要考虑的是,采用 FP 概念的系统组件不能再相互干扰,并且可以在不使用任何锁定策略的情况下在多线程环境中使用。
使用共享的可变变量和副作用函数开发安全的并行程序需要程序员付出大量努力,他们必须做出关键决策,这通常会导致以锁定形式出现的同步。通过通过函数式编程消除这些基本问题,你还可以消除这些特定于并发的并发问题。这就是为什么 FP 是一个出色的并发编程模型。它非常适合并发程序员,在高度多线程环境中使用简单代码实现正确的高性能。在 FP 的核心,变量和状态都是不可变的,不能共享,函数可能没有副作用。
FP 是编写并发程序最实用的方法。试图用命令式语言编写它们不仅困难,而且还会导致难以发现、复制和修复的错误。
你将如何利用你所能获得的每一个计算机核心?答案是简单的:拥抱函数式范式!
1.5.1 函数式编程的好处
即使你目前没有计划采用这种风格,学习函数式编程(FP)也有实际的好处。然而,如果不展示即时的好处,很难说服某人花时间学习新事物。好处以惯用语言特性形式出现,最初可能看起来令人不知所措。然而,FP 是一种范式,在经过一段学习曲线后,将赋予你强大的编码能力和程序中的积极影响。在几周内使用 FP 技术后,你会提高应用程序的可读性和正确性。
FP(侧重于并发)的好处包括以下内容:
-
不可变性 — 一种在创建后防止修改对象状态的性质。在 FP 中,变量赋值不是一个概念。一旦一个值与一个标识符相关联,它就不能改变。函数式代码按定义是不可变的。不可变对象可以在线程之间安全地传输,从而带来巨大的优化机会。不可变性消除了由于缺乏互斥而导致的内存损坏(竞争条件)和死锁问题。
-
纯函数 — 这没有副作用,这意味着函数不会改变函数体外的任何输入或数据。如果函数对用户来说是透明的,并且它们的返回值仅取决于输入参数,那么它们就被说成是纯函数。通过将相同的参数传递给纯函数,结果不会改变,每个过程都会返回相同的值,产生一致和预期的行为。
-
引用透明性 — 函数的输出仅依赖于其输入并映射到其输入的概念。换句话说,每次函数接收到相同的参数时,结果都是相同的。这个概念在并发编程中很有价值,因为表达式的定义可以被其值替换,并且具有相同的意义。引用透明性保证了可以以任何顺序和并行地评估一组函数,而不会改变应用程序的行为。
-
惰性求值 — 在函数式编程中用于按需检索函数的结果或延迟对大数据流的分析,直到需要时。
-
可组合性 — 用于组合函数并从简单的函数中创建更高级的抽象。可组合性是战胜复杂性的最有力的工具,让你能够定义和构建复杂问题的解决方案。
学习函数式编程可以使你编写更模块化、面向表达式和概念上简单的代码。这些函数式编程资产的组合将让你理解你的代码正在做什么,无论代码正在执行多少线程。
在本书的后面部分,你将学习应用并行性和绕过与可变状态和副作用相关问题的技术。函数式范式对这些概念的方法旨在通过声明式编程风格简化并最大化编码效率。
1.6 拥抱函数式范式
有时,改变是困难的。通常,在领域知识中感到舒适的开发者缺乏从不同角度看待编程问题的动力。学习任何新的程序范式都是困难的,并且需要时间来过渡到以不同风格进行开发。改变你的编程视角需要改变你的思维和方式,而不仅仅是学习新编程语言的新代码语法。
从 Java 这样的语言到 C#并不困难;在概念上,它们是相同的。从命令式范式到函数式范式的转变是一个更加困难的挑战。核心概念被替换。你不再有状态。你不再有变量。你不再有副作用。
但你为改变范式所付出的努力将带来巨大的回报。大多数开发者都会同意,学习一门新语言可以使你成为一名更好的开发者,并将这比作一位医生建议患者每天锻炼 30 分钟以保持健康。患者知道锻炼的真正益处,但同时也意识到日常锻炼意味着承诺和牺牲。
同样,学习一种新范式并不难,但确实需要奉献、参与和时间。我鼓励所有希望成为更好的程序员的人考虑学习函数式编程(FP)范式。学习 FP 就像乘坐过山车:在这个过程中,你会有感到兴奋和飘浮的时刻,然后会有你认为你理解了一个原则,但随后会急剧下降——尖叫——但这次旅行是值得的。将学习 FP 视为一段旅程,是对你个人和职业生涯的投资,并保证有回报。记住,学习的一部分是犯错误,并发展技能以避免将来再犯同样的错误。
在整个过程中,你应该识别难以理解的概念,并尝试克服这些困难。考虑如何在实践中使用这些抽象概念,从解决简单问题开始。我的经验表明,通过使用真实示例来找出一个概念的目的,你可以突破心理障碍。这本书将引导你了解将函数式编程(FP)应用于并发和分布式系统的益处。这是一条狭窄的道路,但在另一边,你将获得几个在日常生活中编程时可以使用的优秀基础概念。我坚信,你将获得解决复杂问题的新的见解,并利用函数式编程的巨大力量成为一名更优秀的软件工程师。
1.7 为什么选择 F#和 C#进行函数式并发编程?
这本书的焦点是开发和设计高度可扩展和性能优异的系统,采用函数式范式来编写正确的并发代码。这并不意味着你必须学习一门新语言;你可以通过使用你已经熟悉的工具来应用函数式范式,例如多用途语言 C#和 F#。多年来,这些语言已经添加了几个函数式特性,使你更容易转向采用这种新范式。
解决问题的本质不同方法是选择这些语言的原因。这两种编程语言都可以用非常不同的方式解决相同的问题,这为选择最适合工作的工具提供了论据。拥有一个全面的工具集,你可以设计出更好、更简单的解决方案。实际上,作为软件工程师,你应该将编程语言视为工具。
理想情况下,解决方案应该是由协同工作的 C#和 F#项目组合而成。这两种语言覆盖了不同的编程模型,但选择使用哪种工具来完成工作的选项,在生产力效率方面提供了巨大的好处。选择这些语言的另一个方面是它们对不同的并发编程模型的支持,可以混合使用。例如:
-
F# 提供了一个比 C#更简单的异步计算模型,称为 异步工作流。
-
C# 和 F# 都是强类型、多用途的编程语言,支持多种范式,包括函数式、命令式和 OOP 技术。
-
这两种语言都是 .NET 生态系统的一部分,并衍生出丰富的库,这些库可以被两种语言同等使用。
-
F# 是一种以函数式编程语言为先的语言,提供了巨大的生产力提升。实际上,用 F# 编写的程序往往更简洁,并且导致需要维护的代码更少。
-
F# 结合了函数式声明式编程风格的优点和命令式面向对象风格的支撑。这使得你可以使用现有的面向对象和命令式编程技能来开发应用程序。
-
由于默认的不可变构造,F# 拥有一系列内置的无锁数据结构,例如判别联合和记录类型。这些类型具有结构相等性,并且不允许 null,这有助于“信任”数据的完整性并简化比较。
-
与 C# 不同,F# 强烈反对使用 null 值(也被称为“十亿美元的错误”),转而鼓励使用不可变数据结构。没有空引用有助于减少编程中的错误数量。
-
F# 由于默认使用不可变作为类型构造函数,并且由于其 .NET 基础,它能够以最先进的实现能力与 C# 语言集成,因此自然可并行化。
-
C# 的设计倾向于命令式语言,首先是完全支持面向对象编程(OOP)。(我喜欢将其定义为命令式 OOP。)在过去的几年中,自从 .NET 3.5 发布以来,函数式范式通过添加 lambda 表达式和 LINQ(列表理解)等功能,已经影响了 C# 语言。
-
C# 也拥有强大的并发工具,这些工具可以帮助你轻松编写并行程序,并迅速解决复杂的现实问题。实际上,C# 语言对多核开发的出色支持十分灵活,能够快速开发和原型化高度并行的对称多处理(SMP)应用程序。这些编程语言是编写并发软件的出色工具;当它们共存时,可用解决方案的能力和选择也随之叠加。SMP 指的是由多个共享同一操作系统和内存的处理器来处理程序。
-
F# 和 C# 可以互操作。实际上,一个 F# 函数可以调用 C# 库中的方法,反之亦然。
在接下来的章节中,我们将讨论替代的并发方法,如数据并行、异步和消息传递编程模型。我们将使用这些编程语言各自提供的最佳工具构建库,并将它们与其他语言进行比较。我们还将检查像 TPL 和反应式扩展(Rx)这样的工具和库,这些工具和库通过采用函数式范式进行设计、灵感和实现,以获得可组合的抽象。
显然,行业正在寻找一个可靠且简单的并发编程模型,这从软件公司投资于移除传统和复杂内存同步模型抽象级别的库中可以看出。这些高级库的例子包括英特尔的多线程构建块(TBB)和微软的任务并行库(TPL)。
还有有趣的开放源代码项目,例如 OpenMP(它提供了可以插入到程序中以使其部分并行的编译器特定定义的预处理器功能或实现定义信息,称为 pragma)和 OpenCL(一种与图形处理单元 [GPU] 通信的低级语言)。GPU 编程具有吸引力,并且得到了微软通过 C++ AMP 扩展和 Accelerator .NET 的认可。
摘要
-
对于并发和并行编程的挑战和复杂性,不存在银弹。作为一名专业工程师,你需要不同类型的弹药,并且你需要知道如何以及何时使用它们来击中目标。
-
程序设计时必须考虑到并发性;程序员不能继续编写顺序代码,而忽视并行编程的好处。
-
摩尔定律并没有错误。相反,它已经转向了每个处理器核心数量增加的方向,而不是单个 CPU 的速度增加。
-
在编写并发代码时,你必须牢记并发、多线程、多任务和并行之间的区别。
-
在并发环境中,可变状态与副作用的组合是首要关注的问题,因为它们会导致不希望出现的程序行为和错误。
-
为了避免编写并发应用程序的陷阱,你应该使用提高抽象级别的编程模型和工具。
-
函数范式为你提供了处理代码中并发性的正确工具和原则。
-
函数式编程在并行计算中表现出色,因为不可变性是默认的,这使得推理共享数据更容易。
2
并发编程的函数式技术
本章涵盖
-
通过组合简单解决方案来解决复杂问题
-
使用闭包简化函数式编程
-
使用函数式技术提高程序性能
-
使用惰性评估
在函数式编程中编写代码可以让你感觉像是一名驾驶着快车的驾驶员,无需了解底层机械原理就能高速行驶。在第一章中,你了解到采用函数式编程方法来编写并发应用程序,比例如面向对象方法更好地解决了编写这些应用程序的挑战。任何函数式语言中的关键概念,如不可变变量和纯度,意味着虽然编写并发应用程序仍然远非易事,但开发者可以确信他们不会面临许多传统的并行编程陷阱。函数式编程的设计意味着诸如竞态条件和死锁等问题不会发生。
在本章中,我们将更详细地探讨主要的函数式编程原则,这些原则有助于我们编写高质量的并发应用程序。你将了解这些原则是什么,它们如何在 C#(尽可能)和 F#中工作,以及它们如何适应并行编程的模式。
在本章中,我假设你已经熟悉了函数式编程的基本原则。如果你不熟悉,请参阅附录 A 以获取你需要继续阅读的详细信息。到本章结束时,你将知道如何使用函数式技术将简单的函数组合起来解决复杂问题,并在多线程环境中安全地缓存和预计算数据以加快程序执行速度。
2.1 使用函数组合解决复杂问题
函数 组合是将函数以某种方式组合在一起,其中一个函数的输出成为下一个函数的输入,从而创建一个新的函数。这个过程可以无限进行,将函数链接在一起以创建强大的新函数来解决复杂问题。通过组合,你可以实现模块化,简化程序的结构。
函数式范式导致程序设计简单。函数组合背后的主要动机是提供一个简单的机制,用于构建易于理解、易于维护、可重用且简洁的代码。此外,无副作用的函数组合保持了代码的纯度,从而保留了并行逻辑。基本上,基于函数组合的并发程序比非函数组合的程序更容易设计且结构更简单。
函数组合使得将一系列简单的函数构建和粘合在一起成为一个单一的大而复杂的函数成为可能。为什么粘合代码很重要呢?想象一下以自上而下的方式解决问题。你从大问题开始,然后将其分解成更小的问题,直到最终足够小,可以直接解决问题。结果是,你得到了一系列小解决方案,然后你可以将它们粘合在一起来解决原始的更大问题。组合是将大解决方案拼接在一起的内聚力。
将函数组合视为管道化的概念,即一个函数的结果为后续函数提供第一个参数。这里有一些区别:
-
管道化执行一系列操作,其中每个函数的输入是前一个函数的输出。
-
函数组合返回一个新的函数,它是两个或更多函数的组合,并且不会立即调用(输入 -> 函数 -> 输出)。
2.1.1 C# 中的函数组合
C# 语言本身不支持函数组合,这造成了语义上的挑战。但可以通过直接的方式引入这种功能。考虑一个简单的 C# 例子(如列表 2.1 所示),使用 lambda 表达式定义两个函数。
列表 2.1 C# 中的 HOFs grindCoffee 和 brewCoffee 到 Espresso
Func<CoffeeBeans, CoffeeGround> grindCoffee = coffeeBeans
=> new CoffeeGround(coffeeBeans); ①
Func<CoffeeGround, Espresso> brewCoffee = coffeeGround
=> new Espresso(coffeeGround); ②
第一个函数 grindCoffee 接受一个 coffeeBeans 对象作为参数,并返回一个新的 CoffeeGround 实例。第二个函数 brewCoffee 接受一个 coffeeGround 对象作为参数,并返回一个新的 Espresso 实例。这些函数的目的是通过组合它们的评估结果来制作 Espresso。你如何组合这些函数?在 C# 中,你可以选择连续执行这些函数,将第一个函数的结果作为链传递给第二个函数。
列表 2.2 C# 中的组合函数(不良)
CoffeeGround coffeeGround = grindCoffee(coffeeBeans);
Espresso espresso = brewCoffee(coffeeGround);
Espresso espresso = brewCoffee(grindCoffee(coffeeBeans)); ①
首先,执行函数 grindCoffee,传递参数 coffeeBeans,然后将结果 coffeeGround 传递给函数 brewCoffee。第二个等效的选项是将 grindCoffee 和 brewCoffee 的执行连接起来,这实现了函数组合的基本思想。但从可读性的角度来看,这是一个不好的模式,因为它迫使你从右到左阅读代码,这不是阅读英语的自然方式。最好是从左到右逻辑地阅读代码。
一个更好的解决方案是创建一个通用的、专门的扩展方法,可以用来组合任何两个具有一个或多个泛型输入参数的函数。以下列表定义了一个 Compose 函数,并重构了之前的例子。(泛型参数用粗体表示。)
列表 2.3 Compose 函数在 C# 中
static Func<A, C> Compose<A, B, C>(this Func<A, B> f, Func<B, C> g)
=> (n) => **g(f(n))**; ①
Func<CoffeeBeans, Espresso> makeEspresso = ➥ grindCoffee.Compose(brewCoffee); ②
Espresso espresso = makeEspresso(coffeeBeans);
如图 2.1 所示,高阶函数 Compose 将函数 grindCoffee 和 brewCoffee 连接起来,创建一个新的函数 makeEspresso,它接受一个参数 coffeeBeans 并执行 brewCoffee (grindCoffee(coffeeBeans))。

图 2.1 从函数 Func<CoffeeBeans, CoffeeGround> grindCoffee 到函数 Func<CoffeeGround, Espresso> brewCoffee 的函数组合。因为 grindCoffee 函数的输出与 brewCoffee 函数的输入相匹配,所以这些函数可以在一个新的函数中组合,该函数将输入 CoffeeBeans 映射到输出 Espresso。
在函数体中,你可以轻松地看到与 lambda 表达式 makeEspresso 完全相同的行。这种扩展方法封装了函数组合的概念。其思路是创建一个函数,把内部函数 grindCoffee 的结果作为外部函数 brewCoffee 的输入。这在数学中是一个常见的模式,记作 brewCoffee ∘ grindCoffee,意为先求值 grindCoffee,再把结果应用于 brewCoffee。使用扩展方法来提高抽象级别,可以很容易地创建可重用、模块化的高阶函数(HOF)。^(1)
在 F# 等语言中,内置的组合语义有助于以声明式方式组织代码。遗憾的是,C# 中没有同样精巧的内置方案。在本书的源代码中,你可以找到一个包含多个 Compose 扩展方法重载的库,它们提供了类似的有用且可重用的解决方案。
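作为补充,下面的草图演示了如何链式调用这样的 Compose 扩展方法,把三个函数粘合成一个(示例中的函数均为虚构):
using System;

static class ComposeDemo
{
    static Func<A, C> Compose<A, B, C>(this Func<A, B> f, Func<B, C> g)
        => n => g(f(n));

    static void Main()
    {
        Func<int, int> add4 = x => x + 4;
        Func<int, int> times3 = x => x * 3;
        Func<int, string> show = x => $"result = {x}";

        // add4、times3、show 依次组合成一个新函数,从左到右阅读
        Func<int, string> pipeline = add4.Compose(times3).Compose(show);
        Console.WriteLine(pipeline(1));   // result = 15
    }
}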
2.1.2 F# 中的函数组合
F# 内置了对函数组合的支持。实际上,compose 函数的定义是用 >> 中缀操作符内置到语言中的。在 F# 中使用此操作符,你可以组合现有函数来构建新的函数。
让我们考虑一个简单的场景,你想要将列表中的每个元素增加 4 然后乘以 3。以下列表显示了如何使用和未使用函数组合来构建此函数,以便你可以比较两种方法。
列表 2.4 F# 对函数组合的支持
let add4 x = x + 4 ①
let multiplyBy3 x = x * 3 ②
let list = [0..10] ③
let newList = List.map(fun x -> multiplyBy3(add4(x))) list ④
let newList = list |> List.map(add4 >> multiplyBy3) ⑤
示例代码使用 map 函数将 add4 和 multiplyBy3 函数应用于列表中的每个元素,map 函数是 F# 中 List 模块的一部分。List.map 等同于 LINQ 中的 Select 静态方法。这两个函数的组合是通过一种强制代码从内向外读取的顺序语义方法来实现的:multiplyBy3(add4(x))。使用 >> 中缀操作符的函数组合风格允许代码从左到右读取,就像教科书一样,结果更加精致、简洁,且易于理解。
实现具有简单和模块化代码语义的功能组合的另一种方法是使用一种称为闭包的技术。
2.2 使用闭包简化函数式思维
闭包 的目的是简化函数式思维,并允许运行时管理状态,为开发者释放额外的复杂性。闭包是一个一等函数,具有绑定在词法环境中的自由变量。在这些术语背后隐藏着一个简单的概念:闭包是提供函数访问局部状态和将数据传递到后台操作的一种更方便的方式。它们是特殊的函数,具有对所引用的所有非局部变量(也称为 自由变量 或 上值)的隐式绑定。此外,闭包允许函数在调用其直接词法作用域之外时访问一个或多个非局部变量,并且这个特殊函数的主体可以将这些 自由变量 作为单个实体传输,这些变量在其封装作用域中定义。更重要的是,闭包封装行为,就像任何其他对象一样传递它,授予访问闭包创建、读取和更新这些值的上下文。
在函数式编程(FP)或任何其他支持高阶函数的编程语言中,如果没有闭包的支持,数据的作用域可能会造成问题和不便。幸运的是,在 C# 和 F# 中,编译器使用闭包来增加和扩展变量的作用域。因此,数据在当前上下文中是可访问和可见的,如图 2.2 所示。

图 2.2 在这个使用闭包的例子中,外部函数 Increment 的局部变量 X 以由内部函数生成的函数(Func<int>)的形式暴露出来。重要的是函数 Increment 的返回类型,它是一个捕获封装变量 X 的函数,而不是变量本身。每次函数引用 incr 运行时,捕获的变量 X 的值都会增加。
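图 2.2 描述的 Increment 例子大致对应下面的 C# 代码(按图示重建的草图):
using System;

class ClosureDemo
{
    // 返回一个捕获了局部变量 X 的函数,而不是返回 X 本身
    static Func<int> Increment()
    {
        int X = 0;
        return () => ++X;   // 闭包:X 的生命周期随返回的函数延长
    }

    static void Main()
    {
        Func<int> incr = Increment();
        Console.WriteLine(incr());   // 1
        Console.WriteLine(incr());   // 2:每次调用都更新被捕获的 X
    }
}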
在 C# 中,自 .NET 2.0 以来就提供了闭包功能;但是,自从引入了 lambda 表达式和 .NET 中的匿名方法之后,闭包的使用和定义变得更加容易,这形成了一种和谐的混合。
本节使用 C# 作为代码示例,尽管相同的概念和技术也适用于 F#。此列表使用匿名方法定义了一个闭包。
列表 2.5 使用匿名方法在 C# 中定义的闭包
string freeVariable = "I am a free variable"; ①
Func<string, string> lambda = value => freeVariable + " " + value; ②
在这个例子中,匿名函数 lambda 引用了其封装作用域中的自由变量 freeVariable。闭包使函数能够访问其周围的状态(在这种情况下,freeVariable),从而提供更清晰、更易读的代码。在没有闭包的情况下复制相同的功能可能意味着创建一个你希望函数使用的类(并且该类了解局部变量),并将该类作为参数传递。在这里,闭包帮助运行时管理状态,避免了创建用于管理状态的额外且不必要的样板代码。这是闭包的一个好处:它可以作为一个可移植的执行机制,用于将额外上下文传递到高阶函数(HOFs)中。不出所料,闭包通常与 LINQ 结合使用。你应该将闭包视为 lambda 表达式的积极副作用,以及你工具箱中的一项伟大编程技巧。
2.2.1 使用 lambda 表达式在闭包中捕获变量
当相同的变量即使在它本应超出作用域的情况下也可以使用时,闭包的力量就显现出来了。因为变量已经被捕获,所以它不会被垃圾回收。使用闭包的优势在于你可以有一个方法级别的变量,这通常用于实现内存缓存技术以改善计算性能。本章后面将讨论这些功能技术记忆化和函数式预计算。
列表 2.6 使用基于事件的编程模型(EPM)异步下载一张图像,展示了捕获变量如何与闭包协同工作。当下载完成时,进程继续更新客户端应用程序的用户界面。该实现使用异步语义的 API 调用:当请求完成时,注册的 DownloadDataCompleted 事件触发并执行剩余的逻辑。
列表 2.6 使用 lambda 表达式捕获局部变量的事件寄存器
void UpdateImage(string url)
{
System.Windows.Controls.Image image = img; ①
var client = new WebClient();
client.DownloadDataCompleted += (o, e) => ②
{
if (image != null)
using (var ms = new MemoryStream(e.Result))
{
var imageConverter = new ImageSourceConverter();
image.Source = (ImageSource)
➥ imageConverter.ConvertFrom(ms);
}
};
client.DownloadDataAsync(new Uri(url)); ③
}
首先,你获取名为 img 的图像控制器的引用。然后,你使用 lambda 表达式注册处理程序回调,以便在 DownloadDataAsync 完成时处理 DownloadDataCompleted 事件。在 lambda 块内部,由于闭包,代码可以直接访问作用域之外的状态。这种访问允许你检查图像指针的状态,如果它不是 null,则更新用户界面。
这是一个相当直接的过程,但时间线流程增加了有趣的行为。该方法异步执行,因此当数据从服务返回并回调更新 image 时,方法已经完成。
如果方法已经完成,局部变量 image 是否应该超出作用域?那么图像如何更新?答案被称为捕获变量:lambda 表达式捕获了局部变量 image,因此即使它通常会被释放,它仍然保持在作用域内。从这个例子中,你应该把捕获变量视为闭包创建时变量值的快照。如果要在没有捕获变量的情况下构建相同的过程,你就需要一个类级变量来保存图像值。
为了证明这一点,让我们分析如果在 列表 2.6 的末尾添加一行代码,将图像引用更改为 null 指针(加粗)会发生什么。
列表 2.7 证明捕获变量的时间
void UpdateImage(string url)
{
System.Windows.Controls.Image image = img;
var client = new WebClient();
client.DownloadDataCompleted += (o, e) =>
{
if (image != null) {
using (var ms = new MemoryStream(e.Result))
{
var imageConverter = new ImageSourceConverter();
image.Source = (ImageSource)
➥ imageConverter.ConvertFrom(ms);
}
}
};
client.DownloadDataAsync(new Uri(url));
**image = null;** ①
}
通过运行经过修改的程序,UI 中的图像不会更新,因为在执行 lambda 表达式主体之前,指针被设置为 null。尽管在捕获时图像有一个值,但在代码执行时它是 null。捕获变量的生命周期延长,直到所有引用变量的闭包都适合进行垃圾回收。
在 F# 中,不存在 null 对象的概念,因此不可能运行这样的不良场景。
2.2.2 多线程环境中的闭包
让我们分析一个使用闭包向通常在主线程之外运行的任务提供数据的用例场景。在 FP 中,闭包通常用于管理可变状态,以限制和隔离可变结构的范围,允许线程安全访问。这非常适合多线程环境。
在 列表 2.8 中,一个 lambda 表达式从 TPL 的一个新 Task(System.Threading.Tasks.Task)中调用 Console.WriteLine 方法。当这个任务开始时,lambda 表达式构建一个闭包,该闭包封装了作为另一个线程中运行的方法的参数传递的局部变量 iteration。在这种情况下,编译器会自动生成一个匿名类,该变量作为公开属性。
列表 2.8 在多线程环境中捕获闭包变量
for (int iteration = 0; iteration < 10; iteration++)
{
Task.Factory.StartNew(() => Console.WriteLine("{0} - {1}",
➥ Thread.CurrentThread.ManagedThreadId, iteration));
}
闭包可能导致奇怪的行为。从理论上讲,这个程序应该可以工作:你期望程序打印出从 0 到 9 的数字。但在实践中并非如此:程序会把数字 10 打印十次,因为多个 lambda 表达式使用了同一个变量,这些匿名函数共享该变量的值。
让我们分析另一个例子。在这个列表中,你使用 lambda 表达式将数据传递到两个不同的线程中。
列表 2.9 在多线程代码中使用闭包的奇怪行为
Action<int> displayNumber = n => Console.WriteLine(n);
int i = 5;
Task taskOne = Task.Factory.StartNew(() => displayNumber(i));
i = 7;
Task taskTwo = Task.Factory.StartNew(() => displayNumber(i));
Task.WaitAll(taskOne, taskTwo);
即使第一个 lambda 表达式在变量值改变之前捕获了变量 i,两个线程也会打印数字 7,因为变量 i 在两个线程开始之前已经被改变。这个微妙问题的原因是 C# 的可变性质。当一个闭包通过 lambda 表达式捕获一个可变变量时,lambda 表达式捕获的是变量的引用而不是该变量的当前值。因此,如果任务在变量的引用值改变之后运行,那么值将是内存中的最新值,而不是变量被捕获时的值。
这就是为什么应该选择其他方案而不是手动编写并行循环的原因之一:TPL 中的 Parallel.For 就避免了这个错误。在 C# 中,一个可能的解决方案是为每个 Task 创建并捕获一个新的临时变量。这样,新变量的声明会被分配到新的堆位置,从而保留捕获时的值。
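下面是对列表 2.8 问题的示意性修复(草图,非原书代码):为每次迭代引入一个临时变量,让每个 lambda 捕获各自独立的副本。
using System;
using System.Threading.Tasks;

class CapturedVariableFix
{
    static void Main()
    {
        var tasks = new Task[10];
        for (int iteration = 0; iteration < 10; iteration++)
        {
            int current = iteration;   // 每次迭代一个新变量,闭包捕获它而非共享的 iteration
            tasks[current] = Task.Run(() => Console.WriteLine(current));
        }
        Task.WaitAll(tasks);           // 输出 0 到 9(顺序不定)
    }
}
修复之后,每个任务打印的都是捕获那一刻的值。这种微妙的捕获行为在函数式语言中并不会出现。让我们看看使用 F# 的类似场景。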
列表 2.10 F# 中多线程环境中捕获变量的闭包
let tasks = Array.zeroCreate<Task> 10
for index = 1 to 10 do
tasks.[index - 1] <- Task.Factory.StartNew(fun () ->
➥ Console.WriteLine index)
运行这个版本的代码,结果正如预期:程序打印了数字 1 到 10。解释是 F# 处理它的过程式 for 循环的方式与 C# 不同。F# 编译器为每次迭代创建一个新的不可变值,并在内存中具有不同的位置,而不是使用可变变量并在每次迭代中更新其值。这种偏好不可变类型的函数式行为的结果是 lambda 捕获了对一个永远不会改变的不可变值的引用。
多线程环境通常使用闭包,因为捕获和在不同上下文中传递变量很简单,这需要额外的思考。以下列表说明了 .NET TPL 库如何使用闭包通过 Parallel.Invoke API 执行多个线程。
列表 2.11 多线程环境中捕获变量的闭包
public void ProcessImage(Bitmap image) {
byte[] array = image.ToByteArray(ImageFormat.Bmp); ①
Parallel.Invoke(
() => ProcessArray(array, 0, array.Length / 2),
() => ProcessArray(array, array.Length / 2, array.Length)); ②
}
在示例中,Parallel.Invoke 生成了两个独立任务,每个任务都会运行 ProcessArray 方法,针对 array 的一个部分,该部分的变量被 lambda 表达式捕获并封装。
在任务并行化的上下文中,请注意闭包中捕获的变量:因为闭包捕获的是变量的引用而不是其实际值,你可能会无意中共享一些不明显的内容。闭包是一种强大的技术,你可以用它来实现模式,以提高你程序的性能。
2.3 用于程序加速的记忆化缓存技术
记忆化(memoization)缓存技术,也称为表格化(tabling)技术,是一种旨在提高应用程序性能的 FP 技术。通过缓存函数的结果,可以避免重复相同计算带来的额外且不必要的计算开销,从而提升程序速度。之所以可行,是因为缓存技术会存储先前以相同参数进行的计算的结果(如图 2.3 所示),绕过昂贵的函数调用,在参数再次出现时直接检索。被记忆化的函数把计算结果保留在内存中,以便在未来的调用中立即返回。

图 2.3 缓存技术是一种缓存函数值的技巧,确保只进行一次评估。当输入值传递给缓存函数时,内部表存储会验证是否存在与该输入关联的结果,以便立即返回。否则,函数初始化器将运行计算,然后更新内部表存储并返回结果。下次相同的输入值传递给缓存函数时,表存储中包含关联的结果,计算将被跳过。
这个概念一开始可能听起来很复杂,但一旦应用起来就是一个简单的技术。缓存技术使用闭包来促进函数转换为便于访问局部变量的数据结构。闭包被用作缓存函数每次调用的包装器。这个局部变量,通常是一个查找表,目的是将内部函数的结果作为值存储,并使用传递给此函数的参数作为键引用。
缓存技术非常适合多线程环境,可以提供巨大的性能提升。主要好处在于当一个函数被反复应用于相同的参数时;但是,从 CPU 计算的角度来看,运行函数的成本比访问相应的数据结构要高。例如,为了给图像应用颜色过滤器,并行运行多个线程是一个好主意。每个线程访问图像的一部分并修改上下文中的像素。但是,可能存在将过滤器颜色应用于具有相同值的像素集的情况。在这种情况下,如果计算将得到相同的结果,为什么还要重新评估呢?相反,可以使用缓存技术将结果缓存起来,这样线程就可以跳过不必要的任务,更快地完成图像处理。
以下列表展示了 C#中缓存函数的基本实现。
列表 2.12 说明缓存技术工作原理的简单示例
static Func<T, R> Memoize<T, R>(Func<T, R> func) ①
where T : IComparable
{
Dictionary<T, R> cache = new Dictionary<T, R>(); ②
return arg => { ③
if (cache.ContainsKey(arg)) ④
return cache[arg]; ⑤
return (cache[arg] = func(arg)); ⑥
};
}
首先,你定义 Memoize 函数,该函数内部使用泛型集合 Dictionary 作为缓存表变量的表。闭包捕获局部变量,以便可以从指向闭包的委托和外部函数中访问它。当 HOF 被调用时,它首先尝试将输入与函数匹配以验证参数是否已经被缓存。如果参数键存在,缓存表返回结果。如果参数键不存在,第一步是使用参数评估函数,将参数和相关的结果添加到缓存表中,并最终返回结果。重要的是要提到,memoization 是一个 HOF,因为它接受一个函数作为输入并返回一个函数作为输出。
这是在 F# 中实现的等效 memoize 函数。
列表 2.13 F# 中的 memoize 函数
let memoize func =
let table = Dictionary<_,_>()
fun x -> if table.ContainsKey(x) then table.[x]
else
let result = func x
table.[x] <- result
result
这是一个使用之前定义的 memoize 函数的简单示例。在 列表 2.14 中,Greeting 函数返回一个字符串,其中包含传递给参数的欢迎消息。消息还包括函数被调用时的时间,这用于在函数运行时跟踪时间。代码为了演示目的,在每次调用之间应用了 2 秒的延迟。
列表 2.14 C# 中的问候示例
public static string Greeting(string name)
{
return $"Warm greetings {name}, the time is
➥ {DateTime.Now.ToString("hh:mm:ss")}";
}
Console.WriteLine(Greeting ("Richard"));
System.Threading.Thread.Sleep(2000);
Console.WriteLine(Greeting ("Paul"));
System.Threading.Thread.Sleep(2000);
Console.WriteLine(Greeting ("Richard"));
// output
Warm greetings Richard, the time is 10:55:34
Warm greetings Paul, the time is 10:55:36
Warm greetings Richard, the time is 10:55:38
接下来,代码重新执行相同的消息,但使用 Greeting 函数的 memoized 版本。
列表 2.15 使用 memoized 函数的问候示例
var greetingMemoize = Memoize<string, string>(Greeting); ①
Console.WriteLine(greetingMemoize ("Richard"));
System.Threading.Thread.Sleep(2000);
Console.WriteLine(greetingMemoize ("Paul"));
System.Threading.Thread.Sleep(2000);
Console.WriteLine(greetingMemoize("Richard"));
// output
Warm greetings Richard, the time is 10:57:21 ②
Warm greetings Paul, the time is 10:57:23
Warm greetings Richard, the time is 10:57:21 ②
输出表明前两次调用发生在不同的时间,正如预期的那样。但在第三次调用中发生了什么?为什么第三次函数调用返回与第一次完全相同时间的消息?答案是 memoization。
第一次和第三次函数调用 greetingMemoize("Richard") 有相同的参数,并且它们的结果在 greetingMemoize 函数的初始调用中只被缓存了一次。第三次函数调用的结果不是其执行的效果,而是具有相同参数的函数存储的结果,因此时间匹配。
这就是 memoization 的工作原理。memoized 函数的职责是在内部表中查找传入的参数。如果找到输入值,它将返回之前计算的结果。否则,函数将结果存储在表中。
2.4 memoize 实战:快速网络爬虫
现在,你将使用上一节中学到的知识实现一个更有趣的示例。对于这个示例,你将构建一个网络爬虫,它从每个访问的网站中提取页面标题并打印到控制台。列表 2.16 运行的是没有记忆化的代码;然后你将使用记忆化技术重新执行相同的程序,并比较结果。最终,你将结合并行执行和记忆化来下载多个网站的内容。
列表 2.16 C# 中的网络爬虫
public static IEnumerable<string> WebCrawler(string url) { ①
string content = GetWebContent(url);
yield return content;
foreach (string item in AnalyzeHtmlContent(content))
yield return GetWebContent(item);
}
static string GetWebContent(string url) { ②
using (var wc = new WebClient())
return wc.DownloadString(new Uri(url));
}
static readonly Regex regexLink =
new Regex(@"(?<=href=('|""))https?://.*?(?=\1)");
static IEnumerable<string> AnalyzeHtmlContent(string text) { ③
foreach (var url in regexLink.Matches(text))
yield return url.ToString();
}
static readonly Regex regexTitle =
new Regex("<title>(?<title>.*?)<\\/title>", RegexOptions.Compiled);
static string ExtractWebPageTitle(string textPage) { ④
if (regexTitle.IsMatch(textPage))
return regexTitle.Match(textPage).Groups["title"].Value;
return "No Page Title Found!";
}
WebCrawler 函数通过调用 GetWebContent 方法下载作为参数传递的网页 URL 的内容。接下来,它分析下载的内容并提取网页中包含的超链接,这些超链接被发送回初始函数进行处理,对每个超链接重复这些操作。下面是网络爬虫的实际运行情况。
列表 2.17 执行网络爬虫
List<string> urls = new List<string> { ①
@"http://www.google.com",
@"http://www.microsoft.com",
@"http://www.bing.com",
@"http://www.google.com"
};
var webPageTitles = from url in urls ②
from pageContent in WebCrawler(url)
select ExtractWebPageTitle(pageContent);
foreach (var webPageTitle in webPageTitles)
Console.WriteLine(webPageTitle);
// OUTPUT
Starting Web Crawler for http://www.google.com...
Google
Google Images
...
Web Crawler completed for http://www.google.com in 5759ms
Starting Web Crawler for http://www.microsoft.com...
Microsoft Corporation
Microsoft - Official Home Page
Web Crawler completed for http://www.microsoft.com in 412ms
Starting Web Crawler for http://www.bing.com...
Bing
Msn
...
Web Crawler completed for http://www.bing.com in 6203ms
Starting Web Crawler for http://www.google.com...
Google
Google Images
...
Web Crawler completed for http://www.google.com in 5814ms
你正在使用 LINQ(语言集成查询)对一组给定的 URL 运行网络爬虫。当查询表达式在 foreach 循环中实现时,ExtractWebPageTitle 函数从每个页面的内容中提取页面标题并将其打印到控制台。由于操作的跨网络性质,GetWebContent 函数需要时间来完成下载。前一个代码实现的一个问题是存在重复的超链接。通常,网页会有重复的超链接,在这个例子中导致冗余和不必要的下载。更好的解决方案是缓存 WebCrawler 函数。
列表 2.18 使用缓存执行网络爬虫
static Func<string, IEnumerable<string>> WebCrawlerMemoized =
➥Memoize<string, IEnumerable<string>>(WebCrawler); ①
var webPageTitles = from url in urls ②
from pageContent in WebCrawlerMemoized(url)
select ExtractWebPageTitle(pageContent);
foreach (var webPageTitle in webPageTitles)
Console.WriteLine(webPageTitle);
// OUTPUT
Starting Web Crawler for http://www.google.com...
Google
Google Images
...
Web Crawler completed for http://www.google.com in 5801ms
Starting Web Crawler for http://www.microsoft.com...
Microsoft Corporation
Microsoft - Official Home Page
Web Crawler completed for http://www.microsoft.com in 4398ms
Starting Web Crawler for http://www.bing.com...
Bing
Msn
...
Web Crawler completed for http://www.bing.com in 6171ms
Starting Web Crawler for http://www.google.com...
Google
Google Images
...
Web Crawler completed for http://www.google.com in 02ms
在这个例子中,你实现了 WebCrawlerMemoized 高阶函数,它是 WebCrawler 函数的缓存版本。输出确认了缓存版本的代码运行速度更快。实际上,从网页 www.google.com 提取内容第二次只用了 2 毫秒,而没有缓存则需要超过 5 秒钟。
进一步的改进应涉及并行下载网页。幸运的是,因为你使用了 LINQ 处理查询,所以只需要微小的代码更改就可以使用多线程。自 .NET 4.0 框架问世以来,LINQ 有一个扩展方法 AsParallel(),它能够启用 LINQ 的并行版本(或 PLINQ)。PLINQ 的本质是处理数据并行性;这两个主题将在第四章中介绍。
LINQ 和 PLINQ 是使用函数式编程概念设计和实现的,特别强调声明式编程风格。这是可行的,因为函数式范式与其他程序范式相比,往往能提高抽象级别。抽象允许编写代码时无需了解底层库的实现细节,正如这里所示。
列表 2.19 使用 PLINQ 的网络爬虫查询
var webPageTitles = from url in urls.AsParallel() ①
from pageContent in WebCrawlerMemoized(url)
select ExtractWebPageTitle(pageContent);
PLINQ 易于使用,并且可以带来实质性的性能提升。尽管我们只展示了 AsParallel 扩展方法,但它的内容远不止于此。
在运行程序之前,你还有一个重构需要应用——缓存。因为它们必须对所有线程可访问,所以缓存往往被设置为静态。随着并行性的引入,多个线程可以同时访问备忘录函数,这可能导致由于暴露的底层可变数据结构而引起的竞争条件问题。竞争条件问题在上一章中已有讨论。幸运的是,这是一个简单的修复,如本列表所示。
列表 2.20 线程安全的备忘录函数
public Func<T, R> MemoizeThreadSafe<T, R>(Func<T, R> func)
where T : IComparable
{
ConcurrentDictionary<T, R> cache = new ConcurrentDictionary<T, R>(); ①
return arg => cache.GetOrAdd(arg, a => func(a));
}
public Func<string, IEnumerable<string>> WebCrawlerMemoizedThreadSafe =
MemoizeThreadSafe<string, IEnumerable<string>>(WebCrawler);
var webPageTitles =
from url in urls.AsParallel()
from pageContent in WebCrawlerMemoizedThreadSafe(url) ②
select ExtractWebPageTitle(pageContent);
快速答案是替换当前的 Dictionary 集合为等效的线程安全版本 ConcurrentDictionary。这个重构有趣地需要更少的代码。接下来,你实现一个线程安全的备忘录版本的函数 GetWebContent,该函数用于 LINQ 表达式。现在你可以并行运行网络爬虫。为了处理示例中的页面,双核机器可以在不到 7 秒内完成分析,而初始实现需要 18 秒。升级后的代码不仅运行更快,还减少了网络 I/O 操作。
2.5 更好的性能的延迟备忘录
在前面的示例中,网络爬虫允许多个并发线程以最小的开销访问记忆化函数,但它并不能防止函数初始化器 func(a) 对相同的值被求值多次。这似乎是一个小问题,但在高度并发的应用程序中,这种情况会成倍放大(特别是当对象初始化成本高昂时)。解决方案不是向缓存中添加已初始化的对象,而是添加一个按需初始化条目的函数:你可以把函数初始化器的结果值包装在一个 Lazy 类型中(如列表 2.21 中用粗体突出显示)。该列表所示的记忆化方案在线程安全和性能方面是一个理想的设计,同时避免了缓存项的重复初始化。
列表 2.21 线程安全的延迟评估备忘录函数
static Func<T, R> MemoizeLazyThreadSafe<T, R>(Func<T, R> func)
where T : IComparable
{
ConcurrentDictionary<T, **Lazy<R>**> cache =
➥ new ConcurrentDictionary<T, **Lazy<R>**>(); ①
return arg => cache.GetOrAdd(arg, a =>
➥ new Lazy<R>(() => func(a))).Value;
}
根据微软的文档,GetOrAdd 方法不会阻止函数 func 对于相同的给定参数被多次调用,但它确实保证只将“函数评估的结果”添加到集合中。例如,在缓存值添加之前,可能有多个线程同时检查缓存。此外,没有方法可以强制函数 func(a) 是线程安全的。没有这个保证,在多线程环境中,多个线程可能同时访问同一个函数——这意味着 func(a) 也应该是线程安全的。提出的解决方案是避免使用原始锁,而是在 .NET 4.0 中使用 Lazy<T> 构造。这个解决方案提供了对函数 func 实现的完全线程安全保证,并确保函数只被评估一次。
2.5.1 函数记忆化的注意事项
在前面的代码示例中引入的记忆化实现是一种相当天真的方法。将数据存储在简单字典中的解决方案是可行的,但它不是长期解决方案。字典是无界的,因此项目永远不会从内存中移除,只会不断添加,这可能在某个时候导致内存泄漏问题。存在解决所有这些问题的方法。一个选项是实现一个使用 WeakReference 类型存储结果值的记忆化函数,这允许垃圾回收器(GC)在运行时回收这些结果。自从 .NET 4.0 框架引入 ConditionalWeakTable 集合以来,这种实现变得简单:该字典使用以弱引用持有的类型实例作为键。只要键存在,关联的值就会保留;当键被 GC 回收后,对数据的引用被移除,使其可以被回收。
弱引用是处理对托管对象引用的有价值机制。典型的对象引用(也称为强引用)具有确定性行为,只要你有对象的引用,垃圾回收器(GC)就不会收集该对象,从而使其保持存活状态。但在某些场景下,你希望在不干扰 GC 回收该对象内存能力的情况下,将一个不可见的字符串附加到对象上。如果 GC 回收了内存,你的字符串就会变得无关联,你可以检测到这一点。如果 GC 尚未接触该对象,你可以拉出字符串,并检索到对象的强引用以再次使用。这种功能对于自动管理缓存非常有用,它可以保持对最近最少使用对象的弱引用,同时防止它们被回收,从而不可避免地优化内存资源。
另一个选择是使用缓存过期策略,通过将时间戳存储在每个结果中,指示项目持久化的时间。在这种情况下,你必须定义一个常数时间来使项目无效。当时间到期时,项目将从集合中删除。本书的可下载源代码包含这两种实现。
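下面给出一个带过期策略的记忆化函数草图(假设性的实现,ttl 等参数仅为示意):缓存项带有时间戳,过期后在下次访问时重新计算并覆盖旧值。
using System;
using System.Collections.Concurrent;

static class MemoizeWithExpiration
{
    public static Func<T, R> Memoize<T, R>(Func<T, R> func, TimeSpan ttl)
    {
        var cache = new ConcurrentDictionary<T, (DateTime stamp, R value)>();
        return arg =>
        {
            if (cache.TryGetValue(arg, out var entry) &&
                DateTime.UtcNow - entry.stamp < ttl)
                return entry.value;                  // 未过期:直接返回缓存结果
            R result = func(arg);                    // 过期或缺失:重新计算
            cache[arg] = (DateTime.UtcNow, result);  // 用新的时间戳覆盖旧项
            return result;
        };
    }
}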
2.6 有效并发推测以分摊昂贵计算的成本
推测性 处理(预计算)是利用并发的良好理由。推测性处理是一种函数式编程(FP)模式,其中在算法实际运行之前,以及一旦函数的所有输入都可用时,执行计算。并发推测背后的想法是分摊昂贵计算的成本,并提高程序的性能和响应速度。这种技术在并行计算中很容易应用,可以使用多核硬件来预计算多个操作,从而启动并发运行的任务,并使数据准备好读取而无需延迟。
假设你被给了一个长的输入单词列表,并且你想计算一个函数,该函数可以找到列表中单词的最佳模糊匹配^(2)。对于模糊匹配算法,你将应用Jaro-Winkler 距离,该距离衡量两个字符串之间的相似性。我们不会在这里介绍该算法的实现。你可以在在线源代码中找到完整的实现。
此列表显示了使用 Jaro-Winkler 算法实现模糊匹配函数(如粗体所示)。
列表 2.22 在 C#中实现模糊匹配
public static string FuzzyMatch(List<string> words, string word)
{
var wordSet = new HashSet<string>(words); ①
string bestMatch =
(from w in wordSet.**AsParallel**() ②
select JaroWinklerModule.Match(w, word))
.OrderByDescending(w => w.Distance)
.Select(w => w.Word)
.FirstOrDefault();
return bestMatch; ③
}
函数FuzzyMatch使用 PLINQ 并行计算传递给函数的单词与另一个字符串数组之间的模糊匹配。结果是匹配的HashSet集合,然后按最佳匹配顺序排列,以返回列表中的第一个值。"HashSet"是一种高效的数据结构,用于查找。
逻辑类似于查找。因为List<string> words可能包含重复项,所以函数首先实例化一个更有效的数据结构。然后函数利用这个数据结构来运行实际的模糊匹配。这种实现并不高效,因为设计问题很明显:"FuzzyMatch"每次调用时都应用于其两个参数。每次执行"FuzzyMatch"时都会重建内部表结构,浪费了任何积极的效果。
你如何提高效率?通过应用部分函数应用或部分应用以及来自函数式编程(FP)的备忘录技术,你可以实现预计算。有关部分应用的更多详细信息,请参阅附录 A。预计算的概念与备忘录紧密相关,在这种情况下,它使用包含预计算值的表格。下面的列表显示了实现一个更快的模糊匹配函数(如粗体所示)。
列表 2.23 使用预计算进行快速模糊匹配
static Func<string, string> PartialFuzzyMatch(List<string> words) ①
{
var wordSet = new HashSet<string>(words); ②
return word =>
(from w in wordSet.**AsParallel**()
select JaroWinklerModule.Match(w, word))
.OrderByDescending(w => w.Distance)
.Select(w => w.Word)
.FirstOrDefault(); ③
}
Func<string, string> fastFuzzyMatch = ➥ PartialFuzzyMatch(words); ④
string magicFuzzyMatch = fastFuzzyMatch("magic");
string lightFuzzyMatch = fastFuzzyMatch("light"); ⑤
首先,你创建了一个函数 PartialFuzzyMatch 的偏应用版本。这个新函数只接受 List<string> words 作为参数,并返回一个新的函数来处理第二个参数。这是一个巧妙的策略,因为它通过预计算高效的查找结构,立即消耗第一个参数。
有趣的是,编译器使用闭包来存储数据结构,该数据结构可以通过函数返回的 lambda 表达式访问。lambda 表达式是提供预计算状态给函数的一种特别方便的方式。然后,你可以通过提供参数 List<string> words 来定义 fastFuzzyMatch 函数,该参数用于准备底层查找表,从而实现更快的计算。在提供 List<string> words 之后,fastFuzzyMatch 返回一个接受字符串参数 word 的函数,但立即计算用于查找的 HashSet。
通过这些更改,与字符串 magic 和 light 进行模糊匹配时的处理时间比按需计算这些值时减少了半。
2.6.1 使用自然函数支持进行预计算
现在让我们看看使用函数式语言 F# 的相同模糊匹配实现。列表 2.24 显示了一个略有不同的实现,这是由于 F# 的内在函数语义(AsParallel 方法以粗体突出显示)。
列表 2.24 在 F# 中实现快速模糊匹配
let fuzzyMatch (words:string list) =
let wordSet = new HashSet<string>(words) ①
let partialFuzzyMatch word = ②
query { for w in wordSet.**AsParallel**() do
select (JaroWinkler.getMatch w word) }
|> Seq.sortBy(fun x -> -x.Distance)
|> Seq.head
fun word -> partialFuzzyMatch word ③
let fastFuzzyMatch = fuzzyMatch words ④
let magicFuzzyMatch = fastFuzzyMatch "magic"
let lightFuzzyMatch = fastFuzzyMatch "light" ⑤
fuzzyMatch 的实现迫使 F# 运行时在每次调用时生成内部字符串集合。相反,偏应用函数 fastFuzzyMatch 只初始化一次内部集合,并重用于所有后续调用。预计算是一种缓存技术,它执行初始计算以创建,在这种情况下,一个 HashSet<string>,以便可以访问。
F# 实现使用查询表达式来查询和转换数据。这种方法允许你使用与 列表 2.23 中等效的 C# 中的 PLINQ。但在 F# 中,有一个更函数式的风格来并行化序列上的操作——采用并行序列 (PSeq)。使用此模块,可以将 fuzzyMatch 函数重写为组合形式:
let fuzzyMatch (words:string list) =
let wordSet = new HashSet<string>(words)
fun word ->
wordSet
|> PSeq.map(fun w -> JaroWinkler.getMatch w word)
|> PSeq.sortBy(fun x -> -x.Distance)
|> Seq.head
fuzzyMatch 在 C# 和 F# 中的代码实现是等效的,但前者作为默认值是柯里化的。这使得使用偏应用进行重构变得更容易。在前面代码片段中使用的 F# 并行序列 PSeq 在第五章中介绍。
通过查看 fuzzyMatch 签名类型,可以更清晰地理解:
`string list -> (string -> string)`
该签名读作:一个接受字符串列表作为参数的函数,返回一个接受字符串作为参数、并以字符串为返回类型的函数。这样的函数链使你可以自然而然地利用偏应用策略。
2.6.2 让最佳计算获胜
另一个推测性求值的例子受 Conal Elliott 为其函数式响应式编程(FRP)实现(conal.net)创建的无歧义选择操作符 unamb^(3) 的启发。这个操作符背后的想法很简单:它是一个接受两个参数并并发求值它们的函数,返回第一个可用的结果。
这个概念可以扩展到两个以上的并行函数。想象一下,你正在使用多个天气服务来检查一个城市的温度。你可以同时启动单独的任务来查询每个服务,在最快任务返回后,你不需要等待其他任务完成。函数等待最快任务返回并取消剩余的任务。以下列表展示了没有错误处理支持的一个简单实现。
列表 2.25 实现最快的天气任务
public Temperature SpeculativeTempCityQuery(string city,
➥ params Uri[] weatherServices)
{
var cts = new CancellationTokenSource(); ①
var tasks =
(from uri in weatherServices
select Task.Factory.StartNew<Temperature>(() =>
queryService(uri, city), cts.Token)).ToArray(); ②
int taskIndex = Task.WaitAny(tasks); ③
Temperature tempCity = tasks[taskIndex].Result;
cts.Cancel(); ④
return tempCity;
}
预计算是实现任何类型函数和服务的关键技术,从简单到复杂,再到更高级的计算引擎。推测性评估旨在消耗那些本将闲置的 CPU 资源。这是一种在任何程序中都方便的技术,并且可以在支持闭包捕获和暴露这些部分值的任何语言中实现。
2.7 懒惰是好事
并发中的一个常见问题是能够以线程安全的方式正确初始化一个共享对象。当对象具有昂贵且耗时的结构时,这种需求变得更加突出,以提高应用程序的启动时间。
惰性评估是一种编程技术,用于将表达式的评估推迟到最后一刻,即它被访问时。信不信由你,懒惰可以导致成功——在这种情况下,它是你的工具箱中的必备工具。有些反直觉,惰性评估的力量使程序运行得更快,因为它只提供查询结果所需的,防止过度计算。想象一下编写一个程序,它执行不同的长时间运行的操作,可能分析大量数据以生成各种报告。如果这些操作同时评估,系统可能会遇到性能问题并挂起。此外,可能并非所有这些长时间运行的操作都是立即必要的,如果它们立即开始,可能会造成资源和时间上的浪费。
一个更好的策略是在需要时才执行长时间运行的操作,并且仅当需要时,这也有助于减少系统中的内存压力。实际上,延迟评估也导致高效的内存管理,由于内存消耗降低,从而提高性能。在这种情况下,懒惰是更有效率的。在受管理的编程语言(如 C#、Java 和 F#)中减少不必要的和昂贵的垃圾收集清理,可以使程序运行得更快。
2.7.1 用于理解并发行为的严格语言
与延迟评估相反的是急切评估,也称为严格评估,这意味着表达式会立即被评估。C#和 F#以及大多数其他主流编程语言都是严格语言。
命令式编程语言没有内部模型来包含和控制副作用,因此它们被急切地评估是合理的。为了理解程序如何执行,严格评估的语言必须知道副作用(如 I/O)运行的顺序,这使得理解程序执行变得容易。实际上,严格语言可以分析计算,并对必须完成的工作有一个大致的了解。
由于 C#和 F#都不是纯函数式编程语言,因此不能保证每个值都是引用透明的;因此,它们不能是延迟评估的编程语言。
通常,延迟评估难以与命令式特性混合,因为命令式特性有时会引入副作用,例如异常和 I/O 操作,因为操作顺序变得非确定性。有关更多信息,我推荐阅读 John Hughes 的《Why Functional Programming Matters》(mng.bz/qp3B)。
在函数式编程(FP)中,延迟评估和副作用不能共存。尽管在命令式语言中添加延迟评估的概念是可能的,但与副作用的结合会使程序变得复杂。实际上,延迟评估迫使开发者根据程序哪些部分被评估来移除执行顺序的约束和依赖。编写带有副作用的程序可能会变得困难,因为它需要函数执行顺序的概念,这减少了代码模块化和组合性的机会。函数式编程旨在明确副作用,了解它们,并提供工具来隔离和控制它们。例如,Haskell 使用函数式编程语言的约定,用IO类型标识带有副作用的函数。以下是一个 Haskell 函数定义,它读取文件,导致副作用:
readFile :: FilePath -> **IO** String
这个明确的定义通知编译器存在副作用,然后编译器根据需要应用优化和验证。
懒加载在多核和多线程程序中成为一个重要的技术。为了支持这项技术,Microsoft(从 Framework 4.0 开始)引入了一个名为Lazy<T>的泛型类型构造函数,它简化了以线程安全的方式延迟创建对象的初始化。以下是懒对象Person的定义。
列表 2.26 Person 对象的懒初始化
class Person { ①
public readonly string FullName; ②
public Person(string firstName, string lastName)
{
FullName = firstName + " " + lastName;
Console.WriteLine(FullName);
}
}
Lazy<Person> fredFlintstone = new Lazy<Person>(() =>
➥ new Person("Fred", "Flintstone"), true); ③
Person[] freds = new Person[5]; ④
for(int i = 0;i < freds.Length;i++)
freds[i] = fredFlintstone.Value; ⑤
// output
Fred Flintstone
在示例中,你定义了一个简单的Person类,它有一个只读字段,这也导致FullName在控制台上打印。然后,你通过向Lazy<Person>提供工厂委托来为这个对象创建一个懒初始化器,该委托负责对象实例化。在这种情况下,使用 lambda 表达式代替工厂委托是方便的。图 2.4 展示了这一点。

图 2.4 Person 对象的值仅在第一次访问Value属性时初始化。后续调用返回相同的缓存值。如果你有一个Lazy<Person>对象的数组,当访问数组中的项目时,只有第一个被初始化。其他的将重用缓存结果。
当需要实际求值表达式以使用底层对象 Person 时,你访问标识符上的 Value 属性,这会迫使 Lazy 对象的工厂委托只执行一次(如果值尚未实现)。无论连续调用多少次,或有多少线程同时访问懒加载初始化器,它们都会得到同一个实例。为了证明这一点,列表创建了一个包含五个 Person 的数组,在 for 循环中进行初始化。在每次迭代中,通过调用标识符属性 Value 来检索 Person 对象;即使 Value 被调用了五次,构造函数也只执行一次,因此输出(Fred Flintstone)只打印一次。
2.7.2 懒加载技术及线程安全的单例模式
在 .NET 中,懒加载被认为是一种缓存技术,因为它会记住已执行的操作的结果,程序可以通过避免重复和重复的操作来运行得更高效。
因为执行操作是在需要时进行的,更重要的是,只进行一次,所以Lazy<T>结构是推荐用来实现单例模式的机制。单例模式创建给定资源的单个实例,该实例在代码的多个部分中共享。这个资源只需要初始化一次,即第一次访问时,这正是Lazy<T>的行为。
在 .NET 中,你有不同的方法来实现单例模式,但其中某些技术存在局限性,例如无法保证线程安全,或失去惰性实例化。^(4) Lazy<T> 结构提供了一个更好、更简单的单例设计,它确保了真正的惰性加载和线程安全,如下所示。
列表 2.27 使用 Lazy<T> 的单例模式
public sealed class Singleton
{
private static readonly Lazy<Singleton> lazy =
new Lazy<Singleton>(() => new Singleton(), true); ①
public static Singleton Instance => lazy.Value;
private Singleton()
{ }
}
Lazy<T> 原始类型还接受一个布尔标志,作为 lambda 表达式之后的可选参数,以启用线程安全行为。这实现了一个复杂且轻量级的双重检查锁定模式。
此属性保证了对象的初始化是线程安全的。当标志被启用时(这是默认模式),无论多少线程访问 Singleton.Instance,所有线程都将收到相同的实例,该实例在第一次调用后被缓存。这是一个巨大的优势;没有它,你就必须手动保护并确保共享字段的线程安全。
重要的是要强调,如果懒计算的对象实现是线程安全的,这并不意味着它的所有属性也都是线程安全的。
2.7.3 F# 中的懒支持
F# 支持与 Lazy<T> 类型相同的类型,并增加了懒计算功能,该功能类型为 Lazy<T>,其中用于 T 的实际泛型类型由表达式的结果确定。F# 标准库自动强制互斥,因此当从不同的线程同时强制相同的懒值时,纯函数代码是线程安全的。F# 对 Lazy 类型的使用与 C# 略有不同,在 C# 中,你将函数包装在 Lazy 数据类型周围。此代码示例展示了 F# 中 Lazy 计算一个 Person 对象:
let barneyRubble = lazy( Person("barney", "rubble") )
printfn "%s" (barneyRubble.Force().FullName)
函数 barneyRubble 创建了一个 Lazy<Person> 实例,其值尚未实现。然后,为了强制计算,你调用 Force 方法来按需检索值。
2.7.4 懒和 Task,强大的组合
由于性能和可扩展性的原因,在并发应用程序中,结合一个可以独立线程按需执行的懒计算非常有用。Lazy<T> 初始化器可以用来实现一个有用的模式,用于实例化需要异步操作的对象。让我们考虑在前一节中使用过的 Person 类。如果第一个和第二个名字字段是从数据库加载的,你可以应用类型 Lazy<Task<Person>> 来延迟 I/O 计算。有趣的是,在 Task<T> 和 Lazy<T> 之间存在共性:两者都恰好评估一次给定的表达式。
列表 2.29 初始化 Person 对象的懒异步操作
Lazy<Task<Person>> person =
new Lazy<Task<Person>>(async () => ①
{
using (var cmd = new SqlCommand(cmdText, conn))
using (var reader = await cmd.ExecuteReaderAsync())
{
if (await reader.ReadAsync())
{
string firstName = reader["first_name"].ToString();
string lastName = reader["last_name"].ToString();
return new Person(firstName, lastName);
}
}
throw new Exception("Failed to fetch Person");
});
async Task<Person> FetchPerson()
{
return await person.Value; ②
}
在此示例中,委托返回一个 Task<Person>,它异步确定值一次,并将值返回给所有调用者。这些是最终提高程序可扩展性的设计。在示例中,此功能使用 async-await 关键字(在 C# 5.0 中引入)实现异步操作。第八章详细介绍了异步性和可扩展性的主题。
这是一个有用的设计,可以提高程序的可扩展性和并行性。但存在一个微妙的风险:因为 lambda 表达式是异步的,它可以在任何首次访问 Value 的线程上执行,并在该线程的上下文中运行。更好的解决方案是把表达式包装在一个底层的 Task 中,这将强制异步操作在线程池线程上执行。此列表显示了首选模式。
列表 2.30 更好的模式
Lazy<Task<Person>> person =
new Lazy<Task<Person>>(() => Task.Run(
async () =>
{
using (var cmd = new SqlCommand(cmdText, conn))
using (var reader = await cmd.ExecuteReaderAsync())
{
if(await reader.ReadAsync())
{
string firstName = reader["first_name"].ToString();
string lastName = reader["last_name"].ToString();
return new Person(firstName, lastName);
} else throw new Exception("No record available");
}
}
));
摘要
-
函数组合将一个函数的结果应用于另一个函数的输入,创建一个新的函数。在 FP 中,你可以通过将复杂问题分解为更小、更简单的问题来解决复杂问题,这些问题更容易解决,然后最终将这些解决方案组合在一起。
-
闭包是其父方法中内联的委托/匿名方法,其中可以在匿名方法内部引用父方法体中定义的变量。闭包提供了一种方便的方式,即使超出作用域,也能让函数访问局部状态(该状态被封装在函数中)。它是设计包含记忆化、延迟初始化和预计算以提高计算速度的函数式编程代码段的基础。
-
记忆化是一种函数式编程技术,它维护中间计算的结果,而不是重新计算它们。它被认为是一种缓存形式。
-
预计算是一种执行初始计算的技术,该计算生成一系列结果,通常以查找表的形式。这些预计算值可以直接从算法中使用,以避免每次代码执行时进行不必要的、重复的且昂贵的计算。通常,预计算取代了记忆化,并与部分应用函数结合使用。
-
懒初始化是缓存的另一种变体。具体来说,这种技术将工厂函数的计算延迟到对象实例化所需时,仅创建一次对象。懒初始化的主要目的是通过减少内存消耗和避免不必要的计算来提高性能。
3
函数式数据结构和不可变性
本章涵盖
-
使用函数式数据结构构建并行应用程序
-
使用不可变性实现高性能、无锁的代码
-
使用函数式递归实现并行模式
-
在 C#和 F#中实现不可变对象
-
与树数据结构一起工作
数据以多种形式存在。因此,许多计算机程序围绕两个主要约束组织起来并不令人惊讶:数据和数据处理。函数式编程很好地融入了这个世界,因为从很大程度上说,这种编程范式是关于数据转换的。函数式转换允许你将一组结构化数据从其原始形式转换为另一种形式,而无需担心副作用或状态。例如,你可以使用映射函数将一组国家转换为城市集合,同时保持初始数据不变。副作用是并发编程的一个关键挑战,因为一个线程中产生的效果可能会影响另一个线程的行为。
在过去的几年里,主流编程语言添加了新功能,使多线程应用程序的开发变得更加容易。例如,微软已经将 TPL 和async/await关键字添加到.NET 框架中,以减少程序员在实现并发代码时的担忧。但是,当涉及多个线程时,仍然存在保护可变状态免受损坏的挑战。好消息是,FP 让你能够编写无副作用的代码来转换不可变数据。
在本章中,你将学习如何使用函数式数据结构和不可变状态编写并发代码,在并发环境中采用合适的数据结构来轻松提高性能。函数式数据结构通过在线程之间共享数据结构并在无需同步的情况下并行运行来提高性能。
作为本章的第一步,你将使用 C#和 F#开发一个函数式列表。这些练习对于理解不可变函数式数据结构的工作方式非常有用。接下来,我们将介绍不可变树数据结构,你将学习如何在 FP 中使用递归并行构建二叉树结构。在示例中,并行递归用于同时从网络上下载多个图像。
到本章结束时,你将利用不可变性和函数式数据结构来并行运行程序,从而更快地运行,避免共享可变状态的陷阱,如竞态条件。换句话说,如果你想实现并发和强正确性保证,你必须放弃修改。
3.1 现实世界示例:寻找线程不安全对象
在受控环境中构建软件通常不会导致不愉快的惊喜。不幸的是,如果你在本地机器上编写的程序被部署到不受你控制的服务器上,这可能会引入不同的变量。在生产环境中,程序可能会遇到未预料到的问题和不可预测的重负载。我相信在你的职业生涯中,你不止一次听说过,“在我的机器上它运行正常。”
当软件上线时,多个因素可能会出错,导致程序行为不可靠。不久前,我的老板打电话让我分析一个生产问题。该应用程序是一个用于客户支持的简单聊天系统。程序使用 Web 套接字,从前端直接与用 C# 编写的服务器端中心(hub)通信。建立客户端和服务器之间双向通信的底层技术是 Microsoft SignalR (mng.bz/Fal1)。参见图 3.1。
在部署到生产环境之前,程序已经通过了所有测试。然而,一旦部署,服务器的资源就承受了压力。CPU 使用率持续在 85%到 95%的容量之间,由于阻止系统对传入请求做出响应,从而对整体性能产生了负面影响。结果是不可接受的,问题需要迅速解决。

图 3.1 使用 SignalR 中心点的 Web 服务器聊天应用程序的架构。连接的客户端注册在本地静态字典(查找表)中,其实例是共享的。
正如夏洛克·福尔摩斯所说,“当你排除了所有不可能的情况,无论多么不可能,剩下的,必然是真相。”我戴上我的超级侦探帽,然后,使用一个宝贵的视角,我开始审视代码。经过调试和调查,我发现了导致瓶颈的代码部分。
我使用了一个性能分析工具来分析应用程序的性能。对应用程序进行采样和性能分析是寻找应用程序瓶颈的好起点。性能分析工具在程序运行时进行采样,检查执行时间以检查常规数据。收集到的数据是应用程序中执行最多工作的单个方法的统计性能分析表示。最终报告显示了这些方法,可以通过查找热点路径(mng.bz/agzj)来检查,大多数应用程序的工作都在这里执行。
高 CPU 核心利用率问题源于OnConnected和OnDisconnected方法中共享状态的竞争。在这种情况下,共享状态是一个通用的Dictionary类型,用于在内存中保持连接的用户。线程竞争是一种条件,其中一个线程正在等待另一个线程持有的对象被释放。等待的线程无法继续执行,直到另一个线程释放对象(它被锁定)。下面的列表显示了有问题的服务器代码。
列表 3.1 C#中的 SignalR 中心点,用于在上下文中注册连接
static Dictionary<Guid, string> onlineUsers =
new Dictionary<Guid, string>(); ①
public override Task OnConnected() {
Guid connectionId = new Guid (Context.ConnectionId); ②
System.Security.Principal.IPrincipal user = Context.User;
string userName;
if (!onlineUsers.TryGetValue(connectionId, out userName)){ ③
RegisterUserConnection (connectionId, user.Identity.Name);
onlineUsers.Add(connectionId, user.Identity.Name); ④
}
return base.OnConnected();
}
public override Task OnDisconnected() {
Guid connectionId = new Guid (Context.ConnectionId);
string userName;
if (onlineUsers.TryGetValue(connectionId, out userName)){ ③
DeregisterUserConnection(connectionId, userName);
onlineUsers.Remove(connectionId); ④
}
return base.OnDisconnected();
}
操作 OnConnected 和 OnDisconnected 依赖于一个共享的全局字典,在这些类型的程序中共同使用以维护本地状态。请注意,每次执行这些方法之一时,底层集合会被调用两次。程序逻辑检查 用户连接 ID 是否存在,并据此应用一些行为:
string userName;
if (!onlineUsers.TryGetValue(connectionId, out userName)){
你能看出问题吗?对于每个新的客户端请求,都会建立一个新连接,并创建一个新的 hub 实例。本地状态由一个静态变量维护,它跟踪当前用户连接,并由 hub 的所有实例共享。根据微软的文档,“静态构造函数只被调用一次,静态类在程序所在的程序域的生命周期内保持内存中。”^(1)
这里是用于用户连接跟踪的集合:
static Dictionary<Guid, string> onlineUsers = new Dictionary<Guid, string>();
Guid 是 SignalR 在客户端和服务器之间建立连接时创建的唯一连接标识符。该字符串表示在登录期间定义的用户名称。在这种情况下,程序显然是在一个多线程环境中运行的。每个传入请求都是一个新线程;因此,将会有多个请求同时访问共享状态,这最终会导致多线程问题。
MSDN 文档在这方面很明确。它说,只要集合没有被修改,Dictionary 集合可以支持并发多个读取者。^(2) 遍历集合本身不是线程安全的,因为一个线程可能在另一个线程更改集合状态的同时更新字典。
存在几种可能的解决方案来避免这种限制。第一种方法是使集合线程安全,并允许多个线程通过 lock 原语进行 read 和 write 操作。这个解决方案是正确的,但会降低性能。
更好的替代方案是在不进行同步的情况下达到相同的线程安全级别;例如,使用不可变集合。
3.1.1 .NET 不可变集合:一个安全解决方案
微软在 .NET Framework 4.5 中引入了不可变集合,位于 System.Collections.Immutable 命名空间中。这是在 .NET 4.0 中的 TPL 之后和 .NET 4.5 之后的 async 和 await 关键字之后的线程工具演变的一部分。
不可变集合遵循本章中介绍的函数式范式概念,并在多线程应用程序中提供隐式线程安全,以克服维护和控制可变状态带来的挑战。类似于并发集合,它们也是线程安全的,但底层实现不同。任何更改数据结构的操作都不会修改原始实例。相反,它们返回一个更改后的副本,并保持原始实例不变。不可变集合已经针对最大性能进行了大量优化,并使用结构共享^(3)模式来最小化垃圾收集器(GC)的需求。例如,以下代码片段从一个泛型可变集合创建一个不可变集合(不可变命令以粗体显示)。然后,通过更新集合以添加新项目,创建一个新的集合,而原始集合不受影响:
var original = new Dictionary<int, int>().**ToImmutableDictionary**();
var modifiedCollection = original.Add(key, value);
在一个线程中对集合的任何更改对其他线程都是不可见的,因为它们仍然引用原始未修改的集合,这也是不可变集合天生线程安全的原因。
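下面的小例子验证了这一语义(示意代码):对不可变字典调用 Add 会返回新实例,而原实例保持不变。
using System;
using System.Collections.Immutable;

class ImmutabilityDemo
{
    static void Main()
    {
        var original = ImmutableDictionary<int, string>.Empty;
        var modified = original.Add(1, "one");   // 返回新字典,内部与原字典共享结构

        Console.WriteLine(original.Count);   // 0:原实例未被修改
        Console.WriteLine(modified.Count);   // 1
    }
}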
表 3.1 展示了为每个相关的可变泛型集合实现的一个不可变集合的实现。
表 3.1 .NET Framework 4.5 的不可变集合
| 不可变集合 | 可变集合 |
|---|---|
| ImmutableList<T> | List<T> |
| ImmutableDictionary<TKey, TValue> | Dictionary<TKey, TValue> |
| ImmutableHashSet<T> | HashSet<T> |
| ImmutableStack<T> | Stack<T> |
| ImmutableQueue<T> | Queue<T> |
这里有两种创建不可变列表的方法。
列表 3.2 构建.NET 不可变集合
var list = ImmutableList.Create<int>(); ①
list = list.Add(1); ②
list = list.Add(2);
list = list.Add(3);
var builder = ImmutableList.CreateBuilder<int>(); ③
builder.Add(1); ④
builder.Add(2);
builder.Add(3);
list = builder.ToImmutable(); ⑤
第二种方法通过创建一个临时列表构建器来简化列表的构建,该构建器用于向列表添加元素,然后将元素密封(冻结)到不可变结构中。
关于原始聊天程序中的数据损坏(竞态条件)问题,不可变集合可以在服务器中心用于维护打开的 SignalR 连接的状态,并且可以在多线程访问下安全地完成。幸运的是,System.Collections.Immutable 命名空间包含用于查找的 Dictionary 的等效版本:ImmutableDictionary。
你可能会问,“但如果集合是不可变的,它是如何更新的同时保持线程安全的?”你可以在涉及读取或写入集合的操作周围使用锁语句。使用锁构建线程安全的集合很简单;但这是一种比所需更昂贵的方法。更好的选择是使用单个比较和交换(CAS)操作来保护写入,这消除了对锁的需求,并使读取操作不受保护。这种无锁技术比对应技术(使用同步原语)更可扩展,性能更好。
CAS 操作
CAS 是一种在多线程编程中使用的特殊指令,作为同步的一种形式,它以原子方式对内存位置执行操作。原子操作要么作为一个单元成功,要么失败。
原子性 指的是在单步中改变状态的操作,使得结果自主,观察结果要么是完成要么是没有完成,没有中间状态。其他并行线程只能看到旧状态或新状态。当一个原子操作在一个共享变量上执行时,线程无法观察到其修改直到它完成。实际上,原子操作读取的是在某一时刻出现的值。原始的原子操作是机器指令,可以通过 .NET 中的 System.Threading.Interlocked 类暴露,例如 Interlocked.CompareExchange 和 Interlocked.Increment 方法。
CAS 指令在不需要获取和释放锁的情况下修改共享数据,并允许极高的并行级别。这正是不可变数据结构真正发光的地方,因为它们最小化了发生 ABA 问题的可能性(en.wikipedia.org/wiki/ABA_problem)。
理念是将必须改变的状态包含在一个单一且最重要的是隔离的不可变对象(在这种情况下,是 ImmutableDictionary)中。因为对象是隔离的,所以没有状态共享;因此,没有需要同步的内容。
下面的列表展示了名为 Atom 的辅助对象的实现。这个名字受到了 Clojure 原子(clojure.org/reference/atoms)的启发,它内部使用 Interlocked.CompareExchange 操作符来执行原子 CAS 操作。
列表 3.3 执行 CAS 指令的 Atom 对象
public sealed class Atom<T> where T : class ①
{
public Atom(T value)
{
this.value = value;
}
private volatile T value;
public T Value => value; ②
public T Swap(Func<T, T> factory) ③
{
T original, temp;
do {
original = value;
temp = factory (original);
}
while (Interlocked.CompareExchange(ref value, temp, original)
➥ != original); ④
return original;
}
}
Atom 类封装了一个标记为 volatile 的类型 T 的引用对象,为了实现正确的值交换行为,该对象必须是不可变的。Value 属性用于读取包装对象的当前状态。Swap 函数的目的是执行 CAS 指令,通过 factory 委托将基于前一个值的新值传递给此函数的调用者。CAS 操作接受一个旧值和一个新值,并且只有当当前值等于传入的旧值时,才原子地将 Atom 设置为新值。如果 Swap 函数无法使用 Interlocked.CompareExchange 设置新值,它将继续重试,直到成功。
列表 3.4 展示了如何在 SignalR 服务器端点的上下文中使用 Atom 类和 ImmutableDictionary 对象。代码仅实现了 OnConnected 方法。同样的概念也适用于 OnDisconnected 函数。
列表 3.4 使用 Atom 对象的线程安全 ImmutableDictionary
Atom<ImmutableDictionary<Guid, string>> onlineUsers =
new Atom<ImmutableDictionary<Guid, string>>
(ImmutableDictionary<Guid, string>.Empty); ①
public override Task OnConnected() {
Guid connectionId = new Guid (Context.ConnectionId);
System.Security.Principal.IPrincipal user = Context.User;
var temp = onlineUsers.Value; ②
if(onlineUsers.Swap(d => { ③
if (d.ContainsKey(connectionId)) return d;
return d.Add(connectionId, user.Identity.Name);
}) != temp) { ④
RegisterUserConnection (connectionId, user.Identity.Name);
}
return base.OnConnected();
}
Atom Swap方法封装了对底层ImmutableDictionary的更新调用。Atom Value属性可以在任何时候访问,以检查当前打开的 SignalR 连接。此操作是线程安全的,因为它只读。Atom类是泛型的,它可以用来原子地更新任何类型。但是不可变集合有一个专门的辅助类(将在下文中描述)。
ImmutableInterlocked类
由于您需要以线程安全的方式更新不可变集合,Microsoft 引入了ImmutableInterlocked类,该类可在System.Collections.Immutable命名空间中找到。此类提供了一组函数,用于处理使用之前提到的 CAS 机制更新不可变集合。它公开了与Atom对象相同的功能。在此列表中,ImmutableDictionary替换了Dictionary。
列表 3.5 使用ImmutableDictionary维护打开连接的中心
static ImmutableDictionary<Guid, string> onlineUsers =
ImmutableDictionary<Guid, string>.Empty; ①
public override Task OnConnected() {
Guid connectionId = new Guid (Context.ConnectionId);
System.Security.Principal.IPrincipal user = Context.User;
if(ImmutableInterlocked.TryAdd (ref onlineUsers,
➥ connectionId, user.Identity.Name)) { ②
RegisterUserConnection (connectionId, user.Identity.Name);
}
return base.OnConnected();
}
public override Task OnDisconnected() {
Guid connectionId = new Guid (Context.ConnectionId);
string userName;
if(ImmutableInterlocked.TryRemove (ref onlineUsers,
➥ connectionId, out userName)) { ③
DeregisterUserConnection(connectionId, userName);
}
return base.OnDisconnected();
}
更新ImmutableDictionary是原子性的,这意味着在这种情况下,只有当用户连接不存在时才会添加。随着这一变化,SignalR 中心工作正常且无锁,服务器 CPU 利用率没有大幅上升。但是,使用不可变集合进行频繁更新的代价是存在的。例如,使用ImmutableInterlocked将 100 万用户添加到ImmutableDictionary所需的时间是 2.518 秒。这个值在大多数情况下可能是可接受的,但如果您旨在构建一个高性能的系统,那么进行研究和采用正确的工具进行工作是非常重要的。
通常,不可变集合的使用非常适合不同线程之间的共享状态,当更新次数较低时。它们的值(状态)保证是线程安全的;它可以在额外的线程之间安全地传递。如果您需要一个必须同时处理许多更新的集合,则更好的解决方案是利用.NET 并发集合。
3.1.2 .NET 并发集合:一个更快的解决方案
在.NET 框架中,System.Collections.Concurrent命名空间提供了一组线程安全的集合,旨在简化对共享数据的线程安全访问。并发集合是可变的集合实例,旨在提高多线程应用程序的性能和可伸缩性。由于它们可以同时被多个线程安全地访问和更新,因此建议在多线程程序中使用它们,而不是System.Collections.Generic中类似集合。表 3.2 显示了.NET 中可用的并发集合。
表 3.2 并发集合详细信息
| 并发集合 | 实现细节 | 同步技术 |
|---|---|---|
| ConcurrentBag<T> | 类似于泛型列表 | 如果检测到多个线程,则使用原始监视器协调它们的访问;否则,避免同步。 |
| ConcurrentStack<T> | 使用单链表实现的泛型栈 | 使用 CAS 技术实现无锁。 |
| ConcurrentQueue<T> | 使用数组段链表实现的泛型队列 | 使用 CAS 技术实现无锁。 |
| ConcurrentDictionary<K, V> | 使用哈希表实现的泛型字典 | 读取操作无锁;更新操作使用锁同步。 |
回到“寻找线程不安全对象”的 SignalR hub 示例,ConcurrentDictionary 比不安全的 Dictionary 更好;而且由于更新频繁且量大,它也比 ImmutableDictionary 更好。实际上,System.Collections.Concurrent 在设计上就使用了细粒度锁^(5)和无锁(lock-free)模式来提高性能。这些技术确保访问并发集合的线程被阻塞的时间最短,或者在某些情况下完全避免阻塞。
ConcurrentDictionary 可以在每秒处理多个请求的同时确保可伸缩性。你可以像使用传统的泛型 Dictionary 一样用方括号索引来赋值和检索值,但 ConcurrentDictionary 还提供了一些并发友好的方法,例如 AddOrUpdate 和 GetOrAdd。AddOrUpdate 方法接受一个键、一个值参数,以及一个委托。如果键不在字典中,就用值参数插入;如果键在字典中,则调用委托,并用其结果值更新字典。你在委托中提供的操作同样以线程安全的方式执行,这消除了在你读取一个值和写回另一个值之间字典被其他线程更改的风险。
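这两个 API 的典型用法如下面的草图所示(键与值仅为示意):
using System;
using System.Collections.Concurrent;

class ConcurrentDictionaryDemo
{
    static void Main()
    {
        var visits = new ConcurrentDictionary<string, int>();

        // 键不存在则插入 1;存在则用委托基于旧值计算新值,整个操作线程安全
        visits.AddOrUpdate("home", 1, (key, oldValue) => oldValue + 1);
        visits.AddOrUpdate("home", 1, (key, oldValue) => oldValue + 1);

        // 键不存在时才调用工厂委托生成值
        int count = visits.GetOrAdd("about", key => 0);

        Console.WriteLine(visits["home"]);   // 2
        Console.WriteLine(count);            // 0
    }
}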
在以下列表中,ConcurrentDictionary在 SignalR hub 中保持打开连接的状态。
列表 3.6 使用ConcurrentDictionary维护打开的连接
static ConcurrentDictionary<Guid, string> onlineUsers =
new ConcurrentDictionary<Guid, string>(); ①
public override Task OnConnected() {
Guid connectionId = new Guid (Context.ConnectionId);
System.Security.Principal.IPrincipal user = Context.User;
if(onlineUsers.TryAdd(connectionId, user.Identity.Name)) { ②
RegisterUserConnection (connectionId, user.Identity.Name);
}
return base.OnConnected();
}
public override Task OnDisconnected() {
Guid connectionId = new Guid (Context.ConnectionId);
string userName;
if(onlineUsers.TryRemove (connectionId, out userName)) { ③
DeregisterUserConnection(connectionId, userName);
}
return base.OnDisconnected();
}
代码看起来与使用 ImmutableDictionary 的代码列表(列表 3.5)相似,但在添加和删除大量连接时性能更快。例如,将 100 万用户添加到 ConcurrentDictionary 只需要 52 毫秒,而 ImmutableDictionary 需要 2.518 秒。这个值在许多情况下可能已经足够好,但如果您想构建一个高性能的系统,那么研究和采用适合该工作的工具是非常重要的。
You need to understand how these collections work. At first glance, because of their mutable nature, it may seem that using these collections doesn't embrace any FP style. But the collections create an internal snapshot that simulates temporary immutability, preserving thread safety during iteration and allowing the snapshot to be enumerated safely.
Concurrent collections work well with algorithms that follow the producer/consumer^(6) implementation. The producer/consumer pattern aims to partition and balance the workload between one or more producers and one or more consumers. A producer generates data in an independent thread and inserts it into a queue. A consumer, running concurrently in a separate thread, consumes the data from the queue. For example, a producer could download images and store them in a queue that's accessed by a consumer performing image processing. The two entities work independently, and if the producer's workload increases, you can start a new consumer to balance the load. The producer/consumer pattern is one of the most widely used parallel programming patterns, and it will be discussed and implemented in chapter 7.
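As a minimal sketch of the pattern, .NET's `BlockingCollection<T>` (which wraps a `ConcurrentQueue<T>` by default) coordinates the two sides; the item names here are illustrative:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

var queue = new BlockingCollection<string>(boundedCapacity: 100);

var producer = Task.Run(() =>
{
    for (int i = 0; i < 10; i++)
        queue.Add($"image-{i}.jpg");   // e.g., URLs of downloaded images
    queue.CompleteAdding();            // signal that no more items will arrive
});

var consumer = Task.Run(() =>
{
    // Blocks until items are available; the loop ends when the
    // producer calls CompleteAdding and the queue drains
    foreach (var item in queue.GetConsumingEnumerable())
        Console.WriteLine($"processing {item}");
});

Task.WaitAll(producer, consumer);
```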
3.1.3 The agent message-passing pattern: a faster, better solution
The final solution to the hunt for a thread-unsafe object is to introduce a local agent into the SignalR hub, providing asynchronous access to maintain high scalability during high-traffic periods. An agent is a unit of computation that processes one message at a time. Messages are sent asynchronously, meaning the sender doesn't have to wait for an answer, so there's no blocking. In this case, the dictionary is isolated and accessible only by the agent, which updates the collection in a single-threaded fashion, removing the risk of data corruption and the need for locks. This fix is scalable because the agent's asynchronous semantics can process 3 million messages per second, and the code runs faster because it removes the extra overhead of synchronization.
Programming with agents and message passing is covered in chapter 11. Don't worry if you don't fully understand the code; it will become clear during this journey, and you can always refer to appendix B. Compared to the previous solutions, this approach requires fewer code changes without compromising the application's performance. The following listing shows the agent implemented in F#.
Listing 3.7 F# agent ensuring thread-safe access to mutable state
type AgentMessage = ①
    | AddIfNoExists of id:Guid * userName:string
    | RemoveIfNoExists of id:Guid

type AgentOnlineUsers() =
    let agent = MailboxProcessor<AgentMessage>.Start(fun inbox ->
        let onlineUsers = Dictionary<Guid, string>() ②
        let rec loop() = async {
            let! msg = inbox.Receive()
            match msg with
            | AddIfNoExists(id, userName) -> ③
                let exists, _ = onlineUsers.TryGetValue(id) ④
                if not exists then
                    onlineUsers.Add(id, userName)
                    RegisterUserConnection(id, userName)
            | RemoveIfNoExists(id) -> ③
                let exists, userName = onlineUsers.TryGetValue(id) ④
                if exists then
                    onlineUsers.Remove(id) |> ignore
                    DeregisterUserConnection(id, userName)
            return! loop() }
        loop() )

    // Members exposed to callers (used from C# in listing 3.8)
    member this.AddIfNoExists(id, userName) = agent.Post(AddIfNoExists(id, userName))
    member this.RemoveIfNoExists(id) = agent.Post(RemoveIfNoExists(id))
In the following listing, the refactored C# code uses this final solution. Because of the interoperability between .NET programming languages, you can develop a library in one language and access it from another. In this case, C# accesses the F# library containing the `MailboxProcessor` (agent) code.
Listing 3.8 C# SignalR hub using the F# agent
static AgentOnlineUsers onlineUsers = new AgentOnlineUsers(); ①
public override Task OnConnected() {
Guid connectionId = new Guid (Context.ConnectionId);
System.Security.Principal.IPrincipal user = Context.User;
onlineUsers.AddIfNoExists(connectionId, user.Identity.Name); ②
return base.OnConnected();
}
public override Task OnDisconnected() {
Guid connectionId = new Guid (Context.ConnectionId);
onlineUsers.RemoveIfNoExists(connectionId); ②
return base.OnDisconnected();
}
In summary, the final solution fixes the problem, dramatically reducing CPU consumption to almost zero (figure 3.2).
The lesson learned from this experience is that sharing mutable state in a multithreaded environment isn't a good idea. Originally, the `Dictionary` collection had to maintain the user connections currently online, so mutability was all but required. You could have taken a functional approach with an immutable structure, which instead creates a new collection for each update, but that might be over-engineering. The better solution was to use an agent to isolate mutability, making it accessible only from the calling methods. This is a functional approach that exploits the natural thread safety of agents.
The result of this approach is increased scalability, because access is asynchronous and never blocks, and it lets you easily add logic, such as logging and error handling, to the agent's body.

Figure 3.2 The web server architecture for the chat application using a SignalR hub. Compared to figure 3.1, this solution removes the mutable dictionary that was shared across the multiple threads handling incoming requests. In place of the dictionary, a local agent guarantees high scalability and thread safety in this multithreaded scenario.
3.2 Safely sharing functional data structures among threads
A persistent data structure (also known as a functional data structure) is a data structure in which no operation ever permanently changes the underlying structure. Persistent means that all modified versions of the structure persist over time. In other words, such a data structure is immutable: update operations don't modify the data structure but instead return a new structure with the updated values.
In the context of data, persistence is commonly misunderstood as storing data in a physical entity, such as a database or a filesystem. In functional programming (FP), functional data structures are persistent. Most traditional imperative data structures (such as `Dictionary`, `List`, `Queue`, and `Stack` from `System.Collections.Generic`) are ephemeral, because their state exists only briefly between updates. Updates are destructive, as shown in figure 3.3.
Functional data structures guarantee consistent behavior regardless of whether the structure is accessed by different threads of execution, or even different processes, with no concern that the data might change. Instead of supporting destructive updates, persistent data structures preserve the old versions of the structure.
Understandably, compared with traditional imperative data structures, purely functional data structures can be memory-allocation intensive, which would suggest a significant drop in performance. Fortunately, persistent data structures are designed for efficiency by carefully reusing common state between versions of the structure. This is possible thanks to the immutability of functional data structures: because they never change, reusing different versions is effortless. You can compose a new data structure from parts of an old one by referencing the existing data rather than copying it. This technique is called structural sharing (see section 3.3.5). It's far more efficient than creating a new copy of the data on each update, and it improves performance.

Figure 3.3 Destructive versus persistent updates of a list. The list on the right is updated in place, replacing the value 3 with 5 without preserving the original list. This process is known as a destructive update. The functional list on the left doesn't modify its values but instead creates a new list containing the updated value.
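As a quick, minimal illustration of persistent behavior using .NET's built-in `ImmutableList` (from the `System.Collections.Immutable` package):

```csharp
using System;
using System.Collections.Immutable;

var original = ImmutableList.Create(1, 2, 3);
var updated = original.Add(4);       // returns a *new* list

Console.WriteLine(original.Count);   // 3: the original version persists
Console.WriteLine(updated.Count);    // 4: the new version shares most of
                                     // its internal nodes with the original
```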
3.3 Immutability for a change
In Working Effectively with Legacy Code, author Michael Feathers compares object-oriented programming (OOP) and functional programming (FP) as follows:

Object-oriented programming makes code understandable by encapsulating moving parts. Functional programming makes code understandable by minimizing moving parts.

— Michael Feathers, Working Effectively with Legacy Code (Prentice Hall, 2004)
This means that immutability minimizes the parts of the code that change, making it easier to reason about how those parts behave. Immutability keeps functional code free of side effects. A shared variable, one example of a side effect, is a serious obstacle to creating parallel code and leads to nondeterministic execution. By removing side effects, you gain a sound approach to coding.
In .NET, for example, the framework designers took the functional approach of building `string` as an immutable object, to make writing better code easier. As you may recall, an immutable object is one whose state cannot be modified after it's created. Adopting immutability in your coding style, along with its learning curve, demands extra attention; but the payoff is leaner, simplified code syntax (less unnecessary boilerplate), well worth the effort. Furthermore, adopting this approach of transforming rather than mutating data dramatically reduces the chance of bugs in your code, and the interactions and dependencies between different parts of the codebase become easier to manage.
Making immutable objects part of your programming model forces each thread to process its own copy of the data, which helps you write correct concurrent code. Moreover, multiple threads can safely access shared data simultaneously if that access is read-only. In fact, because you don't need locks or synchronization techniques, the risks of deadlocks and race conditions can never materialize (figure 3.4). We discussed those techniques in chapter 1.

Figure 3.4 Cartesian representation of combining mutable or immutable state with shared or unshared state
Functional languages such as F# are immutable by default, which makes them a great fit for concurrency. Immutability won't instantly make your code run faster or your programs massively scalable, but it does prepare your code to be parallelized with only small changes to the codebase.
In object-oriented languages such as C# and Java, writing concurrent applications can be hard, because mutability is the default behavior and no tooling helps prevent or counteract it. In imperative programming languages, mutable data structures are considered completely normal and, although global state is discouraged, mutable state is routinely shared between different areas of a program. That's a recipe for disaster in parallel programming. Fortunately, as mentioned earlier, C# and F# compile to the same intermediate language, which makes it easy to share functionality. You can define your program's domain and objects in F#, for example, exploiting its type system and conciseness (and, most important, the fact that its types are immutable by default). Then you develop your program in C# consuming the F# library, which guarantees immutable behavior with no extra work.
Immutability is an important tool for building concurrent applications, but using immutable types doesn't make a program run faster by itself. What it does is make the code ready for parallel execution: immutability facilitates a higher degree of concurrency, which on multicore machines translates into better performance and speed. Immutable objects can be safely shared among multiple threads, avoiding the need for lock synchronization, which lets programs run in parallel.
The .NET Framework provides several immutable types—some functional, some usable in multithreaded programs, and some both. Table 3.3 lists the characteristics of these types, which are discussed later in this chapter.
Table 3.3 Characteristics of .NET Framework immutable types
| Type | .NET language | Functional | Characteristics | Thread safe | Usage |
|---|---|---|---|---|---|
| F# list | F# | Yes | Immutable linked list with fast insertion at the head | Yes | Combined with recursion to build and traverse n-element lists |
| Array | C# and F# | No | Zero-indexed mutable array type stored in contiguous memory locations | Yes, with partitioning^(a) | Efficient data storage for fast access |
| Concurrent collections | C# and F# | No | A set of collections optimized for multithreaded read and write access | Yes | Shared data in multithreaded programs; great for the producer/consumer pattern |
| Immutable collections | C# and F# | Yes | A set of collections that ease working in parallel computation environments; their values can be passed freely between threads without risking data corruption | Yes | Keeping state under control when multiple threads are involved |
| Discriminated unions (DUs) | F# | Yes | A data type that stores one of several possible well-defined options | Yes | Commonly used to model domains and represent hierarchical structures such as abstract syntax trees |
| Tuple | C# and F# | Yes | A type that groups two or more values of arbitrary (possibly different) types | No | Returning multiple values from a function |
| F# tuple | F# | Yes | | Yes | |
| Record type | F# | Yes | A type representing an aggregate of named values; can be seen as a tuple with named members accessible with dot notation | Yes | Used in place of regular classes to provide immutable semantics; great for domain design, like DUs, and usable from C# |
3.3.1 Functional data structures for data parallelism
Immutable data structures are a great fit for data parallelism because they facilitate sharing data among otherwise isolated tasks in an efficient, zero-copy manner. Indeed, when multiple threads access partitionable data in parallel, immutability plays a fundamental role in safely processing individual chunks of data that belong to the same structure yet appear isolated. The same level of correct data parallelism can be achieved by embracing functional purity—that is, using functions that avoid side effects—instead of immutability.
For example, the functionality underlying PLINQ promotes purity. A function is pure when it has no side effects and its return value is determined only by its input values.
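A minimal sketch of the distinction (assuming these methods live inside some containing class; the names are illustrative):

```csharp
// Pure: the result depends only on the inputs; calling it never changes
// any observable state, so it's safe to evaluate in parallel
static int Add(int a, int b) => a + b;

// Impure: the result depends on (and mutates) external state, so
// parallel evaluation would be nondeterministic
static int counter = 0;
static int AddAndCount(int a, int b)
{
    counter++;
    return a + b + counter;
}
```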
PLINQ is a higher-level abstraction, sitting on top of multithreading components, that hides the lower-level details while still exposing the simplified LINQ semantics. The goal of PLINQ is to reduce execution time and increase the overall performance of queries, using all available computer resources. (PLINQ is detailed in chapter 5.)
3.3.2 Performance implications of using immutability
Some programmers believe that programming with immutable objects is inefficient and has serious performance implications. For example, the purely functional way to add an element to a list is to return a new copy of the list with the new element, leaving the original list unchanged. This could increase memory pressure on the GC: because each modification returns a new value, the GC must deal with a large number of short-lived variables. But because the compiler knows that the existing data is immutable, and because the data won't change, it can optimize memory allocation by reusing the collection, in part or in full. Consequently, the performance impact of using immutable objects is minimal—almost irrelevant—because the typical object copy, replacing a traditional modification, creates a shallow copy. The objects referenced by the original object aren't copied; only the references are, which is a small, bitwise copy of the original object.
Compared with the benefit of guaranteed thread safety, and given the speed of today's CPUs, this is a nearly negligible price to pay. One mitigating factor to consider is that, these days, performance means parallel programming, which itself requires more copying of objects and greater memory pressure.
3.3.3 Immutability in C#
In C#, immutability isn't a supported construct. That said, creating immutable objects in C# isn't difficult; the problem is that the compiler doesn't enforce the style, so the programmer must implement it in code. Adopting immutability in C# requires extra effort and extra care. In C#, immutable objects can be created using the keywords `const` and `readonly`.
Any field can be decorated with the `const` keyword; the only prerequisite is that the assignment and the declaration form a single statement. Once declared and assigned, a `const` value cannot change; it belongs to the class level and is accessed directly rather than through an instance.
The other option is to decorate a value with the `readonly` keyword, assigning it when the class is instantiated, either inline or through the constructor. After a field marked `readonly` is initialized, its value cannot change, and it's accessed through an instance of the class. More important, when a property or piece of state needs to change, keeping the object immutable means creating a new instance of the original object with the updated state. Keep in mind that `readonly` objects in C# are immutable only at the first level—they're shallowly immutable. In C#, an object is shallowly immutable when immutability is guaranteed only for the object itself, not for all of its fields and properties. If a `Person` object has a read-only property `Address`—itself a complex object exposing properties such as street, city, and ZIP code—those properties don't inherit the immutable behavior unless they're marked read-only as well. By contrast, an immutable object whose fields and properties are all marked read-only is deeply immutable.
This listing shows an immutable `Person` class in C#.
Listing 3.9 Shallowly immutable Person class in C#
class Address{
public Address(string street, string city, string zipcode){
Street = street;
City = city;
ZipCode = zipcode;
}
public string Street; ①
public string City; ①
public string ZipCode; ①
}
class Person {
public Person(string firstName, string lastName, int age,
➥ Address address){
FirstName = firstName;
LastName = lastName;
Age = age;
Address = address;
}
public readonly string FirstName; ②
public readonly string LastName; ②
public readonly int Age; ②
public readonly Address Address; ②
}
In this code, the `Person` object is shallowly immutable: even though the field `Address` can't be reassigned (it's marked read-only), its underlying fields can change. You can create instances of the `Person` and `Address` objects as follows:
Address address = new Address("Brown st.", "Springfield", "55555");
Person person = new Person("John", "Doe", 42, address);
Now if you try to reassign the field `Address`, the compiler raises an error, but you can still modify the field `address.ZipCode` of the underlying object:
person.Address = address;         // Error: a readonly field cannot be reassigned
person.Address.ZipCode = "77777"; // Allowed: shallow immutability
This is an example of a shallowly immutable object. Microsoft, aware of the importance of programming with immutability in modern environments, introduced a feature in C# 6.0 that makes it easy to create immutable classes. The feature, called getter-only auto-properties, lets you declare auto-properties without a setter, which implicitly creates a `readonly` backing field. Unfortunately, this still yields shallowly immutable behavior.
Listing 3.10 C# immutable class using getter-only auto-properties
class Person {
public Person(string firstName, string lastName, int age,
➥ Address address){
FirstName = firstName;
LastName = lastName;
Age = age;
Address = address;
}
public string FirstName {get;} ①
public string LastName {get;} ①
public int Age {get;} ①
public Address Address {get;} ①
public Person ChangeFirstName(string firstName) { ②
return new Person(firstName, this.LastName, this.Age, this.Address);
}
public Person ChangeLastName(string lastName) { ②
return new Person(this.FirstName, lastName, this.Age, this.Address);
}
public Person ChangeAge(int age) { ②
return new Person(this.FirstName, this.LastName, age, this.Address);
}
public Person ChangeAddress(Address address) { ②
return new Person(this.FirstName, this.LastName, this.Age, address);
}
}
In this immutable version of the `Person` class, it's important to note that the methods responsible for updating `FirstName`, `LastName`, `Age`, and `Address` don't modify any state; instead, they create a new `Person` instance. In OOP, an object is instantiated by calling the constructor, and the object's state is then set by updating properties and calling methods. This approach leads to an inconvenient, verbose construction syntax. That's where the `Change*` methods added to the `Person` object's properties come into play. With these functions, you can adopt a chaining pattern known as a fluent interface. Here's an example that creates an instance of the `Person` class and changes the age and the address:
Address newAddress = new Address("Red st.", "Gotham", "123459");
Person john = new Person("John", "Doe", 42, address);
Person olderJohn = john.ChangeAge(43).ChangeAddress(newAddress);
In summary, to make a class immutable in C#, you must:
- Always design the class with a constructor whose parameters are used to set the object's state.
- Define fields as read-only and use properties without public setters; values are assigned in the constructor.
- Avoid any method that aims to modify the class's internal state.
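Applying these rules to the `Address` class from listing 3.9 would make `Person` deeply immutable. A minimal sketch (the `ChangeZipCode` helper is an illustrative addition, not from the original listing):

```csharp
// A deeply immutable version of Address: every property is getter-only,
// so immutability extends through all levels of Person
class Address
{
    public Address(string street, string city, string zipCode)
    {
        Street = street;
        City = city;
        ZipCode = zipCode;
    }
    public string Street { get; }
    public string City { get; }
    public string ZipCode { get; }

    // State "changes" return a new instance instead of mutating this one
    public Address ChangeZipCode(string zipCode) =>
        new Address(Street, City, zipCode);
}
```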
3.3.4 Immutability in F#
As mentioned earlier, the F# programming language is immutable by default. Consequently, the concept of a variable doesn't exist, because by definition, if a variable is immutable, it isn't a variable. F# replaces variables with identifiers, which are associated (bound) to values with the `let` keyword. After this association, the value cannot change. In addition to full immutable collections, F# has a series of built-in immutable constructs that aid pure functional programming, as shown in listing 3.11. These built-in types are the tuple and the record, and they have many advantages over CLI types:
- They're immutable.
- They can't be `null`.
- They have built-in structural equality and comparison.
This listing shows the use of immutable types in F#.
Listing 3.11 F# immutable types
let point = (31, 57) ①
let (x,y) = point ②
type Person= { First : string; Last: string; Age:int} ③
let person = { First="John"; Last="Doe"; Age=42} ④
The tuple type is a set of unnamed, ordered values, possibly of different, heterogeneous (en.wikipedia.org/wiki/Homogeneity_and_heterogeneity) types. Tuples have the advantage of being usable out of the box, making them great for defining temporary, lightweight structures containing an arbitrary number of elements. For example, (true, "Hello", 2, 3.14) is a four-tuple.
The record type is similar to the tuple, but each element is labeled, giving a name to each of the values. The advantage of the record over the tuple is that the labels help distinguish and document what each element is for. Moreover, the properties of a record are automatically created for the fields defined, which is convenient because it saves keystrokes. A record in F# can be considered a C# class with all properties read-only. Most valuable is the ability to correctly and quickly implement immutable classes in C# by using this type. In fact, it's possible to create an F# library in your solution, define your domain model using the record type, and then reference this library from your C# project. Here's how C# code looks when it references the F# library with the record type:

```
Person person = new Person("John", "Doe", 42);
```

This is a simple and effective way to create an immutable object. Additionally, the F# implementation requires only one line of code, compared to the equivalent in C# (11 lines of code using read-only fields).

3.3.5 Functional lists: linking cells in a chain

The most common and generally adopted functional data structure is the list, a series of homogeneous types used to store an arbitrary number of items. In FP, lists are recursive data structures composed of two linked elements: the Head (or Cons) and the Tail. The purpose of Cons is to provide a mechanism that contains a value and a connection to other Cons elements via an object reference pointer, known as the Next pointer.

Lists also have a special state called nil, representing a list with no items; it's the last link, connected to nothing. The nil, or empty, case is convenient during a recursive traversal of a list, to determine its end. Figure 3.5 shows a constructed list of four Cons cells plus an empty list. Each cell (Head) contains a number and a reference to the remaining list (Tail), up to the last Cons cell, which defines an empty list. This data structure is similar to a singly linked list ([`en.wikipedia.org/wiki/Linked_list`](https://en.wikipedia.org/wiki/Linked_list)), where each node in the chain has a single link to another node, representing a series of nodes linked together into a chain.



Figure 3.5 Functional list of integers composed of four numbers and an empty list (the last box [ ] on the right). Each item has a reference, the black arrow, linked to the rest of the list. The first item on the left is the head of the list, which is linked to the rest of the list, the tail.

In functional lists, the operations that add new elements or remove existing ones don't modify the current structure but return a new one with the updated values. Under the hood, immutable collections can safely share common structure, which limits memory consumption. This technique is called structural sharing. Figure 3.6 shows how structural sharing minimizes memory consumption when generating and updating functional lists.



Figure 3.6 The structural sharing technique creates new lists while optimizing memory space. In summary, List A has three items plus an empty cell, List B has five, and List C six. Each item is linked to the rest of the list. For example, the head item of List B is the number 4, which is linked to the tail (the numbers 5, 1, 2, 3, and [ ]).

In figure 3.6, List A is composed of three numbers and an empty list.
By adding two new items to List A, structural sharing gives the impression that a new List B is created, but in reality it links a pointer from the two new items to the previous, unmodified List A. The same scenario repeats for List C. At this point, all three lists (A, B, and C) are accessible, each with its own elements. Clearly, functional lists are designed to perform well when adding or removing items at the head. In fact, lists work well for linear traversal, and prepending performs in constant time O(1) because the new item is added at the head of the previous list. But random access isn't efficient, because the list must be traversed from the left for each lookup, which takes O(n) time, where n is the number of elements in the collection.

A new list is created by prepending a new element to an existing list: you take an empty list as the initial value and then link the new element to the existing structure. This operation of Cons-ing onto the head of the list is repeated for all items; consequently, every list terminates with an empty state. One of the biggest attractions of functional lists is the ease with which they can be used to write thread-safe code. In fact, functional data structures can be passed by reference to a callee with no risk of corruption, as shown in figure 3.7.



Figure 3.7 The list is passed by reference to the function caller (callee). Because the list is immutable, multiple threads can access the reference without generating any data corruption.

By definition, to be thread safe, an object must preserve a consistent state every time it's observed. You shouldn't observe a data structure collection removing an item in the middle of a resize, for example. In a multithreaded program, applying execution against an isolated portion of a functional data structure is an excellent, safe way to avoid sharing data.

Functional lists in F#

F# has a built-in implementation of an immutable list structure, represented as a linked list (a linear data structure consisting of a set of items linked together in a chain). Every programmer has written a linked list at some point. In the case of functional lists, however, the implementation requires a little more effort to guarantee the immutable behavior that the list never changes once created. Fortunately, representing a list in F# is simple, thanks to the support for algebraic data types (ADTs) ([`en.wikipedia.org/wiki/Algebraic_data_type`](https://en.wikipedia.org/wiki/Algebraic_data_type)), which let you define a generic recursive `List` type. An ADT is a composite type, meaning its structure is the result of combining other types. In F#, ADTs are called discriminated unions (DUs), and they're a precise modeling tool for representing well-defined sets of data shapes under the same type. These different shapes are called the cases of a DU. Think of a representation of the motor-vehicle domain, where the types `Car` and `Truck` belong to the same base type `Vehicle`. DUs fit well for building complicated data structures (like linked lists and a wide range of trees) because they're a simpler alternative to a small object hierarchy. For example, this is a DU definition for the domain `Vehicle`:

```
type Vehicle =
    | Motorcycle of int
    | Car of int
    | Truck of int
```

You can think of DUs as a mechanism that provides additional semantic meaning over a type.
For example, the previous DU can be read as "a Vehicle type that can be a Car, a Motorcycle, or a Truck." The same representation in C# would use a `Vehicle` base class with derived types for `Car`, `Truck`, and `Motorcycle`. The real power of a DU shows when it's combined with pattern matching to branch to the appropriate computation, depending on the case passed. The following F# function prints the number of wheels for the vehicle passed:

```
let printWheels vehicle =
    match vehicle with
    | Car(n) -> Console.WriteLine("Car has {0} wheels", n)
    | Motorcycle(n) -> Console.WriteLine("Motorcycle has {0} wheels", n)
    | Truck(n) -> Console.WriteLine("Truck has {0} wheels", n)
```

This listing represents a recursive list, using an F# DU that satisfies the definition given in the previous section: a list is either empty or is formed by an element and an existing list.

Listing 3.13 Representation of a list in F# using discriminated unions

```
type FList<'a> =
    | Empty ①
    | Cons of head:'a * tail:FList<'a> ②

let rec map f (list:FList<'a>) = ③
    match list with
    | Empty -> Empty
    | Cons(hd,tl) -> Cons(f hd, map f tl)

let rec filter p (list:FList<'a>) =
    match list with
    | Empty -> Empty
    | Cons(hd,tl) when p hd = true -> Cons(hd, filter p tl)
    | Cons(hd,tl) -> filter p tl
```

You can now create a new list of integers as follows:

```
let list = Cons (1, Cons (2, Cons(3, Empty)))
```

F# already has a built-in generic `List` type that lets you rewrite the previous `FList` implementation using the following two (equivalent) options:

```
let list = 1 :: 2 :: 3 :: []
let list = [1; 2; 3]
```

The F# list is implemented as a singly linked list, which provides constant-time O(1) access to the head of the list and linear time O(n) for element access, where n is the index of the item.

Functional lists in C#

There are several ways to represent a functional list in OOP. The solution adopted here in C# is a generic class `FList<T>`, so it can store values of any type. This class exposes getter-only auto-properties defining the head element of the list and the `FList<T>` tail linked list. The `IsEmpty` property indicates whether the current instance contains at least one value. The following listing shows the full implementation.

Listing 3.14 Functional list in C#

```
public sealed class FList<T>
{
    private FList(T head, FList<T> tail) ①
    {
        Head = head;
        Tail = tail.IsEmpty ? FList<T>.Empty : tail;
        IsEmpty = false;
    }
    private FList() ②
    {
        IsEmpty = true;
    }
    public T Head { get; } ③
    public FList<T> Tail { get; } ④
    public bool IsEmpty { get; } ⑤

    public static FList<T> Cons(T head, FList<T> tail) ⑥
    {
        return tail.IsEmpty ? new FList<T>(head, Empty) : new FList<T>(head, tail);
    }
    public FList<T> Cons(T element) ⑦
    {
        return FList<T>.Cons(element, this);
    }
    public static readonly FList<T> Empty = new FList<T>(); ⑧
}
```

The `FList<T>` class has a private constructor to enforce instantiation through either the static helper method `Cons` or the static field `Empty`. The latter returns an empty instance of the `FList<T>` object, which can be used to append new elements with the instance method `Cons`. Using the `FList<T>` data structure, it's possible to create functional lists in C# as follows:

```
FList<int> list1 = FList<int>.Empty;
FList<int> list2 = list1.Cons(1).Cons(2).Cons(3);
FList<int> list3 = FList<int>.Cons(1, FList<int>.Empty);
FList<int> list4 = list2.Cons(2).Cons(3);
```

The code sample shows a few important properties of building an `FList` of integers.
The first list, `list1`, is created from the initial empty-list state using the static field `FList<int>.Empty`, a common pattern in immutable data structures. From this initial state, you can then use the fluent approach to chain a series of `Cons` calls and build up the collection, as shown with `list2` in the code example.

Lazy values in functional lists

In chapter 2, you saw how lazy evaluation is an excellent solution for avoiding duplicated operations by remembering operation results. Moreover, lazily evaluated code benefits from a thread-safe implementation. This technique can be useful in the context of functional lists, deferring computations and consequently gaining performance. In F#, lazy thunks (computations that have been deferred) are created using the `lazy` keyword:

```
let thunkFunction = lazy(21 * 2)
```

This listing defines a generic lazy list implementation.

Listing 3.15 Lazy list implementation using F#

```
type LazyList<'a> =
    | Cons of head:'a * tail:Lazy<'a LazyList> ①
    | Empty

let empty = lazy(Empty) ②

let rec append items list = ③
    match items with
    | Cons(head, Lazy(tail)) -> Cons(head, lazy(append tail list)) ④
    | Empty -> list

let list1 = Cons(42, lazy(Cons(21, empty))) ⑤
// val list1 : LazyList<int> = Cons (42, Value is not created.)

let list = append (Cons(3, empty)) list1 ⑥
// val list : LazyList<int> = Cons (3, Value is not created.)

let rec iter action list = ⑦
    match list with
    | Cons(head, Lazy(tail)) ->
        action(head)
        iter action tail
    | Empty -> ()

list |> iter (printf "%d .. ") ⑧
// 3 .. 42 .. 21 ..
```

To handle empty states more efficiently, the lazy list implementation shifts the laziness into the tail of the `Cons` constructor, improving performance for the successive data structures. For example, the `append` operation is delayed until the head is retrieved from the list.

3.3.6 Building a persistent data structure: an immutable binary tree

In this section, you'll learn how to build a binary tree (B-tree) in F#, using recursion and multithreaded processing. A tree structure is, in layman's terms, a collection of nodes connected in such a way that no cycles are allowed. Trees tend to be used where performance matters. (It's odd that the .NET Framework never shipped with a tree in its collection namespaces.) Trees are among the most common and useful data structures in computer programming and are a core concept in functional programming languages. A tree is a polymorphic, recursive data structure containing an arbitrary number of trees—trees within a tree. This data structure is primarily used to organize data based on keys, which makes it an efficient tool for searches. Thanks to their recursive definition, trees are best suited to representing hierarchical structures, such as a filesystem or a database. Moreover, trees are considered advanced data structures, generally used in subjects such as machine learning and compiler design. FP provides recursion as a primary construct for iterating data structures, making it complementary for this purpose. Tree structures allow you to represent hierarchies and compose complex structures out of simple relationships, and they're used to design and implement a variety of efficient algorithms. Common uses of trees include XML/markup parsing, searching, compressing, sorting, image processing, social networking, machine learning, and decision trees. This last example is widely used in domains such as forecasting, finance, and gaming.
The ability to express a tree in which each node may have an arbitrary number of branches, as in n-ary trees and B-trees, turns out to be an impediment rather than a benefit. This section covers the B-tree: a self-balancing tree where every node has between zero and two child nodes at most, and where the difference in depth (called height) between any leaves is at most one. The depth of a node is defined as the number of edges from the node to the root node. In the B-tree, each node points to two other nodes, called the left and right child nodes. A better tree definition is provided by figure 3.8, which shows the key properties of the data structure.



Figure 3.8 Binary tree representation where every node has between zero and two child nodes. In this figure, node 4 is the root, from which two branches start, nodes 8 and 6. The left branch is a link to the left subtree and the right branch is a link to the right subtree. The nodes without child nodes, 6, 5, 5, and 7, are called leaves.

A tree has a special node called the root, which has no parent (node 4 in figure 3.8) and may be either a leaf or a node with two or more children. A parent node has at least one child, and each child has one parent. Nodes with no children are treated as leaves (nodes 6, 5, 5, and 7 in the figure), and children of the same parent are known as siblings.

B-trees in functional F#

With F#, it's easy to represent a tree structure because of the support for ADTs and discriminated unions. In this case, a DU provides an idiomatic functional way to represent a tree. This listing shows a generic DU-based binary tree definition with a special case for empty branches.

Listing 3.16 Immutable B-tree representation in F#

```
type Tree<'a> = ①
    | Empty ②
    | Node of leaf:'a * left:Tree<'a> * right:Tree<'a> ③

let tree = ④
    Node (20,
        Node (9, Node (4, Node (2, Empty, Empty), Empty),
                 Node (10, Empty, Empty)),
        Empty)
```

The elements of the B-tree are stored using the `Node` type constructor, and the `Empty` case identifier represents an empty node that doesn't specify any type information. The `Empty` case serves as a placeholder identifier. With this B-tree definition, you can create helper functions to insert an item into the tree or to verify that one exists. These functions are implemented in idiomatic F#, using recursion and pattern matching.

Listing 3.17 B-tree helper recursive functions

```
let rec contains item tree = ①
    match tree with
    | Empty -> false
    | Node(leaf, left, right) ->
        if leaf = item then true
        elif item < leaf then contains item left
        else contains item right

let rec insert item tree = ①
    match tree with
    | Empty -> Node(item, Empty, Empty)
    | Node(leaf, left, right) as node ->
        if leaf = item then node
        elif item < leaf then Node(leaf, insert item left, right)
        else Node(leaf, left, insert item right)

let ``exist 9`` = tree |> contains 9
let ``tree 21`` = tree |> insert 21
let ``exist 21`` = ``tree 21`` |> contains 21
```

Because the tree is immutable, the `insert` function returns a new tree, copying only the nodes that lie on the path of the node being inserted. Traversing a DU tree in functional programming to visit all the nodes involves a recursive function. There are three main approaches to traversing a tree: in-order, post-order, and pre-order traversal ([`en.wikipedia.org/wiki/Tree_traversal`](https://en.wikipedia.org/wiki/Tree_traversal)). In in-order navigation, for example, the nodes on the left side of the root are processed first, then the root, and finally the nodes on its right, as shown here.
Listing 3.18 In-order navigation function

```
let rec inorder action tree = ①
    seq {
        match tree with
        | Node(leaf, left, right) ->
            yield! inorder action left
            yield action leaf
            yield! inorder action right
        | Empty -> ()
    }

tree |> inorder (printfn "%d") |> ignore ②
```

The function `inorder` takes as an argument a function to apply to each value of the tree. In the example, this function is an anonymous lambda that prints the integer stored in the tree.

3.4 Recursive functions: a natural way to iterate

Recursion is calling a function on itself, a deceptively simple programming concept. Have you ever stood between two mirrors? The reflections seem to carry on forever—this is recursion. Functional recursion is the natural way to iterate in FP because it avoids mutation of state. During each iteration, a new value is passed into the loop constructor instead of being updated (mutated). In addition, recursive functions can be composed, making your program more modular and introducing opportunities to exploit parallelization.

Recursive functions are expressive and provide an effective strategy for solving complex problems by breaking them into smaller, albeit identical, subtasks. (Think of Russian nesting dolls, each doll identical to the one before, only smaller.) While the whole task may seem daunting to solve, the smaller tasks are easier to solve directly by applying the same function to each of them. The ability to split a task into smaller tasks that can be performed separately makes recursive algorithms candidates for parallelization. This pattern, also called Divide and Conquer,^(8) leads to dynamic task parallelism, in which tasks are added to the computation as the iteration advances. For more information, see the example in section 1.4.3. Problems over recursive data structures naturally lend themselves to the Divide and Conquer strategy because of its inherent potential for concurrency.

When considering recursion, many developers fear performance penalties from the execution time of a large number of iterations, as well as receiving a `StackOverflow` exception. The correct way to write recursive functions is to use the techniques of tail recursion and CPS. Both strategies are good ways to minimize stack consumption and increase speed, as you'll see in the examples to come.

3.4.1 The tail of a correct recursive function: tail-call optimization

A tail call, also known as tail-call optimization (TCO), is a subroutine call performed as the final action of a procedure. If a tail call might lead to the same subroutine being called again in the call chain, the subroutine is said to be tail recursive, a special case of recursion. Tail-call recursion is a technique that converts a regular recursive function into an optimized version that can handle large inputs without risks or side effects. With tail-call recursion, there are no outstanding operations left to perform when the function returns, and the last call of the function is a call to itself. You'll now refactor the implementation of a factorial function into a tail-call-optimized one. The following listing shows the tail-call-optimized recursive implementation.

Listing 3.19 Tail-call recursive implementation of a factorial in F#

```
let rec factorialTCO (n:int) (acc:int) =
    if n <= 1 then acc
    else factorialTCO (n-1) (acc * n) ①

let factorial n = factorialTCO n 1
```

In this implementation of the recursive function, the parameter `acc` acts as an accumulator.
By using an accumulator and ensuring that the recursive call is the last operation in the function, the compiler can optimize the execution to reuse a single stack frame, instead of storing each intermediate result of the recursion in a different stack frame, as shown in figure 3.9.



Figure 3.9 Tail-recursive definition of a factorial, which can reuse a single stack frame

The figure illustrates the tail-recursive definition of factorial. Although F# supports tail-call recursive functions, unfortunately the C# compiler isn't designed to optimize them.

3.4.2 Continuation passing style to optimize recursive functions

Sometimes, optimized tail-call recursive functions aren't the right solution or can be difficult to implement. In that case, one possible alternative is CPS, a technique for passing the result of a function into a continuation. CPS is used to optimize recursive functions because it avoids stack allocation. Moreover, CPS is used in the Microsoft TPL, in `async/await` in C#, and in `async` workflows in F#. CPS plays an important role in concurrent programming. The following code example shows how the CPS pattern is used in a function `GetMaxCPS`:

```
static void GetMaxCPS(int x, int y, Action<int> action) => action(x > y ? x : y);

GetMaxCPS (5, 7, n => Console.WriteLine(n));
```

The argument for the continuation passing is defined as a delegate `Action<int>`, which can conveniently be passed as a lambda expression. The interesting part is that a function with this design never returns a result directly; instead, it supplies the result to the continuation procedure. CPS can also be used to implement recursive functions using tail calls.

Recursive functions with CPS

At this point, with basic knowledge of CPS, you'll refactor the factorial example from listing 3.19 to use the CPS approach in F#. (You can find the C# implementation in the downloadable source code for this book.)

Listing 3.20 Recursive implementation of factorial using CPS in F#

```
let rec factorialCPS x continuation =
    if x <= 1 then continuation()
    else factorialCPS (x - 1) (fun () -> x * continuation())

let result = factorialCPS 4 (fun () -> 1) ①
```

This function is similar to the previous implementation with the accumulator; the difference is that a function is passed in place of the accumulator variable. In this case, the `factorialCPS` function applies the `continuation` function to its result.

A B-tree structure walked in parallel, recursively

Listing 3.21 shows an example that iterates recursively through a tree structure to perform an action against each element. The function `WebCrawler`, from chapter 2, builds a hierarchical representation of web links from a given website. It then scans the HTML content of each web page, looking for image links to download in parallel. The code examples from chapter 2 (listings 2.16, 2.17, 2.18, and 2.19) were intended as an introduction to a parallel technique rather than a typical task-based parallelism procedure. Downloading any sort of data from the internet is an I/O operation; you'll learn in chapter 8 that it's best practice to perform I/O operations asynchronously.

Listing 3.21 Parallel recursive divide-and-conquer function
```
let maxDepth =
    int(Math.Log(float System.Environment.ProcessorCount, 2.) + 4.) ①

let webSites : Tree<string> =
    WebCrawlerExample.WebCrawler("http://www.foxnews.com")
    |> Seq.fold(fun tree site -> insert site tree) Empty ②

let downloadImage (url:string) =
    use client = new System.Net.WebClient()
    let fileName = Path.GetFileName(url)
    client.DownloadFile(url, @"c:\Images\" + fileName) ③

let rec parallelDownloadImages tree depth = ④
    match tree with
    | _ when depth = maxDepth ->
        tree |> inorder downloadImage |> ignore
    | Node(leaf, left, right) ->
        let taskLeft  = Task.Run(fun() -> parallelDownloadImages left (depth + 1))
        let taskRight = Task.Run(fun() -> parallelDownloadImages right (depth + 1))
        let taskLeaf  = Task.Run(fun() -> downloadImage leaf)
        Task.WaitAll([|taskLeft; taskRight; taskLeaf|]) ⑤
    | Empty -> ()
```

The `Task.Run` constructor is used to create and spawn the tasks. The parallel recursive function `parallelDownloadImages` takes the argument `depth`, which is used to limit the number of tasks created and thereby optimize resource consumption. On every recursive call, the depth value increases by one, and when it exceeds the `maxDepth` threshold, the rest of the tree is processed sequentially. If a separate task were created for every tree node, the overhead of creating new tasks would exceed the benefit gained from running the computations in parallel. On a computer with eight processors, spawning 50 tasks would hurt performance tremendously because of the contention generated by the tasks sharing the same processors. The TPL scheduler is designed to handle large numbers of concurrent tasks, but its behavior isn't appropriate for every case of dynamic task parallelism ([`mng.bz/ww1i`](http://mng.bz/ww1i)), and in some circumstances, as in this parallel recursive function, manual tuning is preferred. Ultimately, the `Task.WaitAll` construct is used to wait for the tasks to complete. Figure 3.10 shows the hierarchy of the spawned tasks running in parallel.



Figure 3.10 From the root node, Task C is created to process the right side of the subtree. The process is repeated for the subtree running Task A. When it completes, the left side of the subtree is processed by Task B. This operation is repeated for all the subtrees, and for each iteration a new task is created.

The execution time of the recursive parallel operation `parallelDownloadImages` was measured against a sequential version. The benchmark is the average of downloading 50 images three times (table 3.4).

Table 3.4 Benchmark of downloading 50 images using parallel recursion (time in seconds)

| Serial | Parallel |
| --- | --- |
| 19.71 | 4.12 |

Parallel calculator

Another interesting way to use a tree structure is to build a parallel calculator. After what you've learned, implementing such a program isn't difficult. You can use ADTs in the form of F# DUs to define the types of operations to perform:

```
type Operation = Add | Sub | Mul | Div | Pow
```

The calculator can then be represented as a tree structure, where each operation is a node with the details needed to perform a calculation:

```
type Calculator =
    | Value of double
    | Expr of Operation * Calculator * Calculator
```

Clearly, from this code you can see the resemblance to the tree structure used previously:

```
type Tree<'a> =
    | Empty
    | Node of leaf:'a * left:Tree<'a> * right:Tree<'a>
```

The only difference is that the `Empty` case in the tree structure is replaced with the `Value` case in the calculator. To perform any mathematical operation, you need a value.
The leaf of the tree becomes the `Operation` type, and the left and right branches recursively reference the calculator type itself, exactly as the tree did. Next, you can implement a recursive function that iterates through the calculator tree and performs the operations in parallel. This listing shows the implementation of the `eval` function and its use.

Listing 3.23 Parallel calculator

```
let spawn (op:unit->double) = Task.Run(op) ①

let rec eval expr =
    match expr with ②
    | Value(value) -> value ③
    | Expr(op, lExpr, rExpr) -> ④
        let op1 = spawn(fun () -> eval lExpr) ⑤
        let op2 = spawn(fun () -> eval rExpr) ⑤
        let apply = Task.WhenAll([op1; op2]) ⑥
        let lRes, rRes = apply.Result.[0], apply.Result.[1]
        match op with ⑦
        | Add -> lRes + rRes
        | Sub -> lRes - rRes
        | Mul -> lRes * rRes
        | Div -> lRes / rRes
        | Pow -> System.Math.Pow(lRes, rRes)
```

The function `eval` recursively evaluates, in parallel, a set of operations defined as a tree structure. During each iteration, the expression passed in is pattern matched, either to extract the value, if the case is a `Value`, or to compute the operation, if the case is an `Expr`. Interestingly, the recursive evaluation of each branch of the `Expr` node case happens in parallel. Each `Expr` branch returns a value, calculated in each child (sub-node) operation. These values are then used in the last operation, the root of the operation tree passed as the argument, to produce the final result. Here's a simple set of operations in the shape of a calculator tree, which computes 2¹⁰ / 2⁹ + 2 * 2:

```
let operations =
    Expr(Add,
        Expr(Div,
            Expr(Pow, Value(2.0), Value(10.0)),
            Expr(Pow, Value(2.0), Value(9.0))),
        Expr(Mul, Value(2.0), Value(2.0)))

let value = eval operations
```

In this section, the code for defining a tree data structure and performing a recursive task-based function was shown in F#, but the implementation is feasible in C# as well. Rather than showing all that code here, you can download the full code from the book's website.

Summary

- Immutable data structures use smart approaches, such as structural sharing, to minimize the copying of shared elements and reduce GC pressure.
- It's important to dedicate some time to profiling application performance, to avoid bottlenecks and bad surprises when the program runs in production under heavy load.
- Lazy evaluation can be used to guarantee thread safety during the instantiation of an object, and to gain performance in functional data structures by deferring computation to the last possible moment.
- Functional recursion is the natural way to iterate in functional programming because it avoids mutation of state. In addition, recursive functions can be composed, making your programs more modular.
- Tail-call recursion is a technique that converts a regular recursive function into an optimized version that can handle large inputs without risks or side effects.
- Continuation passing style (CPS) is a technique for passing the result of a function into a continuation. It's used to optimize recursive functions because it avoids stack allocation. Moreover, CPS is used in the Task Parallel Library in .NET 4.0, in `async/await` in C#, and in `async` workflows in F#.
- Recursive functions are great candidates for implementing the Divide and Conquer technique, which leads to dynamic task parallelism.
Part 2

How to approach the different parts of a concurrent program

This part of the book delves into the concepts and applicability of functional programming. We'll explore various concurrent programming models, highlighting the advantages and benefits of this paradigm. Topics will include the Task Parallel Library and parallel patterns such as Fork/Join, Divide and Conquer, and MapReduce. We'll also discuss declarative composition, higher-level abstractions for asynchronous operations, the agent programming model, and message-passing semantics. You'll experience firsthand how functional programming lets you compose program elements without evaluating them. These techniques parallelize work and make programs easier to reason about and more efficient to run, because they optimize memory consumption.
4
The basics of processing big data: data parallelism, part 1
This chapter covers
- The importance of data parallelism in a world of big data
- Applying the Fork/Join pattern
- Writing declarative parallel programs
- Understanding the limitations of parallel `for` loops
- Increasing performance with data parallelism
Imagine you're preparing a pasta dinner for four, and say it takes 10 minutes to prepare and serve. You begin by bringing a medium pot of water to a boil. Then two more friends show up at your house for dinner. Obviously, you need to cook more pasta. You could switch to a bigger pot of water with more pasta in it, which would take much longer to cook, or you could use a second pot alongside the first, so that both pots of pasta finish cooking at the same time. Data parallelism works in much the same way: by "cooking at the same time," you can process large amounts of data.
The amount of data generated has grown exponentially over the past decade. In 2017, it was estimated that every minute there were 4,750,000 Facebook "likes," nearly 400,000 tweets, more than 2.5 million Instagram posts, and more than 4 million Google searches. These numbers continue to grow at a rate of 15% per year. This acceleration affects businesses, which must now rapidly analyze massive amounts of big data (en.wikipedia.org/wiki/Big_data). How do you analyze such a vast amount of data while remaining responsive? The answer comes from new technologies designed with data parallelism in mind, focusing in particular on the ability to maintain performance as the data keeps growing.
In this chapter, you'll learn concepts, design patterns, and techniques for quickly processing large amounts of data. You'll analyze the problems that arise from parallel loop constructs and learn their solutions. You'll also see that by combining functional programming with data parallelism, you can achieve substantial performance gains in your algorithms with minimal code changes.
4.1 What is data parallelism?
Data parallelism is a programming model that performs the same set of operations on large quantities of data in parallel. This programming model is gaining increasing traction because of its ability to rapidly process large volumes of data across a variety of big data problems. Parallelism lets an algorithm's computation scale incrementally without requiring its structure to be reorganized.
Two models of data parallelism are single instruction single data (SISD) and single instruction multiple data (SIMD):
- Single instruction single data (SISD) defines a single-core architecture. A single-core processor system executes one task per CPU clock cycle, so execution is sequential and deterministic. It receives one instruction (single instruction), performs the work required on a single piece of data, and returns the result of the computation. This processor architecture isn't covered in this book.
- Single instruction multiple data (SIMD) is a form of parallelism achieved by distributing the data among the available cores and applying the same operation in any given CPU clock cycle. This parallel, multicore CPU architecture is commonly used to exploit data parallelism.
To achieve data parallelism, the data is divided into chunks, and each chunk is subjected to intensive computation and processed independently, either to produce new data to aggregate or to reduce it to a scalar value. If you're unfamiliar with these terms, they should become clear by the end of the chapter.
The ability to compute chunks of data independently is the key to achieving significant performance gains: removing dependencies between chunks eliminates the need to synchronize access to the data, along with any concerns about race conditions, as shown in figure 4.1.

Figure 4.1 Data parallelism is achieved by partitioning a dataset into chunks and processing each partition independently, in parallel. Each chunk is assigned to a separate task; when the tasks complete, the dataset is reassembled. In this figure, the dataset on the left is processed by multiple tasks that use a lock to synchronize access to the data as a whole. In this case, the synchronization is a source of contention among the threads and creates performance overhead. The dataset on the right is split into six parts, and each task works against one-sixth of the dataset's total size N. This design removes the need to synchronize with locks.
Data parallelism can be achieved by distributing the work across multiple nodes in a distributed system, within a single machine, or by splitting the work into independent threads. This chapter focuses on implementing data parallelism using multicore hardware.
4.1.1 Data and task parallelism
The goal of data parallelism is to decompose a given dataset and generate enough tasks to maximize the use of CPU resources. Moreover, each task should be scheduled with enough computational work to guarantee fast execution; otherwise, the overhead of context switching can negate the gains.
Parallelism comes in two flavors:
- Task parallelism aims to execute a computer program across multiple processors, where each thread is responsible for performing a different operation at the same time. It's the simultaneous execution of many different functions across multiple cores, over the same or different datasets.
- Data parallelism aims to distribute a given dataset into smaller partitions across multiple tasks, where each task performs the same instruction in parallel. For example, data parallelism might describe an image-processing algorithm where each image, or each pixel, is updated in parallel by independent tasks. Task parallelism, by contrast, would compute a set of images in parallel, applying a different operation to each image. See figure 4.2.

Figure 4.2 Data parallelism is the simultaneous execution of the same function across the elements of a dataset. Task parallelism is the simultaneous execution of multiple, different functions across the same or different datasets.
In short, task parallelism focuses on executing multiple functions (tasks) and aims to reduce the overall computation time by running those tasks simultaneously. Data parallelism reduces the time needed to process a dataset by distributing the same algorithmic computation across multiple CPUs to run in parallel.
4.1.2 The "embarrassingly parallel" concept
In data parallelism, the algorithm applied to process the data is sometimes called "embarrassingly parallel" and has the special property of being naturally scalable.^(1) This property means the degree of parallelism in the algorithm grows with the number of available hardware threads: the algorithm runs faster on more powerful computers. In data parallelism, the algorithm should be designed to run each operation independently, in separate tasks tied to the hardware cores. Such a structure has the advantage of automatically tuning the workload at runtime, adapting the data partitioning to the current machine. This behavior guarantees that the program runs on all available cores.
Consider summing a large array of numbers. Any portion of the array can be summed independently of the others. The partial sums can then be summed together, reaching the same result as summing the array sequentially. It doesn't matter whether the partial sums are computed on the same processor or at the same time. Algorithms with this high degree of independence are called embarrassingly parallel problems: the more processors you throw at them, the faster they run. In chapter 3, you saw the Divide and Conquer pattern, which provides natural parallelism: it distributes work across many tasks and then combines (reduces) the results again. Other embarrassingly parallel designs provide natural self-scaling without requiring complex coordination mechanisms. Examples of design patterns that use this approach include Fork/Join, parallel aggregation (reduction), and MapReduce. We'll discuss these designs later in this chapter.
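As a minimal sketch of the partial-sums idea (the names and chunking scheme are illustrative): each task sums an independent slice of the array into its own slot, so no state is shared, and the partial results are combined at the end.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

int[] numbers = Enumerable.Range(1, 10_000_000).ToArray();
int chunks = Environment.ProcessorCount;
int chunkSize = numbers.Length / chunks;

long[] partialSums = new long[chunks];
Parallel.For(0, chunks, i =>
{
    int start = i * chunkSize;
    int end = (i == chunks - 1) ? numbers.Length : start + chunkSize;
    long sum = 0;
    for (int j = start; j < end; j++) sum += numbers[j];
    partialSums[i] = sum;            // no sharing: each task owns one slot
});

long total = partialSums.Sum();      // combine (reduce) the partial results
```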
4.1.3 Data parallelism support in .NET
Identifying the code in your program that can be parallelized isn't a simple task, but common rules and practices can help. The first thing to do is profile the application. This program analysis determines where the code spends its time, which is the clue for where to begin a deeper investigation into improving performance and detecting opportunities for parallelism. As a guideline, a parallelism opportunity exists where two or more portions of the source code can run in parallel without changing the program's output. If introducing parallelism changes the output, the program isn't deterministic and may become unreliable; in that case, parallelism isn't applicable.
To ensure deterministic results in a parallel program, the blocks of source code that run simultaneously must have no dependencies between them. Indeed, a program can easily be parallelized when no dependencies exist, or when the existing dependencies can be removed. In the Divide and Conquer pattern, for example, there are no dependencies between the recursive executions of the function, so it can be parallelized.
Large datasets are prime candidates for parallelization whenever a CPU-intensive operation can be performed independently on each element. In general, any form of loop (`for` loops, `while` loops, and `for-each` loops) is an excellent candidate for exploiting parallelism. With Microsoft's TPL, reshaping a sequential loop into a parallel one is a simple task. This library provides a layer of abstraction that simplifies the implementation of the common parallelizable patterns involved in data parallelism. These patterns can be materialized with the `Parallel.For` and `Parallel.ForEach` parallel constructs provided by the TPL `Parallel` class.
Here are some patterns found in programs that present opportunities for parallelism:
- Sequential loops where there are no dependencies between iteration steps.
- Reduction and/or aggregation operations, where partial results of the computation are combined between steps. This model can be expressed with the MapReduce pattern.
- Units of computation whose explicit dependencies can be converted into a Fork/Join pattern, running each step in parallel.
- Recursive algorithms of the Divide and Conquer kind, where each iteration can run independently and in parallel on a different thread.
In the .NET Framework, data parallelism is also supported through PLINQ, which I recommend. The query language offers a more declarative model for data parallelism than the `Parallel` class and is used for the parallel evaluation of arbitrary queries over a data source. Declarative means expressing what you want done with the data, rather than how. Internally, the TPL uses sophisticated scheduling algorithms to distribute parallelized computations efficiently across the available processing cores. Both C# and F# exploit these techniques in a similar way. In the next section, you'll see these techniques in both programming languages, which mix and complement each other well.
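As a small, illustrative taste of that declarative style (PLINQ itself is covered in chapter 5), `AsParallel()` opts a query into parallel evaluation, and the partitioning and aggregation are handled for you:

```csharp
using System.Linq;

int[] numbers = Enumerable.Range(1, 1_000_000).ToArray();

long sumOfSquares = numbers
    .AsParallel()               // opt in to parallel evaluation
    .Select(n => (long)n * n)   // declare *what* to do with each element
    .Sum();                     // the aggregation is handled by PLINQ
```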
4.2 The Fork/Join pattern: parallel Mandelbrot
The best way to understand how to transform a sequential program into a parallel one is through an example. In this section, we'll use the Fork/Join pattern to transform a program to exploit parallel computation and achieve faster performance.
In the Fork/Join pattern, a single thread forks off and coordinates multiple independent parallel workers, then merges the individual results when they complete. Fork/Join parallelism boils down to two main steps:
- Split a given task into a set of subtasks, scheduled to run independently in parallel.
- Wait for the forked parallel operations to complete, and then, in sequence, merge the subtask results back into the original work.
With regard to data parallelism, figure 4.3 looks very similar to figure 4.1. The difference lies in the last step, where the Fork/Join pattern merges the results back into a whole.

Figure 4.3 The Fork/Join pattern splits a task into subtasks that can be executed independently in parallel. When the operations complete, the subtasks are joined again. It's no coincidence that this pattern is often used to achieve data parallelism; the similarities are evident.
As you can see, this pattern is well suited to data parallelism. The Fork/Join pattern speeds up the execution of a program by splitting the work into chunks (fork) and running each chunk individually in parallel. After each parallel operation completes, the chunks are merged again (join). In general, Fork/Join is a great pattern for coding structured parallelism, because the fork and the join happen at once (synchronously with respect to the caller), but in parallel (from the performance and speed perspective). The Fork/Join abstraction can easily be implemented using a `Parallel.For` loop from the .NET `Parallel` class. This static method transparently handles partitioning the data and executing the tasks.
Let's analyze the `Parallel.For` loop construct with an example. First, you implement a sequential `for` loop to draw a Mandelbrot image (see figure 4.4), and then the code is refactored to run faster. We'll evaluate the pros and cons of the approach.



Figure 4.4 The Mandelbrot drawing resulting from running the code in this section

For this example, the details of implementing the algorithm aren't important. What's important is that for each pixel in the picture (image), a computation runs to assign its color. This computation is independent, because each pixel's color doesn't depend on the other pixels' colors, so the assignments can be done in parallel. In fact, each pixel can be assigned a different color regardless of the colors of the other pixels in the image. This absence of dependencies affects the execution strategy: each computation can run in parallel. In this context, the Mandelbrot algorithm is used to draw an image representing the magnitude value of a complex number. The natural representation of this program uses a `for` loop to iterate over each value of the Cartesian plane, assigning the corresponding color to each point. The Mandelbrot algorithm decides the color. Before delving into the core implementation, you need an object for the complex number. The following listing shows a simple implementation of a complex number, used to perform operations with other imaginary complex numbers.

Listing 4.1 Complex number object

```
class Complex
{
    public Complex(float real, float imaginary)
    {
        Real = real;
        Imaginary = imaginary;
    }
    public float Imaginary { get; } ①
    public float Real { get; } ①

    public float Magnitude =>
        (float)Math.Sqrt(Real * Real + Imaginary * Imaginary); ②

    public static Complex operator +(Complex c1, Complex c2) =>
        new Complex(c1.Real + c2.Real, c1.Imaginary + c2.Imaginary); ③
    public static Complex operator *(Complex c1, Complex c2) =>
        new Complex(c1.Real * c2.Real - c1.Imaginary * c2.Imaginary,
                    c1.Real * c2.Imaginary + c1.Imaginary * c2.Real); ③
}
```

The `Complex` class contains a definition for the `Magnitude` property. The interesting part of this code is the two overloaded operators for the `Complex` object, used to add and multiply complex numbers in the Mandelbrot algorithm. The following listing shows the two core functions of the Mandelbrot algorithm. The function `isMandelbrot` determines whether a complex number belongs to the Mandelbrot set.

Listing 4.2 Sequential Mandelbrot

```
Func<Complex, int, bool> isMandelbrot = (complex, iterations) => ①
{
    var z = new Complex(0.0f, 0.0f);
    int acc = 0;
    while (acc < iterations && z.Magnitude < 2.0)
    {
        z = z * z + complex;
        acc += 1;
    }
    return acc == iterations;
};

for (int col = 0; col < Cols; col++) { ②
    for (int row = 0; row < Rows; row++) { ②
        var x = ComputeRow(row); ③
        var y = ComputeColumn(col); ③
        var c = new Complex(x, y);
        var color = isMandelbrot(c, 100) ? Color.Black : Color.White; ④
        var offset = (col * bitmapData.Stride) + (3 * row);
        pixels[offset + 0] = color.B; // Blue component ⑤
        pixels[offset + 1] = color.G; // Green component ⑤
        pixels[offset + 2] = color.R; // Red component ⑤
    }
}
```

The code omits details of the bitmap generation, which aren't relevant for the purpose of the example. You can find the full solution in the downloadable source code online. In this example, there are two loops: the outer loop iterates over the columns of the picture box, and the inner loop iterates over its rows.
Each iteration uses the functions `ComputeColumn` and `ComputeRow`, respectively, to convert the current pixel coordinates into the real and imaginary parts of a complex number. The function `isMandelbrot` then evaluates whether the complex number belongs to the Mandelbrot set. This function takes as arguments a complex number and a number of iterations, and it returns a Boolean indicating whether the complex number is a member of the Mandelbrot set. The function body contains a loop that accumulates a value and decrements a count. The returned Boolean is true if the accumulator `acc` equals the iteration count. With this implementation, the program requires 3.666 seconds to evaluate the function `isMandelbrot` 1 million times, which is the number of pixels composing the Mandelbrot image.

A faster solution is to run the loop of the Mandelbrot algorithm in parallel. As mentioned earlier, the TPL provides constructs that can be used to parallelize programs almost blindly, producing remarkable performance improvements. In this example, the higher-order `Parallel.For` function is used as a drop-in replacement for the sequential loop. This listing shows the parallel transformation with minimal changes, keeping the sequential structure of the code.

Listing 4.3 Parallel Mandelbrot

```
Func<Complex, int, bool> isMandelbrot = (complex, iterations) =>
{
    var z = new Complex(0.0f, 0.0f);
    int acc = 0;
    while (acc < iterations && z.Magnitude < 2.0)
    {
        z = z * z + complex;
        acc += 1;
    }
    return acc == iterations;
};

System.Threading.Tasks.Parallel.For(0, Cols - 1, col => { ①
    for (int row = 0; row < Rows; row++) {
        var x = ComputeRow(row);
        var y = ComputeColumn(col);
        var c = new Complex(x, y);
        var color = isMandelbrot(c, 100) ? Color.DarkBlue : Color.White;
        var offset = (col * bitmapData.Stride) + (3 * row);
        pixels[offset + 0] = color.B; // Blue component
        pixels[offset + 1] = color.G; // Green component
        pixels[offset + 2] = color.R; // Red component
    }
});
```

Note that only the outer loop is parallelized, to prevent oversaturating the cores with work items. With this simple change, the execution time decreased to 0.699 seconds on a quad-core machine. Oversaturation is a form of extra overhead, originating in parallel programming, that occurs when the number of threads created and managed by the scheduler grossly exceeds the available hardware cores. In that case, parallelism can make the application slower than the sequential implementation. As a rule of thumb, I recommend parallelizing expensive operations at the highest level. For example, with nested `for` loops as in figure 4.5, I suggest applying parallelism only to the outer loop.



Figure 4.5 Using a `Parallel.For` construct, this benchmark compares the execution time of the sequential loop, which runs in 9.038 seconds, against the parallel one, which runs in 3.443 seconds. The `Parallel.For` loop is about three times faster than the sequential code. The last column on the right is the execution time of the over-saturated parallel loop, where both the outer and inner loops use the `Parallel.For` construct. The over-saturated parallel loop runs in 5.788 seconds, 50% slower than the non-saturated version.

In general, the optimal number of worker threads for a parallel task equals the number of available hardware cores divided by the average fraction of core utilization per task.
For example, on a quad-core computer with 50% average core utilization per task, the perfect number of worker threads for maximum throughput is eight: 4 cores × (100% max CPU utilization / 50% average core utilization per task). Any number of worker threads above this value may introduce extra overhead due to additional context switching, degrading performance and processor utilization.

4.2.1 When the GC is the bottleneck: structs vs. class objects

The goal of the Mandelbrot example is to transform a sequential algorithm into a faster one. No doubt you've achieved a speed improvement; 9.038 to 3.443 seconds is a little more than three times faster on a quad-core machine. Is it possible to optimize performance further? The TPL scheduler is partitioning the image and assigning the work to different tasks automatically, so how can you improve the speed? In this case, the optimization involves reducing memory consumption—specifically, minimizing memory allocation to optimize garbage collection. When the GC runs, the execution of the program stops until the garbage collection completes.

In the Mandelbrot example, a new `Complex` object is created in each iteration to decide whether the pixel coordinate belongs to the Mandelbrot set. The `Complex` object is a reference type, which means new instances of it are allocated on the heap. This piling of objects onto the heap results in memory overhead, forcing the GC to intervene to free space. A reference object, compared with a value type, has extra memory overhead due to the pointer required to reach the memory location of the object allocated on the heap. Instances of a class are always allocated on the heap and accessed via a pointer dereference. Passing reference objects around is therefore cheap in terms of memory, because only a copy of the pointer is passed: about 4 or 8 bytes, depending on the hardware architecture. Additionally, keep in mind that an object also has a fixed overhead of 8 bytes for 32-bit processes and 16 bytes for 64-bit processes. In comparison, a value type isn't allocated on the heap but rather on the stack, which removes the overhead of heap allocation and garbage collection. (Note that if a value type (struct) is declared as a local variable in a method, it's allocated on the stack; if the value type is declared as part of a reference type (class), the struct allocation becomes part of that object's memory layout and exists on the heap.)

The Mandelbrot algorithm creates and destroys 1 million `Complex` objects in the `for` loop; this high allocation rate creates significant work for the GC. By changing the `Complex` object from a reference to a value type, the execution speed should increase, because allocating a struct on the stack never causes the GC to perform cleanup operations and never pauses the program. In fact, when a value type is passed to a method it's copied byte for byte, so a struct allocated this way never triggers garbage collection, because it isn't on the heap. Converting the `Complex` object from a reference to a value type is simple, requiring only that you change the keyword `class` to `struct`, as shown next. (The full implementation of the `Complex` object is intentionally omitted.)
The `struct` keyword converts a reference type (class) into a value type:

```
// Reference type                    // Value type
class Complex {                      struct Complex {
    public Complex(float real,           public Complex(float real,
        float imaginary)                     float imaginary)
    {                                    {
        this.Real = real;                    this.Real = real;
        this.Imaginary = imaginary;          this.Imaginary = imaginary;
    }                                    }
```

After this simple code change, the speed of drawing the Mandelbrot image increased by approximately 20%, as shown in figure 4.6.



Figure 4.6 The `Parallel.For` construct benchmark comparison of the Mandelbrot algorithm computed on a quad-core machine with 8 GB of RAM. The sequential code runs in 9.009 seconds, compared to the parallel version, which runs in 3.131 seconds—almost three times faster. In the right column, the best performance is achieved by the parallel version of the code that uses a value type for the complex number in place of the reference type. This code runs in 2.548 seconds, 20% faster than the original parallel code, because no GC generations are involved during its execution to slow the process.

The real improvement is the number of GC generations run to free memory, which is reduced to zero by using the struct type instead of the class reference type.^(2) Table 4.1 shows a GC-generation comparison between a `Parallel.For` loop using many reference types (class) and a `Parallel.For` loop using many value types (struct).

Table 4.1 GC generations comparison

| Operation | GC gen0 | GC gen1 | GC gen2 |
| --- | --- | --- | --- |
| `Parallel.For` | 1390 | 1 | 1 |
| `Parallel.For` with struct value type | 0 | 0 | 0 |

The version of the code that uses the `Complex` object as a reference type makes many short-lived allocations on the heap: more than 4 million.^(3) A short-lived object is stored in the first GC generation and is scheduled to be removed from memory sooner than objects in generations 1 and 2. This high allocation rate forces the GC to run, which stops all running threads except those needed by the GC. The interrupted tasks resume only after the GC operation completes. Clearly, the smaller the number of GC generations run, the faster the application performs.

4.2.2 The downside of parallel loops

In the previous section, you ran both the sequential and the parallel versions of the Mandelbrot algorithm to compare performance. The parallel code was implemented using the TPL `Parallel` class and a `Parallel.For` construct, which can provide significant performance improvements over ordinary sequential loops. In general, the parallel `for` loop pattern is useful for performing an operation that can be executed independently for each element of a collection (where the elements don't rely on each other). For example, mutable arrays fit perfectly in parallel loops, because every element resides at a different location in memory and the updates can be performed in place without race conditions.

Parallelizing a loop introduces complexity that can lead to problems that aren't common, or even encountered, in sequential code. For example, in sequential code it's common to have a variable that plays the role of an accumulator, to read from or write to. If you try to parallelize a loop that uses an accumulator, you have a high probability of encountering a race condition, because multiple threads access the variable concurrently. In a parallel `for` loop, by default, the degree of parallelism depends on the number of available cores.
The *degree of parallelism* refers to the number of iterations that can run at the same time on a computer. In general, the higher the number of available cores, the faster the parallel `for` loop executes. This is true until the point of diminishing returns predicted by Amdahl's Law is reached: the speedup of a parallel loop is ultimately limited by the portion of the work that must run sequentially.

## 4.3 Measuring performance speed

Achieving an increase in performance is without a doubt the main reason for writing parallel code. *Speedup* refers to the performance gained from executing a program in parallel on a multicore computer as compared to a single-core computer. A few different aspects should be considered when evaluating speedup.

The common way to gain speedup is by dividing the work between the available cores. In this way, when running one task per processor with *n* cores, the expectation is to run the program *n* times faster than the original program. This result is called *linear speedup*, which in the real world is improbable to reach due to the overhead of thread creation and coordination. This overhead is amplified in the case of parallelism, which involves the creation and coordination of multiple threads.

To measure the speedup of an application, the single-core benchmark is considered the baseline. The formula to calculate the speedup of a sequential program ported to a parallel version is *speedup = sequential time / parallel time*. For example, assume the execution time of an application running on a single-core machine is 60 minutes; when the application runs on a two-core computer, the time decreases to 40 minutes. In this case, the speedup is 1.5 (60 / 40). Why didn't the execution time drop to 30 minutes? Because parallelizing the application introduces some overhead, which prevents a linear speedup according to the number of cores. This overhead comes from the creation of new threads, which implies contention, context switching, and thread scheduling.

Measuring performance and anticipating speedup are fundamental for benchmarking, designing, and implementing parallel programs. For that reason, parallel execution is an expensive luxury: it isn't free, but instead requires an investment of time in planning. Inherent overhead costs are related to the creation and coordination of threads. Sometimes, if the amount of work is too small, the overhead introduced by parallelism can exceed the benefit and, therefore, cancel out the performance gain. Frequently, the scope and volume of a problem affect the code design and the time required to execute it. Sometimes, better performance is achievable by approaching the same problem with a different, more scalable solution. Another tool to calculate whether the investment is worth the return is Amdahl's Law, a popular formula for calculating the speedup of a parallel program.

### 4.3.1 Amdahl's Law defines the limit of performance improvement

At this point, it's clear that to increase the performance of your program and reduce the overall execution time of your code, it's necessary to take advantage of parallel programming and the multicore resources available. Almost every program has a portion of code that must run sequentially to coordinate the parallel execution. As in the Mandelbrot example, rendering the image is a sequential process. Another common example is the Fork/Join pattern, which starts the execution of multiple threads in parallel and then waits for them to complete before continuing the flow.
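For instance, here's a minimal Fork/Join sketch (assuming a hypothetical `ComputeChunk` function): the forked tasks run in parallel, while the join that waits for and combines their results is sequential coordination code.

```
using System.Linq;
using System.Threading.Tasks;

// Fork: start independent work items in parallel.
Task<int>[] tasks =
{
    Task.Run(() => ComputeChunk(0)),
    Task.Run(() => ComputeChunk(1)),
    Task.Run(() => ComputeChunk(2))
};

// Join: waiting for completion and combining the partial results
// is the sequential portion of the program.
Task.WaitAll(tasks);
int total = tasks.Sum(t => t.Result);
```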
In 1967, Gene Amdahl concluded that the presence of sequential code in a program jeopardizes overall performance improvement. This concept counters the idea of linear speedup. Linear speedup means that a problem that takes time *T* on one processor takes time *T*/*p* with *p* processors. Because programs can't run entirely in parallel, the increase in performance isn't linear; it's limited by the sequential (serial) portion of the code.

Amdahl's Law says that, given a fixed data-set size, the maximum performance increase of a program implemented using parallelism is limited by the time needed for the sequential portion of the program. According to Amdahl's Law, no matter how many cores are involved in the parallel computation, the maximum speedup the program can achieve depends on the percentage of time spent in sequential processing. Amdahl's Law determines the speedup of a parallel program by using three variables:

* The baseline duration of the program executed on a single-core computer
* The number of available cores
* The percentage of parallel code

Here's the formula to calculate the speedup according to Amdahl's Law:

*Speedup* = 1 / (1 – *P* + (*P* / *N*))

The numerator of the equation is always 1 because it represents the baseline duration. In the denominator, the variable *N* is the number of available cores, and *P* represents the percentage of parallel code. For example, if 70% of the code is parallelizable on a quad-core machine, the maximum expected speedup is about 2.1:

*Speedup* = 1 / (1 – 0.70 + (0.70 / 4)) = 1 / (0.30 + 0.175) = 1 / 0.475 ≈ 2.1 times

A few factors can undermine the result of this formula. The one most relevant to data parallelism: with the onset of big data, the portion of the code that runs in parallel to process and analyze the data grows with the problem size, and therefore has an ever-larger effect on performance as a whole. A formula that more precisely calculates the performance improvement due to parallelism in that setting is Gustafson's Law.

### 4.3.2 Gustafson's Law: a step further to measure performance improvement

Gustafson's Law is considered the evolution of Amdahl's Law and examines the speedup gain from a different, more contemporary perspective: it considers the increased number of cores available and the growing volume of data to process. Gustafson's Law accounts for the variables that are missing in Amdahl's Law, making the formula more realistic for modern scenarios, such as the growth of parallel processing due to multicore hardware. The amount of data to process is growing exponentially each year, pushing software development toward parallelism, distributed systems, and cloud computing. Today, this is an important factor that weakens Amdahl's Law and legitimizes Gustafson's Law. Here's the formula for calculating the speedup according to Gustafson's Law:

*Speedup* = *S* + (*N* × *P*)

*S* represents the sequential units of work, *P* defines the number of units of work that can be executed in parallel, and *N* is the number of available cores. A final distinction: Amdahl's Law predicts the speedup achievable by parallelizing sequential code, whereas Gustafson's Law calculates the speedup reachable from an existing parallel program.

### 4.3.3 The limitations of parallel loops: the sum of prime numbers

This section covers some of the limitations resulting from the sequential semantics of the parallel loop, and techniques to overcome these disadvantages.
Let's first consider a simple example that parallelizes the sum of the prime numbers in a collection. Listing 4.4 calculates the sum of the prime numbers among 10 million items. This calculation is a good candidate for parallelism because each iteration performs exactly the same independent operation. The implementation skips the sequential version, whose execution time for the calculation is 6.551 seconds. This value will be used as a baseline to compare the speed of the parallel version of the code.

Listing 4.4 Parallel sum of prime numbers using a `Parallel.For` loop construct

```
int len = 10000000;
long total = 0;                                ①
Func<int, bool> isPrime = n =>                 ②
{
    if (n == 1) return false;
    if (n == 2) return true;
    var boundary = (int)Math.Floor(Math.Sqrt(n));
    for (int i = 2; i <= boundary; ++i)
        if (n % i == 0) return false;
    return true;
};
Parallel.For(0, len, i =>                      ③
{
    if (isPrime(i))                            ④
        total += i;                            ④
});
```

The function `isPrime` is a simple implementation used to verify whether a given number is prime. The `for` loop uses the `total` variable as the accumulator to sum all the prime numbers in the collection. The execution time of the code is 1.049 seconds on a quad-core computer, roughly six times faster than the sequential code. Perfect!

But not so fast. If you run the code again, you'll get a different value for the `total` accumulator. The code isn't deterministic: the accumulator variable `total` is shared among threads, and the update `total += i` isn't atomic, so concurrent updates are lost and every run produces a different output. One easy solution is to use a lock to synchronize the threads' access to the `total` variable, but the cost of that synchronization hurts performance (a sketch of this approach follows shortly). A better solution is to use thread-local state during the loop execution. Fortunately, `Parallel.For` offers an overload with a built-in construct for instantiating a thread-local value: each thread operates on its own instance of the state, removing any opportunity for unsafe sharing. (The related `ThreadLocal<T>` type is part of the `System.Threading` namespace.)

Listing 4.5 Using `Parallel.For` with thread-local variables

```
**Parallel.For**(0, len, () => 0L,                               ①
    (int i, ParallelLoopState loopState, long tlsValue) => {    ②
        return isPrime(i) ? tlsValue += i : tlsValue;
    },
    value => Interlocked.Add(ref total, value));                ③
```

The code still uses a globally mutable variable `total`, but in a different way. In this version of the code, the third parameter of the `Parallel.For` loop initializes a local state whose lifetime spans from the first iteration on the current thread through the last one. In this way, each thread uses a thread-local variable to operate against an isolated copy of state, which can be stored and retrieved separately in a thread-safe manner.

When a piece of data is stored in managed *thread-local storage* (TLS), as in the example, it's unique to a thread. In this case, the thread is called the *owner* of the data. The purpose of using thread-local data storage is to avoid the overhead of lock synchronization when accessing shared state. In the example, a copy of the local variable `tlsValue` is assigned to and used by each thread to calculate the sum of a given range of the collection, a range partitioned by the parallel partitioner algorithm.
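For comparison, here's a minimal sketch of the lock-based alternative dismissed above, reusing `len` and `isPrime` from listing 4.4. It produces the correct sum, but every update contends for the same lock, serializing the hot path:

```
object sync = new object();
long lockedTotal = 0;
Parallel.For(0, len, i =>
{
    if (isPrime(i))
        lock (sync)               // all threads serialize on this lock
            lockedTotal += i;
});
```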
Returning to listing 4.5: the parallel partitioner uses a sophisticated algorithm to decide the best approach for dividing and distributing the chunks of the collection among threads. After a thread completes all of its iterations, the last parameter of the `Parallel.For` loop, which defines the join operation, is called. During the join, the results from each thread are aggregated. This step uses the `Interlocked` class for a high-performance, thread-safe addition. This class was introduced in chapter 3 to perform CAS operations that safely mutate (actually swap) the value of an object in multithreaded environments. The `Interlocked` class provides other useful operations as well, such as increment, decrement, and exchange of variables.

This section has mentioned an important term in data parallelism: aggregate. The aggregate concept will be covered in chapter 5.

Listing 4.5, the final version of the code, produces a deterministic result with an execution time of 1.178 seconds, almost equivalent to the previous one. You pay a little extra overhead in exchange for correctness. When using shared state in a parallel loop, scalability is often lost because of the synchronization required to access that shared state.

### 4.3.4 What can possibly go wrong with a simple loop?

Now let's consider a simple code block that sums the integers from a given array. Using any OOP language, you could write something like this.

Listing 4.6 Common `for` loop

```
int sum = 0;
for (int i = 0; i < data.Length; i++) {
    sum += data[i];          ①
}
```

You've likely written something similar in your career as a programmer, back when programs ran single-threaded. Back then this code was fine, but these days you're dealing with different scenarios: complex systems and programs that perform multiple tasks simultaneously. With these challenges, the previous code can hide a subtle bug in the `sum` line:

```
sum += data[i];
```

What happens if the values of the array are mutated while it's being traversed? In a multithreaded program, this code presents the issue of mutability, and it cannot guarantee consistency. Note that not all state mutation is equally evil: mutation of state that's visible only within the scope of a function may be inelegant, but it's harmless. For example, suppose the previous sum in a `for` loop is isolated in a function as follows:

```
int Sum(int[] data) {
    int sum = 0;
    for (int i = 0; i < data.Length; i++) {
        sum += data[i];
    }
    return sum;
}
```

Despite updating the `sum` value, the mutation isn't visible outside the scope of the function. As a result, this implementation of `Sum` can be considered a pure function.

To reduce complexity and errors in your program, you must raise the level of abstraction in the code. For example, to compute the sum of numeric values, express your intention in terms of what you want, without repeating how to do it. Common functionality like this should be part of the language, so you can express your intention as

```
int sum = data.Sum();
```

Indeed, the `Sum` extension method ([`mng.bz/f3nF`](http://mng.bz/f3nF)) is part of the `System.Linq` namespace in .NET. This namespace provides many such extension methods that add functionality to any `IEnumerable` object ([`mng.bz/2bBv`](http://mng.bz/2bBv)), including `List` and `Array`. It's not a coincidence that the ideas behind LINQ originate from functional concepts.
The LINQ namespace promotes immutability and operates on the concept of transformation instead of mutation: a LINQ query (and lambda) lets you transform a set of structured data from its original form into another form, without worrying about side effects or state.

### 4.3.5 The declarative parallel programming model

In the sum of prime numbers example in Listing 4.5, the `Parallel.For` loop construct definitely achieves a speedup compared to the sequential code, and does so efficiently, although the implementation is a bit more difficult to understand and maintain than the sequential version. The final code isn't immediately clear to a developer looking at it for the first time. Ultimately, the intent of the code is to sum the prime numbers of a collection. It would be nice to have the ability to express the intention of the program without defining step by step how to implement the algorithm. This is where PLINQ comes into play. The following listing is the equivalent of the parallel sum using PLINQ (in bold) in place of the `Parallel.For` loop.

Listing 4.7 Parallel sum of a collection using declarative PLINQ

```
long total = 0;
Parallel.For(0, len,                                            ①
    () => 0,
    (int i, ParallelLoopState loopState, long tlsValue) => {
        return isPrime(i) ? tlsValue += i : tlsValue;
    },
    value => Interlocked.Add(ref total, value));

**long total = Enumerable.Range(0, len).AsParallel()**          ②
    **.Where(isPrime).Sum(x => (long)x);**                      ③
```

The functional declarative approach is only one line of code. Clearly, when compared to the `for` loop implementation, it's simple to understand, succinct, maintainable, and free of state mutation. The PLINQ construct represents the code as a chain of functions, each one providing a small piece of functionality to accomplish the task. The solution adopts one of the higher-order aggregate functions of the LINQ/PLINQ API, in this case `Sum()`. An aggregate applies a function to each successive element of a collection, combining it with the aggregated result of all previous elements. Other common aggregate functions are `Average()`, `Max()`, `Min()`, and `Count()`. Figure 4.7 shows benchmarks comparing the execution times of the parallel sum.

Figure 4.7 Benchmarking comparison of the sum of prime numbers. The benchmark runs on an eight-core machine with 8 GB of RAM. The sequential version runs in 8.072 seconds; this value is used as the baseline for the other versions of the code. The `Parallel.For` version took 1.814 seconds, which is approximately 4.5 times faster than the sequential code. The `Parallel.For` thread-local version is a little faster than the plain parallel loop. The PLINQ program is the slowest among the parallel versions; it took 1.824 seconds to run.

The function `Aggregate` will be covered in detail in chapter 5.

## Summary

* Data parallelism aims to process massive amounts of data by partitioning it and processing each chunk separately, then regrouping the results on completion. This lets you analyze the chunks in parallel, gaining speed and performance.
* The mental models used in this chapter, which apply to data parallelism, are Fork/Join, Parallel Data Reduction, and Parallel Aggregation. These design patterns share a common approach that separates the data and runs the same task in parallel on each divided portion.
* Utilizing functional programming constructs, it's possible to write sophisticated code to process and analyze data in a declarative and simple manner.
  This paradigm lets you achieve parallelism with little change to your code.
* Profiling the program is the way to verify that the changes you make to adopt parallelism in your code are beneficial. To do that, measure the speed of the program running sequentially, then use that benchmark as a baseline to compare against the code changes.

# 5 PLINQ and MapReduce: data parallelism, part 2

**This chapter covers**

* Using declarative programming semantics
* Isolating and controlling side effects
* Implementing and using a parallel `Reduce` function
* Maximizing hardware resource utilization
* Implementing a reusable parallel MapReduce pattern

This chapter presents MapReduce, one of the most widely used functional programming patterns in software engineering. Before delving into MapReduce, we'll analyze the declarative programming style that the functional paradigm emphasizes and enforces, using PLINQ and its idiomatic F# counterpart, `PSeq`. Both technologies analyze a query statement at runtime and decide the best strategy for executing the query in accordance with the available system resources. Consequently, the more CPU power added to the computer, the faster your code will run. Using these strategies, you can develop code ready for next-generation computers. Next, you'll learn how to implement a parallel `Reduce` function in .NET, which you can reuse in your daily work to increase the execution speed of aggregate functions.

Using FP, you can bring data parallelism into your programs without the complexity that conventional programming introduces. FP prefers declarative over procedural semantics, expressing the intent of a program instead of describing the steps needed to achieve the task. This declarative programming style simplifies the adoption of parallelism in your code.

## 5.1 A short introduction to PLINQ

Before we delve into PLINQ, we'll define its sequential counterpart, LINQ, an extension to the .NET Framework that provides a declarative programming style by raising the level of abstraction, offering a rich set of operations to transform any object that implements the `IEnumerable` interface. The most common operations are mapping, sorting, and filtering. LINQ operators accept behavior as a parameter, usually passed in the form of a lambda expression, which is applied to each item of the sequence. With the introduction of LINQ and lambda expressions, FP became a reality in .NET.

You can convert LINQ to PLINQ, making queries run in parallel on all the cores of the system, by adding the extension `.AsParallel()` to the query. PLINQ can be defined as a concurrency engine for executing LINQ queries. The objective of parallel programming is to maximize processor utilization with increased throughput in a multicore architecture. For a multicore computer, your application should recognize and scale its performance to the number of available processor cores. The best way to write parallel applications is to not have to think about parallelism, and PLINQ fits this abstraction perfectly: it takes care of all the underlying requirements, such as partitioning the sequence into smaller chunks that run individually and applying the logic to each item of each subsequence. Does that sound familiar? That's because PLINQ implements the Fork/Join model underneath, as shown in figure 5.1.

Figure 5.1 A PLINQ execution model.
Converting a LINQ query to PLINQ is as simple as applying the `AsParallel()` extension method, which runs the execution in parallel using a Fork/Join pattern. In this figure, the input characters are transformed in parallel into numbers. Notice that the order of the input elements isn't preserved.

As a rule of thumb, every time there's a `for` or `for-each` loop in your code that does something with a collection without performing side effects outside the loop, consider transforming the loop into a LINQ query. Then benchmark the execution and evaluate whether the query is a fit for running in parallel with PLINQ. The advantage of using PLINQ, compared to a parallel `for` loop, is its ability to automatically handle the aggregation of temporary processing results within each thread that executes the query.

### 5.1.1 How is PLINQ more functional?

PLINQ is considered an ideal functional library, but why? Why consider the PLINQ version of the code more functional than the original `Parallel.For` loop? With `Parallel.For`, you're telling the computer what to do:

* Loop through the collection.
* Verify whether the number is prime.
* If the number is prime, add it to a local accumulator.
* When all iterations are done, add the accumulator to a shared value.

By using LINQ/PLINQ, you can tell the computer what you want in the form of a sentence: "Given a range from 0 to 1,000,000, where the number is prime, sum them all."

FP emphasizes writing declarative code over imperative code. Declarative code focuses on what you want to achieve rather than how to achieve it. PLINQ tends to emphasize the intent of the code rather than the mechanism and is, therefore, much more functional. In addition, FP favors the use of functions to raise the level of abstraction, which aims to hide complexity. In this regard, PLINQ raises the abstraction of the concurrency programming model by handling the query expression and analyzing its structure to decide how to run it in parallel, maximizing performance. FP also encourages combining small and simple functions to solve complex problems. The PLINQ pipeline fully satisfies this tenet with its approach of chaining extension methods together. Another functional aspect of PLINQ is the absence of mutation. The PLINQ operators don't mutate the original sequence, but instead return a new sequence as the result of the transformation. Consequently, the functional implementation of PLINQ gives you predictable results, even when the tasks are executed in parallel.

### 5.1.2 PLINQ and pure functions: the parallel word counter

Now let's consider an example where a program loads a series of text files from a given folder and then parses each document to provide the list of the 10 most frequently used words. The process flow is the following (shown in figure 5.2):

1. Collect the files from a given folder path.
2. Iterate over the files.
3. For each text file, read the content.
4. For each line, break it down into words.
5. Transform each word into uppercase, which is useful for comparison.
6. Group the collection by word.
7. Order by highest count.
8. Take the first 10 results.
9. Project the result into tabular format (a dictionary).

Figure 5.2 Representation of the flow process to count the times each word has been mentioned. First, the files are read from a given folder; then each text file is read, and the content is split into lines and then into single words, which are grouped by word.
The following listing defines this functionality in the `WordsCounter` method, which takes the path of a folder as input and then calculates how many times each word has been used across all files. This listing shows the `AsParallel` command in bold.

Listing 5.1 Parallel word-counting program with side effects

```
public static Dictionary<string, int> WordsCounter(string source)
{
    var wordsCount =
        (from filePath in Directory.GetFiles(source, "*.txt")    ①
             .**AsParallel()**                                   ②
         from line in File.ReadLines(filePath)
         from word in line.Split(' ')
         select word.ToUpper())
        .GroupBy(w => w)
        .OrderByDescending(v => v.Count()).Take(10);             ③
    return wordsCount.ToDictionary(k => k.Key, v => v.Count());
}
```

The logic of the program follows the previously defined flow step by step. It's declarative, readable, and runs in parallel, but there's a hidden problem: it has a side effect. The method reads files from the filesystem, generating an I/O side effect. As mentioned previously, a function or expression is said to have a side effect if it modifies state outside its scope or if its output doesn't depend solely on its input. In this case, passing the same input to a function with side effects doesn't guarantee that it will always produce the same output. These types of functions are problematic in concurrent code, because a side effect implies a form of mutation. Examples of impure functions are getting a random number, getting the current system time, reading data from a file or a network, printing something to a console, and so forth.

To better understand why reading data from a file is a side effect, consider that the content of the file could change at any time, and whenever the content changes, the function can return something different. Furthermore, reading a file could also yield an error if the file was deleted in the meantime. The point is that the function can return something different every time it's called. Due to the presence of side effects, there are complexities to consider:

* Is it really safe to run this code in parallel?
* Is the result deterministic?
* How can you test this method?

A function that takes a filesystem path may throw an error if the directory doesn't exist or if the running program doesn't have the required permissions to read from the directory. Another point to consider is that when a function runs in parallel using PLINQ, the query execution is deferred until its materialization. *Materialization* is the term used to specify when a query is executed and produces a result. For this reason, successive materializations of a PLINQ query that contains side effects might generate different results, because the underlying data might have changed. The result isn't deterministic. This could happen if a file is deleted from the directory between calls, in which case the query might even throw an exception.

Moreover, functions with side effects (also called *impure*) are hard to test. One possible solution is to create a testing directory with a few text files that cannot change. This approach requires that you know how many words are in these files, and how many times each is used, to verify the correctness of the function. Another solution is to mock the directory and the data it contains, which can be even more complex than the previous solution. A better approach exists: remove the side effects and raise the level of abstraction, simplifying the code while decoupling it from external dependencies. But what are side effects? What's a pure function, and why should you care?
### 5.1.3 Avoiding side effects with pure functions

One principle of functional programming is purity. *Pure functions* are those without side effects, where the result is independent of state that can change with time. That is, pure functions always return the same value when given the same inputs. This listing shows pure functions in C#.

Listing 5.2 Pure functions in C#

```
public static double AreaCircle(int radius) => Math.Pow(radius, 2) * Math.PI;   ①

public static int Add(int x, int y) => x + y;                                   ①
```

By contrast, functions with side effects mutate state, for example by setting the values of global variables. Because variables live in the block where they're declared, a variable that's defined globally introduces possible collisions and hurts the readability and maintainability of the program: you have to check the current value of the variable at any point and each time it's used. The main problem with side effects is that they make your program unpredictable and problematic in concurrent code, because a side effect implies a form of mutation. Imagine passing the same argument to a function and obtaining a different outcome each time. A function is said to have side effects if it does any of the following:

* Performs any I/O operation (this includes reading/writing to the filesystem, to a database, or to the console)
* Mutates global state, or any state outside of the function's scope
* Throws exceptions

At first, removing side effects from a program can seem extremely limiting, but there are numerous benefits to writing code in this style:

* It's easy to reason about the correctness of your program.
* It's easy to compose functions to create new behavior.
* It's easy to isolate, and therefore easy to test and less bug-prone.
* It's easy to execute in parallel. Because pure functions don't have external dependencies, their order of execution (evaluation) doesn't matter.

As you can see, introducing pure functions as part of your toolset immediately benefits your code. Moreover, the result of a pure function depends precisely on its input, which introduces the property of *referential transparency*. A program inevitably needs side effects to do something useful, of course, and functional programming doesn't prohibit side effects, but rather encourages minimizing and isolating them.

### 5.1.4 Isolate and control side effects: refactoring the parallel word counter

Let's re-evaluate Listing 5.1, the `WordsCounter` example. How can you isolate and control the side effects in this code?

```
static Dictionary<string, int> WordsCounter(string source)
{
    var wordsCount =
        (from filePath in **Directory.GetFiles(source, "*.txt")**   ①
             .AsParallel()
         from line in File.ReadLines(filePath)
         from word in line.Split(' ')
         select word.ToUpper())
        .GroupBy(w => w)
        .OrderByDescending(v => v.Count()).Take(10);
    return wordsCount.ToDictionary(k => k.Key, v => v.Count());
}
```

The function can be split into a pure function at the core and a pair of functions with side effects. The I/O side effect can't be avoided, but it can be separated from the pure logic. In this listing, the logic that counts each word mentioned per file is extracted, and the side effects are isolated.
Listing 5.3 Decoupling and isolating side effects

```
static Dictionary<string, int> **PureWordsPartitioner**            ①
    (IEnumerable<IEnumerable<string>> content) =>
        (from lines in content.**AsParallel**()                     ②
         from line in lines
         from word in line.Split(' ')
         select word.ToUpper())
        .GroupBy(w => w)
        .OrderByDescending(v => v.Count()).Take(10)
        .ToDictionary(k => k.Key, v => v.Count());                  ③

static Dictionary<string, int> WordsPartitioner(string source)
{
    var contentFiles =
        from filePath in Directory.GetFiles(source, "*.txt")
        let lines = File.ReadLines(filePath)
        select lines;
    return **PureWordsPartitioner**(contentFiles);                  ④
}
```

The new function `PureWordsPartitioner` is pure: its result depends only on its input argument. This function is side-effect free and easy to prove correct. Conversely, the method `WordsPartitioner` is responsible for reading the text files from the filesystem, which is a side-effecting operation, and for aggregating the results of the analysis.

As you can see from the example, separating the pure from the impure parts of your code not only facilitates testing and optimizing the pure parts, but also makes you more aware of your program's side effects and helps you avoid making the impure parts bigger than they need to be. Designing with pure functions and decoupling side effects from pure logic are the two basic tenets that functional thinking brings to the forefront.

## 5.2 Aggregating and reducing data in parallel

In FP, a *fold*, also known as *reduce* or *accumulate*, is a higher-order function that reduces a given data structure, typically a sequence of elements, into a single value. A reduction could, for example, return the average of a series of numbers, or calculate a sum, a maximum, or a minimum. The `fold` function takes an initial value, commonly called the *accumulator*, which is used as a container for intermediate results. As a second argument it takes a binary expression that acts as the *reduction* function, applied to each element in the sequence to return the new value for the accumulator. In general, in a reduction you take a binary operator (that is, a function with two arguments) and compute it over a vector or set of elements of size *n*, usually from left to right. Sometimes a special seed value is used for the operation on the first element, because there's no previous value to use.

During each step of the iteration, the binary expression takes the current element from the sequence and the accumulator value as inputs, and returns a value that overwrites the accumulator. The final result is the value of the last accumulator, as shown in figure 5.3.

Figure 5.3 The `fold` function reduces a sequence to a single value. The function `(f)`, in this case, is multiplication, starting from an initial accumulator with a value of `1`. For each item in the sequence (5, 7, 9), the function applies the calculation to the current item and the accumulator, and the result updates the accumulator with a new value.

The `fold` function has two forms, left-fold and right-fold, depending on which end of the sequence the processing starts from. The left-fold starts from the first item in the list and iterates forward; the right-fold starts from the last item in the list and iterates backward. This section covers the left-fold because it's the form most often used (LINQ's `Aggregate`, for example, is a left-fold). For the remainder of the section, the term *fold* will be used in place of *left-fold*.
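To make figure 5.3 concrete, here's the same reduction expressed with LINQ's `Aggregate` (a minimal sketch):

```
using System.Linq;

int[] values = { 5, 7, 9 };
// Left-fold with multiplication and an initial accumulator of 1,
// as in figure 5.3: ((1 * 5) * 7) * 9 = 315
int product = values.Aggregate(1, (acc, item) => acc * item);
```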
The `fold` function is particularly useful and interesting: it's possible to express a variety of operations, such as `filter`, `map`, and `sum`, in terms of this one aggregation. The `fold` function is probably the most difficult to learn of the list-comprehension functions, but it's one of the most powerful. If you haven't read it yet, I recommend "Why Functional Programming Matters," by John Hughes ([www.cs.kent.ac.uk/people/staff/dat/miranda/whyfp90.pdf](http://www.cs.kent.ac.uk/people/staff/dat/miranda/whyfp90.pdf)), which goes into detail about the broad applicability and importance of the `fold` function in FP. The following listing uses F# and `fold` to demonstrate the implementation of a few useful functions.

Listing 5.4 Implementing `max` and `map` using the F# `fold` function

```
let map (projection:'a -> 'b) (sequence:seq<'a>) =                    ①
    sequence |> Seq.fold(fun acc item -> (projection item)::acc) []

let max (sequence:seq<int>) =                                         ②
    sequence |> Seq.fold(fun acc item -> max item acc) 0

let filter (predicate:'a -> bool) (sequence:seq<'a>) =                ③
    sequence |> Seq.fold(fun acc item ->
        if predicate item then item::acc else acc) []

let length (sequence:seq<'a>) =                                       ④
    sequence |> Seq.fold(fun acc item -> acc + 1) 0                   ④
```

The equivalent of `fold` in LINQ in C# is `Aggregate`. This listing uses the C# `Aggregate` function to implement the same useful functions.

Listing 5.5 Implementing `Filter` and `Length` using LINQ `Aggregate` in C#

```
IEnumerable<R> Map<T, R>(IEnumerable<T> sequence, Func<T, R> projection) {
    return sequence.Aggregate(new List<R>(), (acc, item) => {         ①
        acc.Add(projection(item));
        return acc;
    });
}

int Max(IEnumerable<int> sequence) {                                  ②
    return sequence.Aggregate(0, (acc, item) => Math.Max(item, acc));
}

IEnumerable<T> Filter<T>(IEnumerable<T> sequence, Func<T, bool> predicate) {
    return sequence.Aggregate(new List<T>(), (acc, item) => {         ③
        if (predicate(item))
            acc.Add(item);
        return acc;
    });
}

int Length<T>(IEnumerable<T> sequence) {                              ④
    return sequence.Aggregate(0, (acc, _) => acc + 1);
}
```

Because .NET includes parallel support for the list-comprehension operators, including LINQ `Aggregate` and `Seq.fold`, these C# and F# implementations can easily be converted to run concurrently. More details about this conversion are discussed in the next sections.

### 5.2.1 Deforesting: one of many advantages of folding

Reusability and maintainability are among the advantages that the `fold` function provides. But one special feature this function permits is worth a special mention: `fold` can be used to increase the performance of a list-comprehension query. *List comprehension* is a construct, similar to LINQ/PLINQ in C#, that facilitates list-based queries over existing lists ([`en.wikipedia.org/wiki/List_comprehension`](https://en.wikipedia.org/wiki/List_comprehension)). How can the `fold` function increase the performance of a list query, regardless of parallelism? To answer that, let's analyze a simple PLINQ query. You saw that functional constructs, like LINQ/PLINQ in .NET, transform the original sequence instead of mutating it; in strictly evaluated programming languages such as F# and C#, this often generates unnecessary intermediate data structures. This listing shows a PLINQ query that filters and then transforms a sequence of numbers to calculate the sum of the even values times two (doubled). The parallel execution is in bold.
Listing 5.6 PLINQ query to sum the double of even numbers in parallel

```
var data = new int[100000];
for (int i = 0; i < data.Length; i++)
    data[i] = i;

long total = data.**AsParallel()**          ①
    .Where(n => n % 2 == 0)
    .Select(n => n + n)
    .Sum(x => (long)x);                     ②
```

In these few lines of code, each `Where` and `Select` of the PLINQ query generates an intermediate sequence that unnecessarily increases memory allocation. In the case of large sequences, the penalty paid to the GC to free memory becomes increasingly high, with negative consequences for performance. Allocating objects in memory is expensive; consequently, optimizations that avoid extra allocation are valuable for making functional programs run faster. Fortunately, the creation of these unnecessary data structures can often be avoided. The elimination of intermediate data structures, which reduces the size of temporary memory allocations, is referred to as *deforesting*. This technique is easily exploited with the higher-order function `fold`, which takes the name `Aggregate` in LINQ. This function can eliminate intermediate data-structure allocations by combining multiple operations, such as `filter` and `map`, in a single step that would otherwise require an allocation per operation. This code example shows a PLINQ query that sums the double of the even numbers in parallel using the `Aggregate` operator:

```
long total = data.**AsParallel()**.Aggregate(0L, (acc, n) =>
    n % 2 == 0 ? acc + (n + n) : acc);
```

The PLINQ function `Aggregate` has several overloads; in this case, the first argument `0L` is the initial value of the accumulator `acc`, which is passed in and updated on each iteration. The second argument is the function that performs an operation on the current item from the sequence and updates the value of the accumulator `acc`. The body of this function merges the behaviors of the previously defined `Where`, `Select`, and `Sum` PLINQ extensions, producing the same result. The only difference is the execution time: the original code ran in 13 ms; the updated, deforested version ran in 8 ms.

Deforesting is a productive optimization tool when used with eager data structures, such as lists and arrays; lazy collections behave a little differently. Instead of generating intermediate data structures, lazy sequences store the function to be mapped together with the original data structure. Even so, you'll still see a performance improvement compared to a version that isn't deforested.

### 5.2.2 Fold in PLINQ: Aggregate functions

The same concepts you learned about the `fold` function can be applied to PLINQ in both F# and C#. As mentioned earlier, PLINQ has an equivalent of `fold`, called `Aggregate`, with the shape of a left-fold. Here's one of its overloaded signatures:

```
public static TAccumulate Aggregate<TSource, TAccumulate>(
    this ParallelQuery<TSource> source,
    TAccumulate seed,
    Func<TAccumulate, TSource, TAccumulate> func);
```

The function takes three arguments: the sequence source to process, the initial accumulator seed, and the function `func`, which updates the accumulator for each element. The best way to understand how `Aggregate` works is with an example. In the following example, you'll parallelize the k-means clustering algorithm using PLINQ and the `Aggregate` function. The example shows how remarkably simple and performant a program becomes by using this construct.
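As a quick warm-up before the k-means example, here's a minimal usage sketch of this overload (my example, not part of the k-means code), computing a sum of squares:

```
using System.Linq;

// 0L is the seed accumulator; the lambda folds each element into it.
// Addition is associative and commutative, so the result is deterministic.
long sumOfSquares = ParallelEnumerable.Range(1, 100)
    .Aggregate(0L, (acc, n) => acc + (long)n * n);    // 338350
```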
For the data source used as input to the k-means clustering algorithm, you'll use the "white wine quality" public records (figure 5.4), available for download at [`mng.bz/9mdt`](http://mng.bz/9mdt).

Figure 5.4 The result of running the k-means algorithm using C# LINQ for the serial version of the code and C# PLINQ for the parallelized version. The centroids are the large points in both clusters. Each image represents one iteration of the k-means algorithm, with 11 centroids in the cluster. Each iteration of the algorithm computes the centroid of each cluster and then assigns each point to the cluster with the closest centroid.

The full implementation of the k-means program is omitted because of the length of the code; only the relevant excerpts are shown in listings 5.7 and 5.8. The full implementation, in both F# and C#, is available in the downloadable source code for this book.

Let's review two core functions: `GetNearestCentroid` and `UpdateCentroids`. `GetNearestCentroid` is used to update the clusters, as shown in Listing 5.7. For every data input, this function finds the closest centroid assigned to the cluster to which the input belongs (in bold).

Listing 5.7 Finding the closest centroid (updating the clusters)

```
double[] GetNearestCentroid(double[][] centroids, double[] center)
{
    return centroids.**Aggregate**((centroid1, centroid2) =>      ①
        Dist(center, centroid2) < Dist(center, centroid1)
            ? centroid2
            : centroid1);
}
```

The `GetNearestCentroid` implementation uses the `Aggregate` function to compare the distances between the centroids to find the nearest one. During this step, if none of the clusters' inputs are updated because no closer centroid is found, the algorithm is complete and returns the result.

The next step, shown in Listing 5.8, after the clusters are updated, is to update the centroid locations. `UpdateCentroids` calculates the center of each cluster and shifts the centroids to those points. Then, with the updated centroid values, the algorithm repeats the previous step, running `GetNearestCentroid`, until it finds the closest result. These operations continue until a convergence condition is met and the positions of the cluster centers become stable. The bold code highlights constructs discussed in more depth following the listing. The following implementation of the k-means clustering algorithm uses FP, sequence expressions with PLINQ, and several of the many built-in functions for manipulating data.

Listing 5.8 Updating the locations of the centroids

```
double[][] UpdateCentroids(double[][] centroids)
{
    var partitioner = **Partitioner.Create**(data, true);                  ①
    var result = partitioner.**AsParallel()**                              ②
        .WithExecutionMode(**ParallelExecutionMode.ForceParallelism**)     ③
        .GroupBy(u => GetNearestCentroid(centroids, u))
        .Select(points =>
            points
                .Aggregate(
                    seed: new double[N],                                    ④
                    func: (acc, item) =>
                        acc.**Zip**(item, (a, b) => a + b).ToArray())      ⑤
                .Select(items => items / points.Count())
                .ToArray());
    return result.ToArray();
}
```

With the `UpdateCentroids` function, there's a great deal of processing to compute, so using PLINQ can effectively parallelize the code, thereby increasing its speed. The PLINQ query in the body of `UpdateCentroids` performs the aggregation in two steps. The first uses the `GroupBy` function, which takes as an argument a function that provides the key used for the aggregation. In this case, the key is computed by the previous function, `GetNearestCentroid`.
The second step, the mapping, runs the `Select` function, which calculates the centers of the new clusters for each given point. This calculation is performed by the `Aggregate` function, which takes the list of points as input (the location coordinates of each centroid) and calculates their centers mapped to the same cluster using the local accumulator `acc`, as shown in Listing 5.8. The accumulator is an array of doubles with size *N*, which is the *dimensionality* (the number of characteristics/measurements) of the data to process. The value *N* is defined as a constant in the parent class because it never changes and can be safely shared. The `Zip` function threads together the nearest centroids (points) and the accumulator sequences. Then, the center of each cluster is recomputed by averaging the positions of the points in the cluster.

The implementation details of the algorithm aren't crucial; the key point is that the description of the algorithm is translated precisely and directly into PLINQ using `Aggregate`. If you try to re-implement the same functionality without the `Aggregate` function, the program turns into ugly and hard-to-understand loops with mutable shared variables. The following listing shows the equivalent of the `UpdateCentroids` function without the help of the `Aggregate` function. The bold code is discussed further following the listing.

Listing 5.9 `UpdateCentroids` function implemented without `Aggregate`

```
double[][] UpdateCentroidsWithMutableState(double[][] centroids)
{
    var result = data.**AsParallel()**
        .GroupBy(u => **GetNearestCentroid**(centroids, u))
        .Select(points =>
        {
            var res = new double[N];
            foreach (var x in points)                 ①
                for (var i = 0; i < N; i++)
                    res[i] += x[i];                   ②
            var count = points.Count();
            for (var i = 0; i < N; i++)
                res[i] /= count;                      ②
            return res;
        });
    return result.ToArray();
}
```

Figure 5.5 shows the benchmark results of running the k-means clustering algorithm. The benchmark was executed on a quad-core machine with 8 GB of RAM. The algorithms tested are the sequential LINQ, the parallel PLINQ, and the parallel PLINQ using a custom partitioner.

Figure 5.5 Benchmark running the k-means algorithm on a quad-core machine with 8 GB of RAM. The parallel PLINQ version runs in 0.481 seconds, three times faster than the sequential LINQ version, which runs in 1.316 seconds. A slight further improvement is the PLINQ with the tailored partitioner, which runs in 0.436 seconds, 11% faster than the original PLINQ version.

The benchmark results are impressive. The parallel version of the k-means algorithm using PLINQ runs three times faster than the sequential version on a quad-core machine. The PLINQ partitioner version, shown in Listing 5.8, is 11% faster than the plain PLINQ version.

An interesting PLINQ extension method is used in the `UpdateCentroids` function: `WithExecutionMode(ParallelExecutionMode.ForceParallelism)` notifies the TPL scheduler that the query must be performed concurrently. The two options for `ParallelExecutionMode` are `ForceParallelism` and `Default`. The `ForceParallelism` enumeration value forces parallel execution; the `Default` value defers to PLINQ for the appropriate decision on how to execute the query. In general, a PLINQ query isn't absolutely guaranteed to run in parallel.
The TPL scheduler doesn't automatically parallelize every query: it can decide to run the entire query, or only part of it, sequentially, based on factors such as the size and complexity of the operations and the current state of the available computer resources, when the overhead involved in parallelizing the execution would cost more than the speedup obtained. But cases exist where you want to force parallelism, because you may know more about the query's execution than PLINQ can determine from its analysis. You may know that a delegate is expensive, for example, and consequently that the query will absolutely benefit from parallelization.

The other interesting extension used in the `UpdateCentroids` function is the custom partitioner. When parallelizing k-means, you divided the input data into chunks to avoid creating parallelism with excessively fine granularity:

```
var partitioner = Partitioner.Create(data, true);
```

The `Partitioner<T>` class is an abstract class that allows for static and dynamic partitioning. The default TPL `Partitioner` has built-in strategies that handle partitioning automatically, offering good performance for a wide range of data sources. The goal of the TPL `Partitioner` is to find a balance between having too many partitions (which introduces overhead) and having too few partitions (which underutilizes the available resources). But situations exist where the default partitioning may not be appropriate, and you can gain better performance from a PLINQ query by using a tailored partitioning strategy. In the code snippet, the custom partitioner is created using an overloaded version of the `Partitioner.Create` method, which takes as arguments the data source and a flag indicating which strategy to use, either dynamic or static. When the flag is true, the partitioner strategy is dynamic; otherwise it's static. Static partitioning often provides a speedup on a multicore computer with a small number of cores (two or four). Dynamic partitioning aims to load-balance the work between tasks by assigning chunks of arbitrary size and then incrementally expanding the length after each iteration. It's possible to build sophisticated partitioners ([`mng.bz/48UP`](http://mng.bz/48UP)) with complex strategies.

## Understanding how partitioning works

In PLINQ, there are four kinds of partitioning algorithms:

* *Range partitioning* works with data sources of a known size. Arrays fall into this category:

```
int[] data = Enumerable.Range(0, 1000).ToArray();
data.AsParallel().Select(n => Compute(n));
```

* *Stripe partitioning* is the opposite of range partitioning: the data source size isn't predefined, so the PLINQ query fetches one item at a time and assigns it to a task until the data source is empty. The main benefit of this strategy is that the load can be balanced between tasks:

```
IEnumerable<int> data = Enumerable.Range(0, 1000);
data.AsParallel().Select(n => Compute(n));
```

* *Hash partitioning* uses each value's hash code to assign elements with the same hash code to the same task (for example, when a PLINQ query performs a `GroupBy`).
* *Chunk partitioning* works with incremental chunk sizes: each task fetches a chunk of items from the data source, and the chunk length grows with the number of iterations. With each iteration, larger chunks keep the task as busy as possible.
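To see explicit range partitioning outside of a PLINQ query, here's a minimal sketch (assuming a `data` array and a hypothetical `Compute` function) that hands each task a `(fromInclusive, toExclusive)` range to iterate with a cheap sequential loop:

```
using System.Collections.Concurrent;
using System.Threading.Tasks;

var ranges = Partitioner.Create(0, data.Length);
Parallel.ForEach(ranges, range =>
{
    // Each task processes its own contiguous chunk sequentially.
    for (int i = range.Item1; i < range.Item2; i++)
        Compute(data[i]);
});
```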
### 5.2.3 Implementing a parallel Reduce function for PLINQ

By now you've learned about the power of aggregate operations, which are particularly suited to scalable parallelization on multicore hardware due to their low memory consumption and the deforesting optimization. The low memory bandwidth comes from the fact that aggregate functions produce less data than they ingest: aggregate functions such as `Sum()` and `Average()`, for example, reduce a collection of items to a single value. That's the concept of reduction: it takes a function and uses it to reduce a sequence of elements to a single value.

The PLINQ list extensions don't have a specific `Reduce` function, as F# list comprehensions or other functional programming languages such as Scala and Elixir do. But having gained familiarity with the `Aggregate` function, the implementation of a reusable `Reduce` function is an easy job. This listing shows the implementation of a `Reduce` function in two variants. The bold highlights the annotated code.

Listing 5.10 Parallel `Reduce` function implementation using `Aggregate`

```
static TSource Reduce<TSource>(this **ParallelQuery**<TSource> source,
    Func<TSource, TSource, TSource> reduce) =>
        **ParallelEnumerable**.Aggregate(source,                                   ①
            (item1, item2) => reduce(item1, item2));                               ②

static TValue Reduce<TValue>(this IEnumerable<TValue> source, TValue seed,
    Func<TValue, TValue, TValue> reduce) =>
        source.**AsParallel**()
            .Aggregate(                                                            ①
                seed: seed,
                updateAccumulatorFunc: (local, value) => reduce(local, value),     ②
                combineAccumulatorsFunc: (overall, local) => reduce(overall, local), ③
                resultSelector: overall => overall);                               ④

int[] source = Enumerable.Range(0, 100000).ToArray();
int result = source.**AsParallel**()
    .Reduce((value1, value2) => value1 + value2);                                  ⑤
```

The first `Reduce` function takes two arguments: the sequence to reduce and a delegate (function) to apply for the reduction. The delegate has two parameters: the partial result and the next element of the collection. The underlying implementation uses `Aggregate`, treating the first item from the source sequence as the accumulator.

The second variant of the `Reduce` function takes an extra parameter, `seed`, which is used as the initial value for each thread's partial reduction. This version of the function merges the results from multiple threads, which creates a potential dependency on both the source collection and the result. For this reason, each thread uses thread-local storage, which is non-shared memory, to cache its partial result. When each operation completes, the separate partial results are combined into a final result. `updateAccumulatorFunc` calculates the partial result for a thread; `combineAccumulatorsFunc` merges the partial results into a final result. The last parameter, `resultSelector`, performs a user-defined operation on the final result; in this case, it returns the value unchanged. The remainder of the code is an example of applying the `Reduce` function to calculate the sum of a given sequence in parallel.

#### Associativity and commutativity for deterministic aggregation

An aggregation that runs in parallel using PLINQ (or `PSeq`) applies the `Reduce` function in a different order than the sequential version does.
In Listing 5.8, the sequential result was computed in a different order than the parallel result, yet the two outputs are guaranteed to be equal because the + (plus) operator used to update the centroids has the special properties of associativity and commutativity. This is the line of code used to find the nearest centroid:

`Dist(center, centroid2) < Dist(center, centroid1)`

These are the lines of code used to update the centroids:

```
points
    .Aggregate(
        seed: new double[N],
        func: (acc, item) => acc.Zip(item, (a, b) => a + b).ToArray())
    .Select(items => items / points.Count())
```

In FP, the mathematical operators are functions. The + (plus) is a binary operator: it operates on two values and manipulates them to return a result. A function is *associative* when the order in which it's applied doesn't change the result. This property is important for *reduction operations*. The + (plus) and * (multiply) operators are associative because:

(*a* + *b*) + *c* = *a* + (*b* + *c*)

(*a* × *b*) × *c* = *a* × (*b* × *c*)

A function is *commutative* when the order of the operands doesn't change its output, so long as each operand is accounted for. This property is important for *combiner operations*. The + (plus) and * (multiply) operators are commutative because:

*a* + *b* + *c* = *b* + *c* + *a*

*a* × *b* × *c* = *b* × *c* × *a*

#### Why does this matter?

Using these properties, it's possible to partition the data and have multiple threads operating independently on their own chunks, achieving parallelism, while still returning the correct result at the end. The combination of these properties permits the implementation of parallel patterns such as Divide and Conquer, Fork/Join, and MapReduce. For a parallel aggregation in PLINQ or `PSeq` to work correctly, the applied operation must be both associative and commutative. The good news is that many of the most popular reduction functions are both.

### 5.2.4 Parallel list comprehension in F#: PSeq

At this point, you understand that declarative programming lends itself to data parallelization, and PLINQ makes this particularly easy. PLINQ provides extension methods and higher-order functions that can be used from both C# and F#. But a wrapper module around the functionality provided in PLINQ makes the F# code more idiomatic than working with PLINQ directly. This module is called `PSeq`, and it provides the parallel equivalents of the functions in the `Seq` module. In F#, the `Seq` module is a thin wrapper over the .NET `IEnumerable<T>` interface, and all the built-in sequential containers, such as arrays, lists, and sets, are compatible with the `seq` type. In summary, if parallel LINQ is the right tool to use in your code, then the `PSeq` module is the best way to use it from F#. This listing shows the implementation of the `updateCentroids` function using `PSeq` in idiomatic F# (in bold).

Listing 5.11 Idiomatic F# using `PSeq` to implement `updateCentroids`

```
let updateCentroids centroids =
    data
    |> **PSeq**.groupBy (nearestCentroid centroids)
    |> **PSeq**.map (fun (_, points) ->
        Array.init N (fun i ->
            points |> PSeq.averageBy (fun x -> x.[i])))
    |> **PSeq**.sort
    |> **PSeq**.toArray
```

The code uses the F# pipe operator `|>` to construct pipeline semantics, computing a series of operations as a chain of expressions. The higher-order operations applied with the `PSeq.groupBy` and `PSeq.map` functions follow the same pattern as the original `updateCentroids` function.
The `map` function is the equivalent of `Select` in PLINQ. The aggregate function `PSeq.averageBy` is useful because it replaces the boilerplate code that's necessary in PLINQ, which doesn't have such functionality built in.

### 5.2.5 Parallel arrays in F#

Although the `PSeq` module provides many familiar and useful functional constructs, such as `map` and `reduce`, these functions are inherently limited by the fact that they must act upon sequences and not divisible ranges. Consequently, the functions provided by the `Array.Parallel` module from the F# standard library typically scale much more efficiently when you increase the number of cores in the machine.

Listing 5.12 Parallel sum of prime numbers using F# `Array.Parallel`

```
let len = 10000000
let isPrime n =   ①
    if n < 2 then false
    elif n = 2 then true
    else
        let boundary = int (Math.Floor(Math.Sqrt(float(n))))
        [2..boundary] |> Seq.forall(fun i -> n % i <> 0)

let primeSum =
    [|0..len|]
    |> Array.Parallel.filter (fun x -> isPrime x)   ②
    |> Array.sum
```

The `Array.Parallel` module provides versions of many ordinary higher-order array functions that were parallelized using the TPL `Parallel` class. These functions are generally much more efficient than their `PSeq` equivalents because they operate on contiguous ranges of arrays that are divisible into chunks, rather than on linear sequences. The `Array.Parallel` module provided by the F# standard library includes parallelized versions of several useful aggregate operators, most notably `map`. The `filter` function used here is developed using the `Array.Parallel.choose` function; see the book's source code.

#### Different strategies in data parallelism: vector check

We've covered fundamental programming design patterns that originated with functional programming and are used to process data in parallel quickly. As a refresher, these patterns are shown in table 5.1.

Table 5.1 Parallel data patterns analyzed so far

| **Pattern** | **Definition** | **Pros and cons** |
| --- | --- | --- |
| Divide and Conquer | Recursively breaks down a problem into smaller problems until these become small enough to be solved directly. For each recursive call, an independent task is created to perform a sub-problem in parallel. The most popular example of the Divide and Conquer algorithm is Quicksort. | With many recursive calls, this pattern could create extra overhead associated with parallel processing that saturates the processors. |
| Fork/Join | This pattern aims to split, or fork, a given data set into chunks of work so that each individual chunk is executed in parallel. After each parallel portion of work is completed, the parallel chunks are then merged, or joined, together. The parallel section forks could be implemented using recursion, similar to Divide and Conquer, until a certain task granularity is reached. | This provides efficient load balancing. |
| Aggregate/Reduce | This pattern aims to combine in parallel all the elements of a given data set into a single value, by evaluating tasks on independent processing elements. This is the first level of optimization to consider when parallelizing loops with shared state. | The elements of a data set to be reduced in parallel should satisfy the associative property. Using an associative operator, any two elements of the data set can be combined into one. |

The parallel programming abstractions in table 5.1 can be quickly implemented using the multicore development features available in .NET. Other patterns will be analyzed in the rest of the book.
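As a quick illustration of the Fork/Join row in table 5.1, here's a minimal sketch (my example, not from the book's code) using the TPL:

```
// Illustration only: a two-way Fork/Join sum with Parallel.Invoke.
// The array is forked into two halves summed in parallel, then the partial
// results are joined. Associativity of + makes the split safe.
using System;
using System.Threading.Tasks;

class ForkJoinSum
{
    static long Sum(int[] data, int from, int to)
    {
        long acc = 0;
        for (int i = from; i < to; i++) acc += data[i];
        return acc;
    }

    static void Main()
    {
        int[] data = new int[1_000_000];
        for (int i = 0; i < data.Length; i++) data[i] = i;

        int mid = data.Length / 2;
        long left = 0, right = 0;
        Parallel.Invoke(                       // fork: two independent chunks
            () => left = Sum(data, 0, mid),
            () => right = Sum(data, mid, data.Length));

        Console.WriteLine(left + right);       // join: combine partial results
    }
}
```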
In the next section, we’ll examine the parallel MapReduce pattern. ## 5.3 Parallel MapReduce pattern MapReduce is a pattern introduced in 2004 in the paper “MapReduce: Simplified Data Processing on Large Clusters,” by Jeffrey Dean and Sanjay Ghemawat ([`research.google.com/archive/mapreduce-osdi04.pdf`](https://research.google.com/archive/mapreduce-osdi04.pdf)). MapReduce provides particularly interesting solutions for big data analysis and to crunch massive amounts of data using parallel processing. It’s extremely scalable and is used in some of the largest distributed applications in the world. Additionally, it’s designed for processing and generating large data sets to be distributed across multiple machines. Google’s implementation runs on a large cluster of machines and can process terabytes of data at a time. The design and principles are applicable for both a single machine (single-core) on a smaller scale, and in powerful multicore machines. This chapter focuses on applying data parallelism in a single multicore computer, but the same concepts can be applied for partitioning the work among multiple computers in the network. In chapters 11 and 12, we’ll cover the agent (and actor) programming model, which can be used to achieve such network distribution of tasks. The idea for the MapReduce model (as shown in figure 5.6) is derived from the functional paradigm, and its name originates from concepts known as `map` and `reduce` combinators. Programs written using this more functional style can be parallelized over a large cluster of machines without requiring knowledge of concurrent programming. The actual runtime can then partition the data, schedule, and handle any potential failure.  Figure 5.6 A schematic illustration of the phases of a MapReduce computation. The MapReduce pattern is composed primarily of two steps: map and reduce. The `Map` function is applied to all items and produces intermediate results, which are merged using the `Reduce` function. This pattern is similar to the Fork/Join pattern because after splitting the data in chunks, it applies in parallel the tasks map and reduce independently. In the image, a given data set is partitioned into chunks that can be performed independently because of the absence of dependencies. Then, each chunk is transformed into a different shape using the `Map` function. Each `Map` execution runs simultaneously. As each map chunk operation completes, the result is passed to the next step to be aggregated using the `Reduce` function. (The aggregation can be compared to the `join` step in the Fork/Join pattern.) The MapReduce model is useful in domains where there’s a need to execute a massive number of operations in parallel. Machine learning, image processing, data mining, and distributed sort are a few examples of domains where MapReduce is widely used. In general, the programming model is based upon five simple concepts. The order isn’t a rule and can be changed based on your needs: 1. Iteration over input 2. Computation of key/value pairs from each input 3. Grouping of all intermediate values by key 4. Iteration over the resulting groups 5. Reduction of each group The overall idea of MapReduce is to use a combination of maps and reductions to query a stream of data. To do so, you can map the available data to a different format, producing a new data item in a different format for each original datum. During a `Map` operation you can also reorder the items, either before or after you map them. 
Operations that preserve the number of elements are `Map` operations. If you have many elements, you may want to reduce their number to answer a question. You can filter the input stream by throwing away elements you don't care about, or you can combine elements into a single aggregated element and return only those that provide the answer you seek. Mapping before reducing is one way to do it, but you can also `Reduce` before you `Map`, or even `Reduce`, `Map`, and then `Reduce` even more, and so on. In summary, MapReduce maps (translates data from one format to the other and orders the data) and reduces (filters, groups, or aggregates) the data.

### 5.3.1 The Map and Reduce functions

MapReduce is composed of two main phases:

* `Map` receives the input and performs a `map` function to produce an output of intermediate key/value pairs. The values with the same key are then joined and passed to the second phase.

* `Reduce` aggregates the results from `Map` by applying a function to the values associated with the same intermediate key to produce a possibly smaller set of values.

The important aspect of MapReduce is that the output of the `Map` phase is compatible with the input of the `Reduce` phase. This characteristic leads to functional compositionality.

### 5.3.2 Using MapReduce with the NuGet package gallery

In this section, you'll learn how to implement and apply the MapReduce pattern using a program to download and analyze NuGet packages from the online gallery. NuGet is a package manager for the Microsoft development platform, including .NET, and the NuGet gallery is the central package repository used by all package developers. At the time of writing, there were over 800,000 NuGet packages. The purpose of the program is to rank and determine the five most important NuGet packages, calculating the importance of each by adding its own score to the scores of all its dependencies. Because of the intrinsic relation between MapReduce and FP, Listing 5.13 will be implemented using F# and `PSeq` to support data parallelism. The C# version of the code can be found in the downloadable source code. It's possible to use the same basic idea to find other information, such as the dependencies for a package that you're using, what the dependencies of those dependencies are, and so on. Listing 5.13 defines both the `Map` and `Reduce` functions. The `Map` function transforms a NuGet package input into a key/value pair data structure, where the key is the name of the package and the value is the rank value (`float`). This data structure is defined as a sequence of key/value types because each package could have dependencies, which will be evaluated as part of the total score. The `Reduce` function takes as arguments the name of the package and the sequence of associated scores. This input matches the output of the previous `Map` function.
Listing 5.13 `PageRank` object encapsulating the `Map` and `Reduce` functions

```
type PageRank (ranks:seq<string*float>) =
    let mapCache = Map.ofSeq ranks   ①

    let getRank (package:string) =
        match mapCache.TryFind package with   ②
        | Some(rank) -> rank
        | None -> 1.0

    member this.Map (package:NuGet.NuGetPackageCache) =
        let score = (getRank package.PackageName) /
                        float(package.Dependencies.Length)   ③
        package.Dependencies   ③
        |> Seq.map (fun (Domain.PackageName(name,_),_,_) -> (name, score))

    member this.Reduce (name:string) (values:seq<float>) =
        (name, Seq.sum values)   ④
```

The `PageRank` object encapsulates the `Map` and `Reduce` functions, providing easy access to the same underlying data structure `ranks`. Next, you need to build the core of the program, MapReduce. Using FP style, you can model a reusable MapReduce function, passing the functions as input for both the `Map` and `Reduce` phases. Here is the implementation of `mapF`.

Listing 5.14 `mapF` function for the first phase of the MapReduce pattern

```
let mapF M (map:'in_value -> seq<'out_key * 'out_value>)
            (inputs:seq<'in_value>) =
    inputs
    |> PSeq.withExecutionMode ParallelExecutionMode.ForceParallelism   ①
    |> PSeq.withDegreeOfParallelism M   ②
    |> PSeq.collect (map)   ③
    |> PSeq.groupBy (fst)   ④
    |> PSeq.toList   ⑤
```

The `mapF` function takes as its first parameter an integer value `M`, which determines the level of parallelism to apply. This argument is intentionally positioned first because doing so makes it easier to partially apply the function for reuse with the same value. Inside the body of `mapF`, the degree of parallelism is set using `PSeq.withDegreeOfParallelism M`; the equivalent extension method exists in PLINQ. The purpose of this configuration is to restrict the number of threads that can run in parallel, and it isn't a coincidence that the query is eagerly materialized by the final `PSeq.toList` call. If you omit `PSeq.withDegreeOfParallelism`, the degree of parallelism isn't guaranteed to be enforced. In the case of a multicore single machine, it's sometimes useful to limit the number of running threads per function. In the parallel MapReduce pattern, because `Map` and `Reduce` are executed simultaneously, you might find it beneficial to constrain the resources dedicated to each step. For example, the value `maxThreads`, defined as

```
let maxThreads = max (Environment.ProcessorCount / 2) 1
```

could be used to restrict each of the two MapReduce phases to half of the system threads. The second argument of `mapF` is the core `map` function, which operates on each input value and returns the output sequence of key/value pairs. The type of the output sequence can be different from the type of the inputs. The last argument is the sequence of input values to operate against. After the `map` function, you implement the `reduce` aggregation. This listing shows the implementation of the aggregation function `reduceF`, which runs the second phase and produces the final result.

Listing 5.15 `reduceF` function for the second phase of MapReduce

```
let reduceF R (reduce:'key -> seq<'value> -> 'reducedValues)
                (inputs:('key * seq<'key * 'value>) seq) =
    inputs
    |> **PSeq**.withExecutionMode ParallelExecutionMode.**ForceParallelism**   ①
    |> **PSeq**.withDegreeOfParallelism R   ②
    |> **PSeq**.map (fun (key, items) ->   ③
        items
        |> Seq.map (snd)   ④
        |> reduce key)   ④
    |> **PSeq**.toList
```

The first argument `R` of the `reduceF` function serves the same purpose as the argument `M` in the previous `mapF` function: setting the degree of parallelism.
The second argument is the `reduce` function that operates on each key/values pair of the input parameter. In the case of the NuGet package example, the key is a string with the name of the package, and the sequence of values is the list of ranks associated with the package. The last input argument is the sequence of key/value pairs, which matches the output of the `mapF` function. The `reduceF` function generates the final output. After having defined the functions `map` and `reduce`, the last step is the easy one: putting everything together (in bold).

Listing 5.16 `mapReduce` composed of the `mapF` and `reduceF` functions

```
let mapReduce (inputs:seq<'in_value>)
                (map:'in_value -> seq<'out_key * 'out_value>)
                (reduce:'out_key -> seq<'out_value> -> 'reducedValues)
                M R =
    inputs |> (**mapF** M map >> **reduceF** R reduce)   ①
```

Because the output of the `map` function matches the input of the `reduce` function, you can easily compose them together. The listing shows this functional approach in the implementation of the `mapReduce` function, whose arguments feed the underlying `mapF` and `reduceF` functions with the same meanings as before. The important part of this code is the last line. Using the F# built-in pipe operator (`|>`) and forward composition operator (`>>`), you can put everything together. This code shows how you can now utilize the function `mapReduce` from Listing 5.16 to calculate the NuGet package ranking:

```
let executeMapReduce (ranks:(string*float) seq) =
    let M,R = 10,5
    let data = Data.loadPackages()
    let pg = MapReduce.Task.PageRank(ranks)
    **mapReduce** data (pg.**Map**) (pg.**Reduce**) M R
```

The value `pg` is an instance of the `PageRank` class defined in Listing 5.13, which provides the implementation of both the `map` and `reduce` functions. The arbitrary values `M` and `R` set how many workers to create for each step of the MapReduce. After the implementation of the `mapF` and `reduceF` functions, you compose them to implement a `mapReduce` function that can be conveniently utilized as a new function.

Figure 5.7 Benchmark running the MapReduce algorithm using a quad-core machine with 8 GB of RAM. The algorithms tested are sequential LINQ, parallel F# `PSeq`, and PLINQ with a variant using a tailored partitioner. The parallel version of MapReduce that uses PLINQ runs in 1.136 seconds, which is 38% faster than the sequential version using regular LINQ in C#. The F# `PSeq` performance is almost equivalent to PLINQ, as expected, because they share the same underlying technology. The parallel C# PLINQ with a tailored partitioner is the fastest solution, running in 0.952 seconds, about 18% faster than ordinary PLINQ, and twice as fast as the baseline (the sequential version).

As expected, the serial implementation in figure 5.7 is the slowest one. Because the parallel versions F# `PSeq` and C# PLINQ use the same underlying library, the speed values are almost equivalent. The F# `PSeq` version is a little slower, with higher CPU time, because of the extra overhead induced by the wrapper. The fastest MapReduce is the PLINQ parallel version with a tailored partitioner, which can be found in the source code for this book.
This is the result of the five most important NuGet packages:

```
Microsoft.NETCore.Platforms : 6033.799540
Microsoft.NETCore.Targets : 5887.339802
System.Runtime : 5661.039574
Newtonsoft.Json : 4009.295644
NETStandard.Library : 1876.720832
```

In MapReduce, any form of reduction performed in parallel can produce different results than a serial one if the operation isn't associative.

#### MapReduce and a little math

The associative and commutative properties introduced earlier in this chapter prove the correctness and deterministic behavior of aggregate functions. In parallel and functional programming, the adoption of mathematical patterns is common to guarantee accuracy in the implementation of a program. But a deep knowledge of mathematics isn't necessary. Can you determine the values of *x* in the following equations?

9 + *x* = 12

2 < *x* < 4

If you answered 3 for both, good news: you already know all the math that it takes to write deterministic concurrent programs in functional style using techniques from linear algebra ([`en.wikipedia.org/wiki/Linear_algebra`](https://en.wikipedia.org/wiki/Linear_algebra)).

#### What math can do to simplify parallelism: monoids

The associativity property leads to a common technique known as a monoid ([`wiki.haskell.org/Monoid`](https://wiki.haskell.org/Monoid)), which works with many different types of values in a simple way. The term *monoid* (not to be confused with monad: [`wiki.haskell.org/Monad`](https://wiki.haskell.org/Monad)) comes from mathematics, but the concept is applicable to computer programming without any math knowledge. Essentially, monoids are operations whose output type is the same as the input, and which must satisfy some rules: associativity, identity, and closure. You read about associativity in the previous section. The *identity* property says that there's a neutral element that can be combined with any value without changing that value, so it can be folded into a computation any number of times without affecting the result. The *closure* rule enforces that the input and output types of a given function must be the same. For example, addition takes two numbers as parameters and returns a third number as a result. This rule can be expressed in .NET with a function signature `Func<T, T, T>`, which ensures that all arguments belong to the same type, in opposition to a function signature such as `Func<T1, T2, R>`. In the k-means example, the function `UpdateCentroids` satisfies these laws because the operations used in the algorithm are monoidal—a scary word that hides a simple concept. The key operation is addition (for the reduce step). The addition function takes two numbers and produces output of the same type. In this case, the identity element is 0 (zero), because adding 0 to the result of the operation doesn't change it. Multiplication is also a monoid, with the identity element 1: the value of a number multiplied by 1 does not change. Why is it important that an operation returns a result of the same type as its inputs? Because it lets you chain and compose multiple objects using the monoidal operation, making it simple to introduce parallelism for these operations. The fact that an operation is associative, for example, means you can fold a data structure to reduce a list sequentially.
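Here's a minimal sketch of the three monoid rules for integer addition (my illustration; the assertions only restate the laws):

```
// Illustration only: the monoid rules for integer addition.
// Closure is captured by the Func<T, T, T> shape: two ints in, one int out.
using System;
using System.Diagnostics;

class MonoidRules
{
    static void Main()
    {
        Func<int, int, int> add = (a, b) => a + b;
        int identity = 0;

        // Associativity: grouping doesn't change the result.
        Debug.Assert(add(add(1, 2), 3) == add(1, add(2, 3)));

        // Identity: combining with the neutral element changes nothing.
        Debug.Assert(add(42, identity) == 42);
    }
}
```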
But if you have a monoid, you can reduce a list using a `fold` (`Aggregate`), which can be more efficient for certain operations and also allows for parallelism. To calculate the factorial of the number 8, the multiplication operations running in parallel on a two-core CPU should look something like table 5.2.

Table 5.2 Parallel calculation of the factorial product of the number 8

| | **Core 1** | **Core 2** |
| --- | --- | --- |
| Step 1 | M1 = 1 * 2 | M2 = 3 * 4 |
| Step 2 | M3 = M2 * 5 | M4 = 6 * M1 |
| Step 3 | M5 = M4 * 7 | M6 = 8 * M3 |
| Step 4 | idle | M7 = M6 * M5 |
| Result | | 40320 |

The same result can be achieved using parallel aggregation in either F# or C# to reduce the list of numbers 1 to 8 into a single value:

```
[1..8] |> PSeq.reduce (*)

Enumerable.Range(1, 8).AsParallel().Reduce((a, b) => a * b);
```

Because multiplication is a monoidal operation for the type `integer`, you can be sure that the result of running the operation in parallel is deterministic.

## Summary

* Parallel LINQ and F# `PSeq` both originate from the functional paradigm and are designed for data parallelism, simple code, and high performance. These technologies handle the underlying work of partitioning sequences into smaller chunks, setting the degree of parallelism (by default, the logical processor count), and processing each subsequence individually.

* PLINQ and F# `PSeq` are higher-level abstraction technologies that lie on top of multithreading components. These technologies aim to reduce the time of query execution, engaging the available computer resources.

* The .NET Framework allows tailored techniques to maximize performance in data analysis. Consider value types over reference types to reduce memory problems, which otherwise could provoke a bottleneck due to too many garbage collections.

* Writing pure functions, or functions without side effects, makes it easier to reason about the correctness of your program. Furthermore, because pure functions are deterministic, the output doesn't change when passing the same input. The order of execution doesn't matter, so functions without side effects can easily be executed in parallel.

* Designing with pure functions and decoupling side effects from pure logic are the two basic tenets that functional thinking brings to the forefront.

* Deforestation is the technique of eliminating the generation of intermediate data structures to reduce the size of temporary memory allocation, which benefits the performance of the application. This technique is easily exploitable with the higher-order function `Aggregate` in LINQ, which combines multiple operations, such as `filter` and `map`, in a single step that would otherwise have had an allocation for each operation.

* Writing functions that are associative and commutative permits the implementation of a parallel pattern like Divide and Conquer, Fork/Join, or MapReduce.

* The MapReduce pattern is composed primarily of two steps: map and reduce. The `Map` function is applied to all items and produces intermediate results, which are merged using the `Reduce` function. This pattern is similar to Fork/Join because after splitting the data into chunks, it applies the `Map` and `Reduce` tasks independently in parallel.
# 6 Real-time event streams: functional reactive programming

**This chapter covers**

* Understanding queryable event streams
* Working with Reactive Extensions (Rx)
* Combining F# and C# to make events first-class values
* Processing high-rate data streams
* Implementing a Publisher-Subscriber pattern

We're used to responding to events in our lives daily. If it starts to rain, we get an umbrella. If the daylight in a room begins to dim, we flip the switch to turn on the electric light. The same is true in our applications, where a program must react to (or handle) events caused by something else happening in the application or by a user interacting with it. Almost every program must handle events, whether they're the receipt of an HTTP request for a web page on a server, a notification from your favorite social media platform, a change in your filesystem, or a simple click of a button. Today's challenge for applications isn't reacting to one event, but reacting to a constant high volume of events in (near) real time. Consider the humble smartphone. We depend on these devices to be constantly connected to the internet and continuously sending and receiving data. These multidevice interconnections can be compared to billions of sensors that are acquiring and sharing information, with the need for real-time analysis. In addition, this unstoppable massive stream of notifications continues to flow from the internet fire hose, requiring that the system be designed to handle back-pressure ([`en.wikipedia.org/wiki/Back_pressure`](https://en.wikipedia.org/wiki/Back_pressure)) and notifications in parallel. *Back-pressure* refers to a situation where the event-fetching producer gets too far ahead of the event-processing consumer. This could generate spikes in memory consumption and possibly reserve more system resources for the consumer until the consumer has caught up. More details regarding back-pressure are covered later in the chapter. It's predicted that by 2020 more than 50 billion devices will be connected to the internet. Even more stunning is that this expansion of digital information shows no signs of slowing any time soon! For this reason, the ability to manipulate and analyze high-speed data streams in real time will continue to dominate the field of (big) data analysis and digital information. Numerous challenges exist to using a traditional programming paradigm for the implementation of these kinds of real-time processing systems. What kinds of technologies and tools can you use to simplify the event programming model? How can you concurrently handle multiple events without thinking concurrently? The answers lie with reactive programming.

> In computing, reactive programming is a programming paradigm that maintains a continuous interaction with their environment, but at a speed which is determined by the environment, not the program itself. —Gérard Berry ("Real Time Programming: Special Purpose or General Purpose Languages," Inria (1989), [`mng.bz/br08`](http://mng.bz/br08))

*Reactive programming* is programming with everlasting asynchronous streams of events made simple. On top of that, it combines the benefits of functional programming for concurrency, which you've seen in earlier chapters, with the reactive programming toolkit to make event-driven programming highly beneficial, approachable, and safe. Furthermore, by applying various higher-order operators on streams, you can easily achieve different computational goals.
By the end of this chapter, you'll know how reactive programming avoids the problems that occur when using imperative techniques to build reactive systems. You'll design and implement event-driven applications, coupled with support for asynchronicity, that are responsive, scalable, and loosely coupled.

## 6.1 Reactive programming: big event processing

*Reactive programming*, not to be confused with functional reactive programming, refers to a programming paradigm that focuses on listening to and processing events asynchronously as a data stream, where the availability of new information drives the logic forward rather than having the control flow driven by a thread of execution. A common example of reactive programming is a spreadsheet, where cells contain literal values or formulas such as *C1 = A1 + B1* or, in Excel lingo, `C1 = Sum(A1:B1)`. In this case, the value in the cell C1 is evaluated based on the values in other cells. When the value of one of the other cells A1 or B1 changes, the value of the formula automatically recalculates to update the value of C1, as seen in figure 6.1.

Figure 6.1 This Excel spreadsheet is reactive, meaning that cell C1 reacts to a change of value in either cell A1 or B1 through the formula `Sum(A1:B1)`.

The same principle is applicable for processing data to notify the system when a change of state occurs. Analyzing data collections is a common requirement in software development. In many circumstances, your code could benefit from using a reactive event handler. A reactive event handler provides compositional reactive semantics to express operations, such as `Filter` and `Map`, against events elegantly and succinctly, unlike a regular event handler, which is designed to handle simple scenarios with limited flexibility. The reactive programming approach to event handling is different from the traditional approach because events are treated as streams. This provides the ability to manipulate events effortlessly with different operations, such as filter, map, and merge, in a declarative and expressive way. For example, you might design a web service that filters the event stream to a subset of events based on specified rules. The resulting solution uses reactive programming to capture the intended behaviors by describing the operations in a declarative manner, which is one of the tenets of FP. This is one reason why it's commonly called functional reactive programming; but this term requires further explanation. What is *functional reactive programming* (FRP)? Technically, FRP is a programming paradigm based on *values that change over time*, using a set of simple compositional reactive operators (`behavior` and `event`) that, in turn, are used to build more complex operators. This programming paradigm is commonly used for developing UIs, robotics, and games, and for solving distributed and networked system challenges. Due to the powerful and simplified compositional aspect of FRP, several modern technologies use FRP principles to develop sophisticated systems. For example, the Elm programming language ([`elm-lang.org`](http://elm-lang.org)) and the Haskell library Yampa ([`wiki.haskell.org/Yampa`](https://wiki.haskell.org/Yampa)) are based on FRP. From the industry standpoint, FRP is a set of different but related functional programming technologies combined under the umbrella of event handling.
The confusion is derived from similarity and misrepresentation—using the same words in different combinations:

* *Functional programming* is a paradigm that treats computation as the evaluation of an expression and avoids changing state and mutable data.

* *Reactive programming* is a paradigm for implementing any application with a real-time component. Reactive programming is becoming increasingly important in the context of real-time stream processing for big data analytics.

The benefits of reactive programming are increased use of computing resources on multicore and multi-CPU hardware through a straightforward and maintainable approach for dealing with asynchronous and non-blocking computation and I/O. Similarly, FRP offers the right abstraction to make event-driven programming highly beneficial, approachable, safe, and composable. These aspects let you build real-time, reactive programs with clean and readable code that's easy to maintain and expand, all without sacrificing performance. The reactive programming concept is based on non-blocking asynchronous operations, inverting control from "asking" for events to "waiting" for them, as shown in figure 6.2. This principle is called *inversion of control* ([`martinfowler.com/bliki/InversionOfControl.html`](http://martinfowler.com/bliki/InversionOfControl.html)), also referred to as the Hollywood Principle (don't call me, I'll call you).

Figure 6.2 Real-time reactive programming promotes non-blocking (asynchronous) operations that are designed to deal with high-volume, high-velocity event sequences over time by handling multiple events simultaneously, possibly in parallel.

Reactive programming aims to operate on a high-rate sequence of events over time, simplifying the concurrent aspect of handling multiple events simultaneously (in parallel). Writing applications that are capable of reacting to events at a high rate is becoming increasingly important. Figure 6.3 shows a system that's processing a massive number of tweets per minute. These messages are sent by literally millions of devices, representing the event sources, into the system that analyzes, transforms, and then dispatches the tweets to those registered to read them. It's common to annotate a tweet message with a hashtag to create a dedicated channel and group of interest. The system uses a hashtag to filter and partition the notifications by topic.

Figure 6.3 Millions of devices represent a rich source of events, capable of sending a massive number of tweets per minute. A real-time reactive system can handle the massive quantities of tweets as an event stream by applying non-blocking (asynchronous) operations (`merge`, `filter`, and `map`) and then dispatching the tweets to the listeners (consumers).

Every day, millions of devices send and receive notifications that could overflow and potentially crash the system if it isn't designed to handle such a large number of sustained events. How would you write such a system? A close relationship exists between FP and reactive programming. Reactive programming uses functional constructors to achieve composable event abstraction. As previously mentioned, it's possible to exploit higher-order operations on events such as `map`, `filter`, and `reduce`. The term FRP is commonly used to refer to reactive programming, but this isn't completely correct.

## 6.2 .NET tools for reactive programming

The .NET Framework supports events based on a delegate model.
An event handler for a subscriber registers a chain of events and triggers the events when called. With an imperative programming paradigm, event handlers need mutable state to keep track of subscriptions and register callbacks, which wrap the behavior inside a function and limit composability. Here's a typical example of a button-click event registration that uses an event handler and an anonymous lambda:

```
public MainWindow()
{
    myButton.Click **+=** new RoutedEventHandler(myButton_Click);
    myButton.Click **+=** (sender, args) => MessageBox.Show("Bye!");
}

void myButton_Click(object sender, RoutedEventArgs e)
{
    MessageBox.Show("Hello!");
}
```

This pattern is the primary reason that .NET events are difficult to compose, almost impossible to transform, and, ultimately, the reason for accidental memory leaks. In general, using the imperative programming model requires a shared mutable state for communication between events, which could potentially hide undesirable side effects. When implementing complex event combinations, the imperative programming approach tends to be convoluted. Additionally, providing an explicit callback function limits your options to express code functionality in a declarative style. The result is a program that's hard to understand and, over time, becomes impossible to expand and to debug. Furthermore, .NET events don't provide support for concurrent programs to raise an event on a separate thread, making them a poor fit for today's reactive and scalable applications. Events in .NET are the first step toward reactive programming. Events have been part of the .NET Framework since the beginning. In the early days of the .NET Framework, events were primarily used when working with graphical user interfaces (GUIs). Today, their potential is being explored more fully. With the .NET Framework, Microsoft introduced a way to reason about and treat events as first-class values by using the F# `Event` (and `Observable`) module and .NET Reactive Extensions (Rx). Rx lets you compose events easily and declaratively in a powerful way. Additionally, you can handle events as a data stream capable of encapsulating logic and state, ensuring that your code is without side effects and mutable variables. Now your code can fully embrace the functional paradigm, which focuses on listening to and processing events asynchronously.

### 6.2.1 Event combinators—a better solution

Currently, most systems get a callback and process these events when and as they happen. But if you consider events as a stream, similar to lists or other collections, then you can apply the techniques for working with collections to processing events, which eliminates the need for callbacks. The F# list comprehension, introduced in chapter 5, provides a set of higher-order functions, such as `filter` and `map`, for working with lists in a declarative style:

```
let squareOfDigits (chars:char list) =
    chars
    |> List.filter (fun c -> Char.IsDigit c && int c % 2 = 0)
    |> List.map (fun n -> int n * int n)
```

In this code, the function `squareOfDigits` takes a list of characters and returns the squares of the even digits in the list. The first function, `filter`, returns a list with the elements for which a given predicate is true; in this case, the characters that are even digits. The second function, `map`, transforms each element `n` passed into an integer and calculates its square value `n * n`. The pipeline operator (`|>`) sequences the operations as a chain of evaluations.
In other words, the result of the operation on the left side of the expression is used as an argument for the next operation in the pipeline. The same code can be translated into LINQ to be more C# friendly:

```
List<int> SquareOfDigits(List<char> chars) =>
    chars.Where(c => char.IsDigit(c) && char.GetNumericValue(c) % 2 == 0)
         .Select(c => (int)c * (int)c).ToList();
```

This expressive programming style is a perfect fit for working with events. Unlike C#, F# has the advantage of treating events intrinsically (natively) as first-class values, which means you can pass them around like data. Additionally, you can write a function that takes an event as an argument and generates a new event. Consequently, an event can be passed into functions with the pipe operator (`|>`) like any other value. This design and method of using events in F# is based on combinators, which feel like programming with list comprehensions against sequences. The event combinators are exposed in the F# module `Event`, which can be used to compose events:

```
textBox.KeyPress
|> **Event**.filter (fun c -> Char.IsDigit c.KeyChar && int c.KeyChar % 2 = 0)
|> **Event**.map (fun n -> int n.KeyChar * int n.KeyChar)
```

In this code, the `KeyPress` keyboard event is treated as a stream, which is filtered to ignore events that aren't interesting, so that the final computation occurs only when the keys pressed are digits. The biggest benefit of using higher-order functions is a cleaner *separation of concerns*.^(1) C# can reach the same level of expressiveness and compositionality using .NET Rx, as briefly described later in this chapter.

### 6.2.2 .NET interoperability with F# combinators

Using F# event combinators, you can write code using an algebra of events that aims to separate complex events from simple ones. Is it possible to take advantage of the F# event combinators module to write more declarative C# code? Yes. Both .NET programming languages F# and C# use the same *common language runtime* (CLR), and both are compiled into an intermediate language (IL) that conforms to the *Common Language Infrastructure* (CLI) specification. This makes it possible to share the same code. In general, events are understood by all .NET languages, but F# events are used as first-class values and, consequently, require only a small amount of extra attention. To ensure that the F# events can be used by other .NET languages, the compiler must be notified by decorating the event with the `[<CLIEvent>]` attribute. It's convenient and efficient to use the intrinsic compositional aspect of F# event combinators to build sophisticated event handlers that can be consumed in C# code. Let's see an example to better understand how F# event combinators work and how they can easily be consumed by other .NET programming languages. Listing 6.1 shows how to implement a simple game to guess a secret word using F# event combinators. The code registers two events: the `KeyPress` event from the WinForms control passed into the constructor of `KeyPressedEventCombinators`, and the `Elapsed` time event from `System.Timers.Timer`. The user enters text—in this case, only letters are allowed (no digits)—until either the secret word is guessed or the given time `interval` has elapsed. When the user presses a key, the filter and event combinators transform the event source into a new event through a *chain of expressions*.
If the time expires before the secret word is guessed, a notification triggers a "Game Over" message; otherwise, it triggers the "You Won!" message when the secret word matches the input.

Listing 6.1 F# Event combinator to manage key-down events

```
type KeyPressedEventCombinators(secretWord, interval,
➥ control:#System.Windows.Forms.Control) =
    let evt =
        let timer = new System.Timers.Timer(float interval)   ①
        let timeElapsed = timer.Elapsed |> Event.map(fun _ -> 'X')   ②
        let keyPressed =
            control.KeyPress
            |> Event.filter(fun kd -> Char.IsLetter kd.KeyChar)
            |> Event.map(fun kd -> Char.ToLower kd.KeyChar)   ③
        timer.Start()   ①

        keyPressed
        |> Event.merge timeElapsed   ④
        |> Event.scan(fun acc c ->
            if c = 'X' then "Game Over"
            else
                let word = sprintf "%s%c" acc c
                if word = secretWord then "You Won!"
                else word ) String.Empty   ⑤

    [<CLIEvent>]
    member this.OnKeyDown = evt   ⑥
```

The type `KeyPressedEventCombinators` has a constructor parameter `control`, which refers to any object that derives from `System.Windows.Forms.Control`. The `#` annotation in F# is called a flexible type, which indicates that a parameter is compatible with a specified base type ([`mng.bz/FSp2`](http://mng.bz/FSp2)). The `KeyPress` event is linked to the `System.Windows.Forms.Control` base control passed into the type constructor, and its event stream flows into the F# event-combinators pipeline for further manipulation. The `OnKeyDown` event is decorated with the attribute `[<CLIEvent>]` to be exposed (published) and visible to other .NET languages. In this way, the event can be subscribed to and consumed from C# code, obtaining reactive programmability by referencing the F# library project. Figure 6.4 presents the F# event-combinators pipeline, where the `KeyPress` event stream runs through the series of functions linked as a chain.

Figure 6.4 An event-combinator pipeline showing how two event flows manage their own set of events before being merged and passed into the accumulator. When a key is pressed on a WinForms control, the `filter` event checks whether the key pressed is a letter, and then `map` retrieves the lowercase version of that letter to scan. When the time elapses on the timer, the `map` operator passes an 'X' as in "no value" to the `scan` function.

The event-combinator chain in figure 6.4 is complex, but it demonstrates the simplicity of expressing such a convoluted code design using events as first-class values. The F# event combinators raise the level of abstraction to facilitate higher-order operations for events, which makes the code more readable and easier to understand when compared to an equivalent program written in imperative style. Implementing the program using the typical imperative style requires creating two different events that communicate the state of the timer and maintain the state of the text with a shared mutable state. The functional approach with event combinators removes the need for a shared mutable state; moreover, events are composable. To summarize, the main benefits of using F# event combinators are:

* *Composability*—You can define events that capture complex logic from simpler events.

* *Declarative*—The code written using F# event combinators is based on functional principles; therefore, event combinators express what to accomplish, rather than how to accomplish a task.

* *Interoperability*—F# event combinators can be shared across .NET languages so the complexity can be hidden in a library.
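To make the interoperability point concrete, here's a sketch of how a C# WinForms project referencing the F# library might consume the type from listing 6.1 (the form, control, and label names are hypothetical):

```
// Illustration only: consuming the F# combinator type from C#.
// [<CLIEvent>] exposes OnKeyDown as an ordinary .NET event, so the C# side
// subscribes with += and never sees the combinator pipeline inside.
public partial class GameForm : System.Windows.Forms.Form
{
    public GameForm()
    {
        InitializeComponent();

        // The game state ("secret", elapsed time, key presses) lives in F#.
        var game = new KeyPressedEventCombinators("secret", 30000, this.textBox);
        game.OnKeyDown += (sender, word) => this.label.Text = word;
    }
}
```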
## 6.3 Reactive programming in .NET: Reactive Extensions (Rx)

The .NET Rx library facilitates the composition of asynchronous event-based programs using observable sequences. Rx combines the simplicity of LINQ-style semantics for manipulating collections with the power of asynchronous programming models such as the clean async/await patterns from .NET 4.5. This powerful combination enables a toolset that lets you treat event streams using the same simple, composable, and declarative style used for data collections (`List` and `Array`, for example). Rx offers a domain-specific language (DSL) that provides a significantly simpler and more fluent API for handling complex, asynchronous event-based logic. Rx can be used either to develop a responsive UI or to increase scalability in a server-side application. In a nutshell, Rx is a set of extensions built for the `IObservable<T>` and `IObserver<T>` interfaces that provide a generalized mechanism for push-based notifications, based on the Observer pattern from the Gang of Four (GoF) book. The Observer design pattern is based on events, and it's one of the most common patterns in OOP. This pattern publishes changes made to an object's state (the observable) to other objects (the observers) that subscribe to notifications describing any changes to that object (shown in figure 6.5).

Figure 6.5 The original Observer pattern from the GoF book

Using GoF terminology, the `IObservable` interfaces are *subjects*, and the `IObserver` interfaces are *observers*. These interfaces, introduced in .NET 4.0 as part of the `System` namespace, are an important component in the reactive programming model. Here's the definition for both the `IObserver` and `IObservable` interface signatures in C#:

```
public interface IObserver<T>
{
    void OnCompleted();
    void OnError(Exception exception);
    void OnNext(T value);
}

public interface IObservable<T>
{
    IDisposable Subscribe(IObserver<T> observer);
}
```

These interfaces implement the Observer pattern, which allows Rx to create an `observable` from existing .NET CLR events. Figure 6.6 attempts to clarify the original Unified Modeling Language (UML) diagram for the Observer pattern from the GoF book.

Figure 6.6 The Observer pattern is based on an object called `Subject`, which maintains a list of dependencies (called observers) and automatically notifies the observers of any change of state to `Subject`. This pattern defines a one-to-many relationship between the observer subscribers, so that when an object changes state, all its dependencies are notified and updated automatically.

The `IObservable<T>` functional interface ([www.lambdafaq.org/what-is-a-functional-interface](http://www.lambdafaq.org/what-is-a-functional-interface)) declares only the method `Subscribe`. When an observer calls this method, it registers to be notified of new items, which are published through the `IObserver<T>.OnNext` method. The `IObservable` interface, as the name implies, can be considered a source of data that's constantly observed and that automatically notifies all registered observers of any state changes. Similarly, notifications for errors and completion are published through the `IObserver<T>.OnError` and `IObserver<T>.OnCompleted` methods, respectively. The `Subscribe` method returns an `IDisposable` object, which acts as a handle for the subscribed observer. When the `Dispose` method is called, the corresponding observer is detached from the `Observable`, and it stops receiving notifications.
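Here's a minimal sketch of this contract in action (my example, assuming the System.Reactive NuGet package for `Observable.Range`):

```
// Illustration only: a hand-rolled observer subscribed to a ready-made observable.
using System;
using System.Reactive.Linq;

class ConsoleObserver : IObserver<int>
{
    public void OnNext(int value) => Console.WriteLine($"OnNext: {value}");
    public void OnError(Exception error) => Console.WriteLine($"OnError: {error.Message}");
    public void OnCompleted() => Console.WriteLine("OnCompleted");
}

class Program
{
    static void Main()
    {
        IObservable<int> numbers = Observable.Range(1, 3);

        // Subscribe returns an IDisposable handle; disposing it detaches the observer.
        using (IDisposable subscription = numbers.Subscribe(new ConsoleObserver()))
        {
            // Range completes synchronously here, printing 1, 2, 3, then OnCompleted.
        }
    }
}
```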
In summary:

* `IObserver<T>.OnNext` supplies the observer with new data or state information.

* `IObserver<T>.OnError` indicates that the provider has experienced an error condition.

* `IObserver<T>.OnCompleted` indicates that the provider has finished sending notifications to observers.

The same interfaces are used as a base definition for the F# `IEvent<'a>` type, which is the interface used to implement the F# event combinators previously discussed. As you can see, the same principles are applied with a slightly different approach to achieve the same design. The ability to code multiple asynchronous event sources is the main advantage of Rx.

### 6.3.1 From LINQ/PLINQ to Rx

The .NET LINQ/PLINQ query providers, as discussed in chapter 5, operate as a mechanism against an in-memory sequence. Conceptually, this mechanism is based on a pull model, which means that the items of the collections are pulled from the query during its evaluation. This behavior is represented by the iterator pattern of `IEnumerable<T>`/`IEnumerator<T>`, which can block while it's waiting for data to iterate. By contrast, Rx treats events as a data stream by defining the query to react over time as events arrive. This is a push model, where the events arrive and autonomously travel through the query. Figure 6.7 shows both models.

Figure 6.7 Push vs. pull model. The `IEnumerable/IEnumerator` pattern is based on the pull model, which asks for new data from the source. Alternatively, the `IObservable/IObserver` pattern is based on the push model, which receives a notification when new data is available to send to the consumer. In the reactive case, the application is passive and causes no blocking in the data-retrieval process.

### 6.3.2 IObservable: the dual IEnumerable

The Rx push-based event model is abstracted by the `IObservable<T>` interface, which is the dual of the `IEnumerable<T>` interface.^(2) While the term *duality* can sound daunting, it's a simple and powerful concept. You can compare duality to the two sides of a coin, where the opposite side can be inferred from the one exposed. In the context of computer science, this concept has been exploited by De Morgan's Law,^(3) which achieves the duality between conjunction `&&` (AND) and disjunction `||` (OR) to prove that negation distributes over both conjunction and disjunction:

```
!(a || b) == !a && !b
!(a && b) == !a || !b
```

Just as LINQ exposes a set of extension methods for the `IEnumerable` interface to implement a pull-based model over collections, Rx exposes a set of extension methods for the `IObservable` interface to implement a push-based model over events. Figure 6.8 shows the dual relationship between these interfaces.

Figure 6.8 Dual relationship between the `IObserver` and `IEnumerator` interfaces, and the `IObservable` and `IEnumerable` interfaces. This dual relationship is obtained by reversing the arrow in the functions, which means swapping the input and output.

As figure 6.8 shows, the `IObservable` and `IObserver` interfaces are obtained by reversing the arrow of the corresponding `IEnumerable` and `IEnumerator` interfaces. Reversing the arrow means swapping the input and output of a method. For example, the `Current` property of the `IEnumerator` interface has this signature:

```
Unit (or void in C#) -> get 'a
```

Reversing the arrow of this property, you can obtain its dual: `Unit <- set 'a`.
This signature in the reciprocal `IObserver` interface matches the `OnNext` method, which has the following signature:

```
set 'a -> Unit (or void in C#)
```

The `GetEnumerator` function takes no arguments and returns `IEnumerator<T>`, which returns the next item in the list through the `MoveNext` and `Current` members. Reversing `GetEnumerator` in the same way yields the `Subscribe` method of `IObservable`, which, instead of being traversed, pushes data into the subscribed `IObserver` by invoking its methods.

### 6.3.3 Reactive Extensions in action

Combining existing events is an essential characteristic of Rx, which permits a level of abstraction and compositionality that's otherwise impossible to achieve. In .NET, events are one form of an asynchronous data source that can be consumed by Rx. To convert existing events into observables, Rx takes an event and returns an `EventPattern` object, which contains the sender and event arguments. For example, a key-pressed event is converted into a reactive observable (in bold):

```
Observable.**FromEventPattern**<KeyPressEventArgs>(this.textBox,
    nameof(this.textBox.KeyPress));
```

As you can see, Rx lets you handle events in a rich and reusable form. Let's put the Rx framework into action by implementing the C# equivalent of the secret word game previously defined using the F# event combinators type `KeyPressedEventCombinators`. This listing shows the implementation using this pattern and the corresponding reactive framework.

Listing 6.2 Rx `KeyPressedEventCombinators` in C#

```
var timer = new System.Timers.Timer(timerInterval);
var timerElapsed = Observable.FromEventPattern<ElapsedEventArgs>
        (timer, "Elapsed").Select(_ => 'X');   ①
var keyPressed = Observable.FromEventPattern<KeyPressEventArgs>
        (this.textBox, nameof(this.textBox.KeyPress))
    .Select(kd => Char.ToLower(kd.EventArgs.KeyChar))
    .Where(c => Char.IsLetter(c));   ②

timer.Start();

timerElapsed
    .Merge(keyPressed)   ③
    .Scan(String.Empty, (acc, c) =>   ④
    {
        if (c == 'X') return "Game Over";
        else
        {
            var word = acc + c;
            if (word == secretWord) return "You Won!";
            else return word;
        }
    })
    .Subscribe(value => this.label.BeginInvoke(
        (Action)(() => this.label.Text = value)));
```

The `Observable.FromEventPattern` method creates a link between the .NET event and the Rx `IObservable`, which wraps both `Sender` and `EventArgs`. In the listing, the imperative C# events for handling the key pressed (`KeyPressEventArgs`) and the elapsed timer (`ElapsedEventArgs`) are transformed into observables and then merged to be treated as a whole stream of events. Now it's possible to construct all of the event handling as a single and concise chain of expressions.

### 6.3.4 Real-time streaming with Rx

An *event stream* is a channel on which a sequence of ongoing events, ordered by time, arrives as values. Streams of events come from diverse sources, such as social media, the stock market, smartphones, or a computer mouse. Real-time stream processing aims to consume a live data stream that can be shaped into other forms. Consuming this data, which in many cases is delivered at a high rate, can be overwhelming, like drinking directly from a fire hose. Take, for example, the analysis of stock prices that continually change and then dispatching the result to multiple consumers, as shown in figure 6.9. The Rx framework fits well in this scenario because it handles multiple asynchronous data sources while delivering high-performance operations to combine, transform, and filter any of those data streams.
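As a sketch of that scenario (illustrative only; the ticker values are simulated rather than real market data, and the System.Reactive NuGet package is assumed), a single shared price stream can feed several subscribers:

```
// Illustration only: one hot, shared stock-price stream with two observers.
using System;
using System.Reactive.Linq;

class StockStreamDemo
{
    static void Main()
    {
        var random = new Random();

        // Simulate a tick every 100 ms; Publish/RefCount shares one underlying
        // subscription among all observers instead of one stream per observer.
        IObservable<double> prices = Observable
            .Interval(TimeSpan.FromMilliseconds(100))
            .Select(_ => 100.0 + random.NextDouble() * 10.0)
            .Publish()
            .RefCount();

        using (prices.Subscribe(p => Console.WriteLine($"dashboard: {p:F2}")))
        using (prices.Where(p => p > 108.0)
                     .Subscribe(p => Console.WriteLine($"ALERT, price spike: {p:F2}")))
        {
            Console.ReadLine();   // let the stream run until Enter is pressed
        }
    }
}
```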
At its core, Rx uses the `IObservable<T>` interface to maintain a list of dependent `IObserver<T>` interfaces that are notified automatically of any event or data change.

Figure 6.9 Event streams from different sources push data to an event transformer, which applies higher-order operations and then notifies the subscribed observers.

### 6.3.5 From events to F# observables

As you may recall, F# uses events for configurable callback constructs. In addition, it supports an alternative and more advanced mechanism for configurable callbacks that's more compositional than events. The F# language treats .NET events as values of type `IEvent<'T>`, which inherits from the interface `IObservable<'T>`, the same type used by Rx. For this reason, the main F# assembly, `FSharp.Core`, already provides an `Observable` module that exposes a set of useful functions over values of the `IObservable` interface. This can be considered a subset of Rx. For example, in the following code snippet, the F# observables (in bold) are used to handle the keypress and timer events from the `KeyPressedEventCombinators` example (listing 6.1):

```
let timeElapsed =
    timer.Elapsed |> Observable.map(fun _ -> 'X')

let keyPressed =
    control.KeyPress
    |> Observable.filter(fun kd -> Char.IsLetter kd.KeyChar)
    |> Observable.map(fun kd -> Char.ToLower kd.KeyChar)

let disposable =
    keyPressed
    |> **Observable**.merge timeElapsed
    |> **Observable**.scan(fun acc c ->
        if c = 'X' then "Game Over"
        else
            let word = sprintf "%s%c" acc c
            if word = secretWord then "You Won!"
            else word ) String.Empty
    |> **Observable**.subscribe(fun text -> printfn "%s" text)
```

It's possible to choose (and use) either `Observable` or `Event` when using F# to build reactive systems; but to avoid memory leaks, the preferred choice is `Observable`. When using the F# `Event` module, composed events are attached to the original event and have no unsubscribe mechanism, which can lead to memory leaks. Instead, the `Observable` module provides the `subscribe` operator to register a callback function. This operator returns an `IDisposable` object that can be used to stop event-stream processing and to de-register all subscribed observable (or event) handlers in the pipeline with one call of the `Dispose` method.

## 6.4 Taming the event stream: Twitter emotion analysis using Rx programming

In this age of digital information, where billions of devices are connected to the internet, programs must correlate, merge, filter, and run real-time analytics. The speed of processing data has moved into the realm of real-time analytics, reducing latency to virtually zero when accessing information. Reactive programming is a superb approach for handling high-performance requirements because it's concurrency friendly and scalable, and it provides a composable asynchronous data-processing semantic. It's estimated that in the United States there's an average of 24 million tweets per hour, amounting to almost 7,000 messages per second. This is a massive quantity of data to evaluate and presents a serious challenge for consuming such a high-traffic stream. Consequently, a system should be designed to tame the occurrence of backpressure. In the case of Twitter, for example, backpressure could be generated by a consumer of the live data stream that can't cope with the rate at which the producers emit events. The F# example in figure 6.10 illustrates a real-time analysis stream for determining the current feeling (emotion) of tweets published in the United States.
Figure 6.10 The Twitter messages push a high-rate event stream to the consumer, so it’s important to have tools like Rx to tame the continuous burst of notifications. First, the stream is throttled; then the messages are filtered, analyzed, and grouped by emotions. The result is a data stream from the incoming tweets that represents the latest status of emotions, whose values constantly update a chart and notify the subscribed observers.

This example uses F# to demonstrate the existing built-in support for observables, which is missing in C#. But the same functionality can be reproduced in C#, either using .NET Rx or by referencing and consuming an F# library, where the code exposes the implemented observable. The analysis of the stream of tweets is performed by consuming and extracting the information from each message. Emotional analysis is performed using the Stanford CoreNLP library. The result of this analysis is sent to a live animated chart that takes `IObservable` as input and automatically updates the graph as the data changes. The following listing shows the emotion analysis function and the settings to enable the Stanford CoreNLP library.

Listing 6.3 Evaluating a sentence’s emotion using the CoreNLP library

```
let properties = Properties()
properties.setProperty("annotators",
    "tokenize,ssplit,pos,parse,sentiment") |> ignore
IO.Directory.SetCurrentDirectory(jarDirectory)
let stanfordNLP = StanfordCoreNLP(properties)    ①

type Emotion =
    | Unhappy
    | Indifferent
    | Happy    ②

let getEmotionMeaning value =
    match value with
    | 0 | 1 -> Unhappy
    | 2 -> Indifferent
    | 3 | 4 -> Happy    ③

let evaluateEmotion (text:string) =
    let annotation = Annotation(text)
    stanfordNLP.annotate(annotation)
    let emotions =
        let emotionAnnotationClassName =
            SentimentCoreAnnotations.SentimentAnnotatedTree().getClass()
        let sentences =
            annotation.get(CoreAnnotations.SentencesAnnotation().getClass())
            :?> java.util.ArrayList
        [ for s in sentences ->
            let sentence = s :?> Annotation
            let sentenceTree = sentence.get(emotionAnnotationClassName) :?> Tree
            let emotion = NNCoreAnnotations.getPredictedClass(sentenceTree)
            getEmotionMeaning emotion]
    emotions.[0]    ④
```

In the code, the F# DU defines different emotion levels (case values): `Unhappy`, `Indifferent`, and `Happy`. These case values compute the distribution percentage among the tweets. The function `evaluateEmotion` combines the text analysis from the Stanford library and returns the resulting case value (emotion). Note that the annotator registered with CoreNLP is named `sentiment`; the code maps its predicted score of 0 to 4 onto the three emotion cases.

To retrieve the stream of tweets, I used the Tweetinvi library ([github.com/linvi/tweetinvi](https://github.com/linvi/tweetinvi)). It provides a well-documented API and, more importantly, it’s designed to run streams concurrently while managing multithreaded scenarios. You can download and install this library from the NuGet package `TweetinviAPI`. This listing shows how to create an instance for the Tweetinvi library and how to access the settings to enable interaction with Twitter.

Listing 6.4 Settings to enable the Tweetinvi library

```
let consumerKey = "<your Key>"
let consumerSecretKey = "<your secret key>"
let accessToken = "<your access token>"
let accessTokenSecret = "<your secret access token>"

let cred = new TwitterCredentials(consumerKey, consumerSecretKey,
                accessToken, accessTokenSecret)
let stream = Stream.CreateSampleStream(cred)
stream.FilterLevel <- StreamFilterLevel.Low
```

This straightforward code creates an instance of the Twitter stream.
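For reference, the emotion classification is trivial to mirror on the C# side when consuming this F# library. The following is an illustrative sketch (the enum and class names are hypothetical, not part of the book’s library):

```
// Hypothetical C# mirror of the F# Emotion DU and getEmotionMeaning function
public enum Emotion { Unhappy, Indifferent, Happy }

public static class EmotionClassifier
{
    // Maps the CoreNLP predicted class (0..4) onto an emotion level
    public static Emotion GetEmotionMeaning(int score)
    {
        if (score <= 1) return Emotion.Unhappy;      // 0 or 1
        if (score == 2) return Emotion.Indifferent;
        return Emotion.Happy;                        // 3 or 4
    }
}
```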
The core of the Rx programming is in the following listing (highlighted in bold), where Rx and the F# `Observable` module are used in combination to handle and analyze the event stream.

Listing 6.5 Observable pipeline to analyze tweets

```
let emotionMap =
    [(Unhappy, 0)
     (Indifferent, 0)
     (Happy, 0)] |> Map.ofSeq

let observableTweets =
    stream.TweetReceived    ①
    |> Observable.**throttle**(TimeSpan.FromMilliseconds(100.))    ②
    |> Observable.**filter**(fun args ->
        args.Tweet.Language = Language.English)    ③
    |> Observable.**groupBy**(fun args ->
        evaluateEmotion args.Tweet.FullText)    ④
    |> Observable.**selectMany**(fun args ->
        args |> Observable.map(fun i ->
            (args.Key, (max 1 i.Tweet.FavoriteCount))))    ⑤
    |> Observable.**scan**(fun sm (key,count) ->
        match sm |> Map.tryFind key with
        | Some(v) -> sm |> Map.add key (v + count)
        | None -> sm) emotionMap    ⑥
    |> Observable.**map**(fun sm ->
        let total = sm |> Seq.sumBy(fun v -> v.Value)    ⑦
        sm |> Seq.map(fun k ->
            let percentageEmotion = ((float k.Value) * 100.) / (float total)
            let labelText = sprintf "%A - %.2f%%" (k.Key) percentageEmotion
            (labelText, percentageEmotion)))
```

The `observableTweets` pipeline yields an `IObservable`; subscribing to it returns an `IDisposable`, which is used to stop listening to the tweets and remove the subscription from the subscribed observable. Tweetinvi exposes the event handler `TweetReceived`, which notifies the subscribers when a new tweet has arrived. The observables are combined as a chain to form the `observableTweets` pipeline. Each step returns a new observable that listens to the original observable and then triggers the resulting event from the given function.

The first step in the observable channel is managing the backpressure, which is a result of the high rate of arriving events. When writing Rx code, be aware that it’s possible for the process to be overwhelmed when the event stream comes in too quickly. In figure 6.11, the system on the left has no problem processing the incoming event streams because the frequency of notifications over time has a sustainable throughput (desired flow). The system on the right struggles to keep up with a huge number of notifications (backpressure) that it receives over time, which could potentially collapse the system. In this case, the system responds by throttling the event streams to avoid failure. The result is a different rate of notifications between the observable and an observer.

Figure 6.11 Backpressure could negatively affect the responsiveness of a system, but it’s possible to reduce the rate of the incoming events and keep the system healthy by using the `throttle` function to manage the different rates between an observable and an observer.

To avoid the problem of backpressure, the `throttle` function provides a layer of protection that controls the rate of messages, preventing them from flowing too quickly:

```
stream.TweetReceived
|> Observable.**throttle**(TimeSpan.FromMilliseconds(100.))
```

The `throttle` function reduces a rapid fire of data down to a subset, corresponding to a specific cadence (rhythm), as shown in figures 6.9 and 6.10. `Throttle` extracts the last value from a burst of data in an observable sequence by ignoring any value that’s followed by another value in less than the specified time period. In listing 6.5, the frequency of event propagation was throttled to no more than once every 100 ms.
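To observe the throttle semantics in isolation, here’s a small self-contained C# sketch (assuming the System.Reactive NuGet package), where a rapid burst of values collapses to its last element once the stream stays quiet for the specified period:

```
using System;
using System.Reactive.Linq;
using System.Reactive.Subjects;
using System.Threading;

class ThrottleSketch
{
    static void Main()
    {
        var subject = new Subject<int>();
        using (subject.Throttle(TimeSpan.FromMilliseconds(100))
                      .Subscribe(x => Console.WriteLine($"emitted: {x}")))
        {
            for (int i = 1; i <= 5; i++)
                subject.OnNext(i);   // rapid burst: 1..5 arrive back to back
            Thread.Sleep(200);       // quiet period elapses; only 5 is emitted
            subject.OnNext(42);      // an isolated value
            Thread.Sleep(200);       // quiet period elapses; 42 is emitted
        }
    }
}
```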
The next step in the pipeline is filtering events that aren’t relevant (the command is in bold):

```
|> Observable.**filter**(fun args -> args.Tweet.Language = Language.English)
```

This `filter` function ensures that only the tweets written in English are processed. The `Tweet` object, from the tweet message, has a series of properties that can be accessed, including the sender of the message, the hashtag, and the coordinates (*location*).

Next, the Rx `groupBy` operator provides the ability to partition the sequence into a series of observable groups related to a selector function. Each of these sub-observables corresponds to a unique key value, containing all the elements that share that same key value, the way it does in LINQ and in SQL:

```
|> Observable.groupBy(fun args -> evaluateEmotion args.Tweet.FullText)
|> Observable.selectMany(fun args ->
    args |> Observable.map(fun i -> (args.Key, i.Tweet.FavoriteCount)))
```

In this case, the key-value emotion partitions the event stream. The function `evaluateEmotion`, which behaves as a group selector, computes and classifies the emotion for each incoming message. Each nested observable can have its own unique operation; the `selectMany` operator is used to further subscribe these groups of observables by flattening them into one. Then, using the `map` function, the sequence is transformed into a new sequence of pairs (tuples) consisting of the `Tweet-Emotion` value and the count of how many times the tweet has been liked (or favored).

After having been partitioned and analyzed, the data must be aggregated into a meaningful format. The observable `scan` function does this by pushing the result of each call to the `accumulator` function. The returned observable will trigger notifications for each computed state value, as shown in figure 6.12.

Figure 6.12 The `aggregate` function returns a single value that is the accumulation of each value from running the given function (`x,y`) against the initial accumulator `0`. The `scan` function returns a value for each item in the collection, which is the result of performing the given function against the accumulator in the current iteration.

The `scan` function is like `fold`, or the LINQ `Aggregate`, but instead of returning a single value, it returns the intermediate evaluations resulting from each iteration (as shown in bold in the following code snippet). Moreover, it satisfies the functional paradigm, maintaining state in an immutable fashion. The aggregate functions (such as `scan` and `fold`) are described by the generic concept of *catamorphism* ([wiki.haskell.org/Catamorphisms](https://wiki.haskell.org/Catamorphisms)) in FP:

```
< code here that passes an Observable of tweets with emotion analysis >
|> Observable.**scan**(fun sm (key,count) ->
    match sm |> Map.tryFind key with
    | Some(v) -> sm |> Map.add key (v + count)
    | None -> sm) emotionMap
```

The `scan` function here takes three arguments: an observable that’s passed conceptually in the form of a stream of tweets with emotion analysis, an anonymous function that applies the underlying values of the observable to the accumulator, and an accumulator `emotionMap`. The result of the `scan` function is an updated accumulator that’s injected into the following iteration.
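The difference is easiest to see with plain numbers. This short C# sketch (System.Reactive assumed) contrasts LINQ’s `Aggregate`, which yields only the final accumulation, with Rx’s `Scan`, which notifies every intermediate accumulator, as figure 6.12 illustrates:

```
using System;
using System.Linq;
using System.Reactive.Linq;

class ScanVersusAggregate
{
    static void Main()
    {
        // Aggregate folds the whole sequence and yields one final value
        int total = Enumerable.Range(1, 4).Aggregate(0, (acc, x) => acc + x);
        Console.WriteLine(total);                 // prints 10

        // Scan pushes every intermediate accumulator to the observer
        Observable.Range(1, 4)
                  .Scan(0, (acc, x) => acc + x)
                  .Subscribe(Console.WriteLine);  // prints 1, 3, 6, 10
    }
}
```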
The initial accumulator state in the previous code is an empty F# `Map`, the immutable equivalent of the .NET generic dictionary (`System.Collections.Generic.Dictionary<K,V>`), where the key is one of the emotions and the value is the count of its related tweets. The `scan` accumulator function updates the entries of the collection with the newly evaluated types and returns the updated collection as the new accumulator.

The last operation in the pipeline is to run the `map` function used to transform the observables of the source into the representation of the total percentage of tweets analyzed by emotions:

```
|> Observable.map(fun sm ->
    let total = sm |> Seq.sumBy(fun v -> v.Value)
    sm |> Seq.map(fun k ->
        let percentageEmotion = ((float k.Value) * 100.) / (float total)
        let labelText = sprintf "%A - %.2f%%" (k.Key) percentageEmotion
        (labelText, percentageEmotion)))
```

The transformation function is executed once for each subscribed observer. The `map` function calculates the total number of tweets from the observable passed, which contains the value of the accumulator from the previous `scan` function:

```
sm |> Seq.sumBy(fun v -> v.Value)
```

The result is returned in a format that represents the percentage of each emotion from the map table received so far. The final observable is passed into a `LiveChart`, which renders the real-time updates. Now that the code is developed, you can use the `StartStreamAsync()` function to start the process of listening and receiving the tweets and have the observable notify subscribers:

```
LiveChart.Column(observableTweets, Name = sprintf "Tweet Emotions").ShowChart()
do stream.StartStreamAsync()
```

Much like the `Event` module in F#, the `Observable` module defines a set of combinators for using the `IObservable<T>` interface. The F# `Observable` module includes `add`, `filter`, `map`, `partition`, `merge`, `choose`, and `scan`. For more details, see appendix B. In the previous example, the observable functions `groupBy` and `selectMany` are part of the Rx framework. This illustrates the flexibility that F# provides, giving the developer options to mix and match tools to customize the best fit for the task.

### 6.4.1 SelectMany: the monadic bind operator

`SelectMany` is a powerful operator that corresponds to the `bind` (or `flatMap`) operator in other programming languages. This operator constructs one monadic value from another and has the generic monadic binding signature

```
M a -> (a -> M b) -> M b
```

where `M` represents any elevated type that behaves as a container. In the case of observables, it has this signature:

```
IObservable<'T> -> ('T -> IObservable<'R>) -> IObservable<'R>
```

In .NET, there are several types that match this signature, such as `IObservable`, `IEnumerable`, and `Task`. Monads ([bit.ly/2vDusZa](http://bit.ly/2vDusZa)), despite their reputation for complexity, can be thought of in simple terms: they are containers that encapsulate and abstract a given functionality with the objective of promoting composition between elevated types and avoiding side effects. Basically, when working with monads, you can think of working with boxes (containers) that are unpacked at the last moment, when they’re needed. The main purpose of monadic computation is to make composition possible where it couldn’t be achieved otherwise.
For example, by using monads in C#, it’s possible to directly sum an integer and a `Task<int>`, the task-of-integer type from the `System.Threading.Tasks` namespace (highlighted in bold):

```
Task<int> result = **from** task **in** Task.Run<int>(() => 40)
                   **select** task + 2;
```

The `bind`, or `SelectMany`, operation takes an elevated type and applies a function to its underlying value, returning another elevated type. An *elevated type* is a wrapper around another type, like `IEnumerable<int>`, `Nullable<bool>`, or `IObservable<Tweets>`. The meaning of `bind` depends on the monad type. For `IObservable`, each event in the observable’s input is evaluated to create a new observable. The resulting observables are then flattened to produce the output observable, as shown in figure 6.13.

Figure 6.13 An elevated type can be considered a special container where it’s possible to apply a function directly to the underlying type (in this case, 40). The elevated type works like a wrapper that contains a value, which can be extracted to apply a given function, after which the result is put back into the container.

The `SelectMany` binder not only *flattens* data values but, as an operator, it also transforms and then flattens the nested monadic values. The underlying theory of monads is used by LINQ, which is used by the .NET compiler to interpret the `SelectMany` pattern to apply the monadic behavior. For example, by implementing the `SelectMany` extension method over the `Task` type (as highlighted in bold in the following code snippet), the compiler recognizes the pattern and interprets it as the monadic binding, allowing the special composition:

```
**Task**<R> **SelectMany**<T, R>(this **Task**<T> source, Func<T, **Task**<R>> selector) =>
    source.ContinueWith(t => selector(t.Result)).Unwrap();
```

With this method in place, the previous LINQ-based code will compile and evaluate to a `Task<int>` that returns 42. Monads play an important role in functional concurrency and are covered more thoroughly in chapter 7.

## 6.5 An Rx publisher-subscriber

The Publish/Subscribe pattern allows any number of publishers to communicate with any number of subscribers asynchronously via an event channel. In general, to accomplish this communication, an intermediary hub is employed to receive the notifications, which are then forwarded to subscribers. Using Rx, it becomes possible to effectively define a Publish/Subscribe pattern by using the built-in tools and concurrency model. The `Subject` type is a perfect candidate for this implementation. It implements the `ISubject` interface, which is the combination of `IObservable` and `IObserver`. This makes the `Subject` behave as both an observer and an observable, which allows it to operate like a broker that intercepts notifications as an observer and broadcasts them to all its observers. Think of the `IObserver` and the `IObservable` as consumer and publisher interfaces, respectively, as shown in figure 6.14.

Figure 6.14 The publisher-subscriber hub manages the communication between any number of subscribers (observers) with any number of publishers (observables). The hub, also known as a broker, receives the notifications from the publishers, which are then forwarded to the subscribers.

Using the `Subject` type from Rx to represent a Publish/Subscribe pattern has the advantage of giving you the control to inject extra logic, such as `merge` and `filter`, into the notification before it’s published.
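As a minimal sketch of that idea (System.Reactive assumed), the following C# snippet uses a `Subject<int>` as the broker: it subscribes to a publisher as an observer and rebroadcasts every notification to its own subscribers, with a filter injected before publication:

```
using System;
using System.Reactive.Linq;
using System.Reactive.Subjects;

class SubjectBrokerSketch
{
    static void Main()
    {
        var hub = new Subject<int>();

        // Subscribers attach to the hub; extra logic (a filter) is injected
        hub.Where(n => n % 2 == 0).Subscribe(n => Console.WriteLine($"even: {n}"));
        hub.Subscribe(n => Console.WriteLine($"all:  {n}"));

        // A publisher: the hub subscribes to it as an observer and
        // rebroadcasts each notification to its own subscribers
        Observable.Range(1, 4).Subscribe(hub);
    }
}
```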
### 6.5.1 Using the Subject type for a powerful publisher-subscriber hub

`Subject`s are the components of Rx whose intention is to synchronize the values produced by an observable and the observers that consume them. `Subject`s don’t completely embrace the functional paradigm because they maintain or manage state that could potentially mutate. Despite this fact, however, they’re useful for creating an event-like observable as a field, which is a perfect fit for a Publish/Subscribe pattern implementation. The `Subject` type implements the `ISubject` interface (highlighted in bold in the following code snippet), which resides in the `System.Reactive.Subjects` namespace

```
interface **ISubject**<T, R> : **IObserver**<T>, **IObservable**<R> { }
```

or `ISubject<T>`, if the source and result are of the same type. Because a `Subject<T>` and, consequently, `ISubject<T>` are observers, they expose the `OnNext`, `OnCompleted`, and `OnError` methods. Therefore, when they’re called, the same methods are called on all the subscribed observers.

Rx out of the box has different implementations of the `Subject` class, each with a different behavior. In addition, if the existing `Subject`s don’t satisfy your needs, you can implement your own. The only requirement for implementing a custom subject class is to satisfy the `ISubject` interface. Here are the other `Subject` variants:

* `ReplaySubject` behaves like a normal `Subject`, but it stores all the messages received, making them available for current and future subscribers.
* `BehaviorSubject` always saves the latest available value, which makes it available for future subscribers.
* `AsyncSubject` represents an asynchronous operation that routes only the last notification received while waiting for the `OnComplete` message.

### 6.5.2 Rx in relation to concurrency

The Rx framework is based on a push model with support for multithreading. But it’s important to remember that Rx is single-threaded by default, and the parallel constructs that let you combine asynchronous sources must be enabled using Rx schedulers. One of the main reasons to introduce concurrency in Rx programming is to facilitate and manage offloading the payload of an event stream, which frees the current thread to perform a set of concurrent tasks, such as maintaining a responsive UI. Moreover, Rx lets you direct the flow of incoming messages to specific threads to achieve high-concurrency computation.

Rx is a system for querying event streams asynchronously, which requires a level of concurrency control. When multithreading is enabled, Rx programming increases the use of computing resources on multicore hardware, which improves the performance of computations. In this case, it’s possible for different messages to arrive from different execution contexts simultaneously. In fact, several asynchronous sources could be the output from separate and parallel computations, merging into the same `Observable` pipeline. In other words, observables and observers deal with asynchronous operations against a sequence of values in a push model. Ultimately, Rx handles all the complexity involved in managing access to these notifications, avoiding common concurrency problems as if they were running in a single thread. Using a `Subject` type (or any other observables from Rx), the code isn’t converted automatically to run faster or concurrently.
As a default, the operation that pushes the messages to multiple subscribers by a `Subject` is executed on the same thread. Moreover, the notification messages are sent to all subscribers sequentially, following their subscription order, and possibly blocking the operation until it completes. The Rx framework solves this limitation by exposing the `ObserveOn` and `SubscribeOn` methods, which let you register a `Scheduler` to handle concurrency.

Rx schedulers are designed to generate and process events concurrently, increasing responsiveness and scalability while reducing complexity. They provide an abstraction over the concurrency model, which lets you perform operations against a stream of moving data without being exposed directly to the underlying concurrent implementation. Moreover, Rx schedulers integrate support for task cancellation, error handling, and passing of state. All Rx schedulers implement the `IScheduler` interface, which can be found in the `System.Reactive.Concurrency` namespace.

The `SubscribeOn` method determines which `Scheduler` is enabled to queue messages that run on a different thread. The `ObserveOn` method determines which thread the callback function will be run in. This method targets the `Scheduler` that handles output messages and UI programming (for example, to update a WPF interface). `ObserveOn` is primarily used for UI programming and `SynchronizationContext` ([bit.ly/2wiVBxu](http://bit.ly/2wiVBxu)) interaction. In the case of UI programming, both the `SubscribeOn` and `ObserveOn` operators can be combined to better control which thread will run each step of your observable pipeline.

### 6.5.3 Implementing a reusable Rx publisher-subscriber

Armed with the knowledge of Rx and the `Subject` classes, it’s much easier to define a reusable generic `Pub-Sub` object that combines publication and subscription into the same source. In this section, you’ll first build a concurrent publisher-subscriber hub using the `Subject` type in Rx. Then you’ll refactor the previous Twitter emotion analyzer code example to exploit the new and simpler functionality provided by the Rx-based publisher-subscriber hub. The implementation of the reactive publisher-subscriber hub uses a `Subject` to subscribe and then route values to the observers, allowing multicasting of the notifications emitted by the sources to the observers. This listing shows the implementation of the `RxPubSub` class, which uses Rx to build the generic `Pub-Sub` object.
Listing 6.6 Reactive publisher-subscriber in C#

```
public class RxPubSub<T> : IDisposable
{
    private ISubject<T> subject;    ①
    private List<IObserver<T>> observers = new List<IObserver<T>>();    ②
    private List<IDisposable> observables = new List<IDisposable>();    ③

    public RxPubSub(ISubject<T> subject)
    {
        this.subject = subject;    ④
    }

    public RxPubSub() : this(new Subject<T>()) { }    ④

    public IDisposable Subscribe(IObserver<T> observer)
    {
        observers.Add(observer);
        subject.Subscribe(observer);
        return new ObserverHandler<T>(observer, observers);    ⑤
    }

    public IDisposable AddPublisher(IObservable<T> observable) =>
        observable.SubscribeOn(TaskPoolScheduler.Default).Subscribe(subject);    ⑥

    public IObservable<T> AsObservable() => subject.AsObservable();    ⑦

    public void Dispose()
    {
        observers.ForEach(x => x.OnCompleted());
        observers.Clear();    ⑧
    }
}

class ObserverHandler<T> : IDisposable    ⑨
{
    private IObserver<T> observer;
    private List<IObserver<T>> observers;

    public ObserverHandler(IObserver<T> observer, List<IObserver<T>> observers)
    {
        this.observer = observer;
        this.observers = observers;
    }

    public void Dispose()    ⑨
    {
        observer.OnCompleted();
        observers.Remove(observer);
    }
}
```

An instance of the `RxPubSub` class can be defined either through the constructor that accepts a specific `Subject` implementation or through the default constructor, which instantiates and passes a standard `Subject`. In addition to the private `Subject` field, there are two private collection fields: the `observers` collection and the subscribed `observables`. First, the `observers` collection maintains the state of the observers subscribed to the `Subject` through a new instance of the class `ObserverHandler`. This class provides the unsubscribe method `Dispose` through the interface `IDisposable`, which removes the specific observer when called. The second private collection is `observables`. It maintains a list of `IDisposable` interfaces, which originate from the registration of each observable by the `AddPublisher` method. Each observable can then be unregistered using the exposed `Dispose` method.

In this implementation, the `Subject` is subscribed on the `TaskPoolScheduler` scheduler:

```
observable.SubscribeOn(TaskPoolScheduler.Default)
```

`TaskPoolScheduler` schedules the units of work for each observer to run in a different thread using the currently provided `TaskFactory` ([bit.ly/2vaemTA](http://bit.ly/2vaemTA)). You can easily modify the code to accept any arbitrary scheduler.

The subscribed observables from the internal `Subject` are exposed through the `IObservable` interface, obtained by calling the method `AsObservable`. This observable is used to apply higher-order operations against event notifications:

```
public IObservable<T> AsObservable() => subject.**AsObservable**();
```

The reason to expose the `IObservable` interface on the `Subject` is to guarantee that no one can perform an upcast back to an `ISubject` and mess things up. `Subject`s are stateful components, so it’s good practice to isolate access to them through encapsulation; otherwise, `Subject`s could be reinitialized or updated directly.

### 6.5.4 Analyzing tweet emotions using an Rx Pub-Sub class

In listing 6.7, you’ll use the C# reactive `Pub-Sub` class (`RxPubSub`) to handle a stream of tweet emotions. The listing is another example of how simple it is to make the two programming languages C# and F# interoperable and allow them to coexist in the same solution.
From the F# library implemented in section 6.4, the observable that pushes a stream of tweet emotions is exposed so it’s easily subscribed to by external observers. (The observable commands are in bold.)

Listing 6.7 Implementing observable tweet emotions

```
let tweetEmotionObservable(throttle:TimeSpan) =
    **Observable.Create**(fun (observer:IObserver<_>) ->    ①
        let cred = new TwitterCredentials(consumerKey, consumerSecretKey,
                        accessToken, accessTokenSecret)
        let stream = Stream.CreateSampleStream(cred)
        stream.FilterLevel <- StreamFilterLevel.Low
        stream.StartStreamAsync() |> ignore

        stream.TweetReceived
        |> **Observable**.throttle(throttle)
        |> **Observable**.filter(fun args ->
            args.Tweet.Language = Language.English)
        |> **Observable**.groupBy(fun args ->
            evaluateEmotion args.Tweet.FullText)
        |> **Observable**.selectMany(fun args ->
            args |> Observable.map(fun tw ->
                TweetEmotion.Create tw.Tweet args.Key))
        |> Observable.subscribe(observer.OnNext)    ②
    )
```

The listing shows the implementation of `tweetEmotionObservable` using the observable `Create` factory operator. This operator accepts a function with an observer as its parameter, where the function behaves as an observable by calling its methods. The `Observable.Create` operator registers the observer passed into the function and starts to push notifications as they arrive. The observable is defined from the `subscribe` method, which pushes the notifications to the `observer` by calling the method `OnNext`. The following listing shows the equivalent C# implementation of `tweetEmotionObservable` (in bold).

Listing 6.8 Implementing `tweetEmotionObservable` in C#

```
**Observable.Create**<TweetEmotion>(observer =>
{
    var cred = new TwitterCredentials(**consumerKey, consumerSecretKey,
                    accessToken, accessTokenSecret**);
    var stream = Stream.CreateSampleStream(cred);
    stream.FilterLevel = StreamFilterLevel.Low;
    stream.StartStreamAsync();

    return **Observable**.FromEventPattern<TweetReceivedEventArgs>(stream,
            "TweetReceived")
        .Throttle(throttle)
        .Select(args => args.EventArgs)
        .Where(args => args.Tweet.Language == Language.English)
        .GroupBy(args => evaluateEmotion(args.Tweet.FullText))
        .SelectMany(args =>
            args.Select(tw => TweetEmotion.Create(tw.Tweet, args.Key)))
        .**Subscribe**(o => observer.OnNext(o));
});
```

The `FromEventPattern` method converts a .NET CLR event into an observable. In this case, it transforms the `TweetReceived` events into an `IObservable`. One difference between the C# and F# implementations is that the C# code must explicitly create the observable with `FromEventPattern`. In F#, the event handler `TweetReceived` automatically becomes an observable when passed into the pipeline `stream.TweetReceived |> Observable`.

`TweetEmotion` is a value type (structure) that carries the information of the tweet emotion (in bold).

Listing 6.9 `TweetEmotion` struct to maintain tweet details

```
[<Struct>]
type **TweetEmotion**(tweet:ITweet, emotion:Emotion) =
    member this.Tweet with get() = tweet
    member this.Emotion with get() = emotion

    static member Create tweet emotion = **TweetEmotion**(tweet, emotion)
```

This next listing shows the implementation of `RxTweetEmotion`, which inherits from the `RxPubSub` class and subscribes an `IObservable` to manage the tweet emotion notifications (in bold).
Listing 6.10 Implementing `RxPubSub<TweetEmotion>`

```
class RxTweetEmotion : RxPubSub<TweetEmotion>    ①
{
    public RxTweetEmotion(TimeSpan throttle)    ②
    {
        var obs = TweetsAnalysis.tweetEmotionObservable(throttle)
                    .**SubscribeOn**(**TaskPoolScheduler**.Default);    ③
        base.AddPublisher(obs);
    }
}
```

The class `RxTweetEmotion` creates the `tweetEmotionObservable` observable and registers it with the base class using the `AddPublisher` method through the `obs` observable, which bubbles up the notifications from the internal `TweetReceived` event. The next step, to accomplish something useful, is to register the observers.

### 6.5.5 Observers in action

The implementation of the `RxTweetEmotion` class is completed. But without subscribing any observers, there’s no way to notify or react to an event when it occurs. To create an implementation of the `IObserver` interface, you could create a class that inherits and implements each of its methods. Fortunately, Rx has a set of helper functions to make this job easier. The method `Observer.Create()` can define new observers:

```
IObserver<T> Create<T>(Action<T> onNext, Action<Exception> onError,
    Action onCompleted)
```

This method has a series of overloads, passing an arbitrary implementation of the `OnNext`, `OnError`, and `OnCompleted` methods and returning an `IObserver<T>` object that calls the provided functions. These Rx helper functions minimize the number of types created in a program as well as the unnecessary proliferation of classes. Here’s an example of an `IObserver` that prints only positive tweets to the console:

```
var tweetPositiveObserver = **Observer.Create**<TweetEmotion>(tweet =>
{
    if (tweet.Emotion.IsHappy)
        Console.WriteLine(tweet.Tweet.Text);
});
```

After creating the `tweetPositiveObserver` observer, its instance is registered with an instance of the previously implemented `RxTweetEmotion` class, which notifies each subscribed observer when a tweet with positive emotion is received:

```
var rxTweetEmotion = new RxTweetEmotion(TimeSpan.FromMilliseconds(150));
IDisposable posTweets = rxTweetEmotion.Subscribe(tweetPositiveObserver);
```

An instance of the `IDisposable` interface is returned for each observer subscribed. This interface can be used to stop the observer from receiving the notifications and to unregister (remove) the observer from the publisher by calling the `Dispose` method.

### 6.5.6 The convenient F# object expression

The F# object expression is a convenient way to implement on the fly an instance of an anonymous object that’s based on a known existing interface (or interfaces). Object expressions in F# work similarly to the `Observer.Create()` method but can be applied to any given interface. Additionally, the instance created by an object expression in F# can also be consumed by other .NET programming languages due to the supported interoperability. The following code shows how to use an object expression in F# to create an instance of `IObserver<TweetEmotion>` that displays only unhappy emotions in the console:

```
let printUnhappyTweets() =
    { new IObserver<TweetEmotion> with
        member this.OnNext(tweet) =
            if tweet.Emotion = Unhappy then
                Console.WriteLine(tweet.Tweet.Text)
        member this.OnCompleted() = ()
        member this.OnError(exn) = () }
```

The aim of object expressions is to avoid the extra code required to define and create new named types. The instance resulting from the previous object expression can be used in the C# project by referencing the F# library and importing the correlated namespace.
Here’s how you can use the F# object expression in the C# code:

```
IObserver<TweetEmotion> unhappyTweetObserver = printUnhappyTweets();
IDisposable disposable = rxTweetEmotion.Subscribe(unhappyTweetObserver);
```

An instance of the `unhappyTweetObserver` observer is defined using the F# object expression and is then subscribed to `rxTweetEmotion`, which is now ready to deliver notifications.

## Summary

* The reactive programming paradigm employs non-blocking asynchronous operations with a high rate of event sequences over time. This programming paradigm focuses on listening to and treating a series of events asynchronously as an event stream.
* Rx treats an event stream as a sequence of events. Rx lets you exercise the same expressive programming semantic as LINQ and apply higher-order operations such as `filter`, `map`, and `reduce` against events.
* Rx for .NET provides full support for multithreaded programming. In fact, Rx is capable of handling multiple events simultaneously, possibly in parallel. Moreover, it integrates with client programming, allowing GUI updates directly.
* The Rx schedulers are designed to generate and process events concurrently, increasing responsiveness and scalability, while also reducing complexity. The Rx schedulers provide an abstraction over the concurrency model, which lets you perform operations against moving data streams without the need to be exposed directly to the underlying concurrent implementation.
* The programming language F# treats events as first-class values, which means you can pass them around as data. This approach is the root that influences event combinators that let you program against events as a regular sequence.
* The special event combinators in F# can be exposed and consumed by other .NET programming languages, using this powerful programming style to simplify the traditional event-based programming model.
* Reactive programming excels at taking full advantage of asynchronous execution in the creation of components and composition of workflows. Furthermore, the inclusion of Rx capabilities to tame backpressure is crucial to avoid overuse or unbounded consumption of resources.
* Rx tames backpressure for continuous bursts of notifications, permitting you to control a high-rate stream of events that could potentially overwhelm consumers.
* Rx provides a set of tools for implementing useful reactive patterns, such as Publish/Subscribe.

# 7 Task-based functional parallelism

**This chapter covers**

* Task parallelism and declarative programming semantics
* Composing parallel operations with functional combinators
* Maximizing resource utilization with the Task Parallel Library
* Implementing a parallel functional pipeline pattern

The task parallelism paradigm splits program execution into parts that run in parallel, thereby reducing the total runtime. This paradigm targets the distribution of tasks across different processors to maximize processor utilization and improve performance. Traditionally, to run a program in parallel, code is separated into distinct areas of functionality and then computed by different threads. In these scenarios, primitive locks are used to synchronize the access to shared resources in the presence of multiple threads. The purpose of locks is to avoid race conditions and memory corruption by ensuring concurrent mutual exclusion. The main reason locks are used is the design legacy of waiting for the current thread to complete before a resource is available to continue running the thread.
A newer and better mechanism is to pass the rest of the computation to a callback function (which runs after the thread completes execution) to continue the work. This technique in FP is called *continuation-passing style (CPS)*. In this chapter, you’ll learn how to adopt this mechanism to run multiple tasks in parallel without blocking program execution. With this technique, you’ll also learn how to implement task-based parallel programs by isolating side effects and mastering function composition, which simplifies the achievement of task parallelism in your code. Because compositionality is one of the most important features in FP, it eases the adoption of a declarative programming style. Code that’s easy to understand is also simple to maintain. Using FP, you’ll engage task parallelism in your programs without introducing complexity, as compared to conventional programming.

## 7.1 A short introduction to task parallelism

*Task parallelism* refers to the process of running a set of independent tasks in parallel across several processors. This paradigm partitions a computation into a set of smaller tasks and executes those smaller tasks on multiple threads. The execution time is reduced by simultaneously processing multiple functions. In general, parallel jobs begin from the same point, with the same data, and can either terminate in a fire-and-forget fashion or complete altogether in a task-group continuation. Any time a computer program simultaneously evaluates different and autonomous expressions using the same starting data, you have task parallelism. The core of this concept is based on small units of computation called *futures*. Figure 7.1 shows the comparison between data parallelism and task parallelism.

Figure 7.1 Data parallelism is the simultaneous execution of the same function across the elements of a data set. Task parallelism is the simultaneous execution of multiple and different functions across the same or different data sets.

Task parallelism achieves its best performance by adjusting the number of running tasks, depending on the amount of parallelism available on your system, which corresponds to the number of available cores and, possibly, their current loads.

### 7.1.1 Why task parallelism and functional programming?

In the previous chapters, you’ve seen code examples that deal with data parallelism and task composition. Those data-parallel patterns, such as Divide and Conquer, Fork/Join, and MapReduce, aim to solve the computational problem of splitting a job into smaller, independent pieces computed in parallel. Ultimately, when the jobs are terminated, their outputs are combined into the final result. In real-world parallel programming, however, you commonly deal with different and more complex structures that aren’t so easily split and reduced. For example, the computation of a task that processes input data could rely on the result of other tasks. In this case, the design and approach to coordinating the work among multiple tasks is different than for the data parallelism model and can sometimes be challenging. This challenge is due to task dependencies, which can form convoluted connections with varying execution times, making the job distribution tough to manage. The purpose of task parallelism is to tackle these scenarios, providing you, the developer, with a toolkit of practices and patterns and, in the case of .NET, a rich library that simplifies task-based parallel programming.
In addition, FP eases the compositional aspect of tasks by controlling side effects and managing their dependencies in a declarative programming style. Functional paradigm tenets play an essential role in writing effective and deterministic task-based parallel programs. These functional concepts were discussed in the early chapters of this book. To summarize, here’s a list of recommendations for writing parallel code:

* Tasks should evaluate side-effect-free functions, which lead to referential transparency and deterministic code. Pure functions make the program more predictable because the functions always behave in the same way, regardless of the external state.
* Remember that pure functions can run in parallel because the order of execution is irrelevant.
* If side effects are required, control them locally by performing the computation in a function that runs in isolation.
* Avoid sharing data between tasks by applying a defensive copy approach.
* Use immutable structures when data sharing between tasks cannot be avoided.

### 7.1.2 Task parallelism support in .NET

Since its first release, the .NET Framework has supported the parallel execution of code through multithreading. Multithreaded programs are based on an independent execution unit called a *thread*, which is a lightweight process responsible for multitasking within a single application. (The `Thread` class can be found in the Base Class Library (BCL) `System.Threading` namespace.) Threads are handled by the CLR.

The creation of new threads is expensive in terms of overhead and memory. For example, the memory stack size associated with the creation of a thread is about 1 MB in an x86 architecture-based processor because it involves the stack, thread local storage, and context switches. Fortunately, the .NET Framework provides the `ThreadPool` class, which helps to overcome these performance problems. In fact, it’s capable of optimizing the costs associated with expensive operations, such as creating, starting, and destroying threads. Furthermore, the .NET `ThreadPool` is designed to reuse existing threads as much as possible to minimize the costs associated with the instantiation of new ones. Figure 7.2 compares the two processes. The `ThreadPool` class exposes the static method `QueueUserWorkItem`, which accepts a function (delegate) that represents an asynchronous operation.

Figure 7.2 If using conventional threads, you must create an instance of a new thread for each operation or task. This can create memory consumption issues. By contrast, if using a thread pool, you queue a task in a pool of work items, which are lightweight compared to threads. The thread pool then schedules these tasks, reusing the thread for the next work item and returning it back to the pool when the job is completed.

The following listing compares starting a thread in a traditional way versus starting a thread using the `ThreadPool.QueueUserWorkItem` static method.
Listing 7.1 Spawning threads and `ThreadPool.QueueUserWorkItem`

```
Action<string> downloadSite = url => {    ①
    var content = new WebClient().DownloadString(url);
    Console.WriteLine($"The size of the web site {url} is {content.Length}");
};

var threadA = new Thread(() => downloadSite("http://www.nasdaq.com"));
var threadB = new Thread(() => downloadSite("http://www.bbc.com"));
threadA.Start();
threadB.Start();    ②
threadA.Join();
threadB.Join();    ②

ThreadPool.QueueUserWorkItem(o => downloadSite("http://www.nasdaq.com"));
ThreadPool.QueueUserWorkItem(o => downloadSite("http://www.bbc.com"));    ③
```

Each thread is started explicitly, and the `Thread` class provides the instance method `Join` to wait for its completion. Each thread also creates an additional memory load, which is harmful to the runtime environment. Initiating an asynchronous computation using `ThreadPool`’s `QueueUserWorkItem` is simple, but there are a few constraints when using this technique that introduce serious complications in developing a task-based parallel system:

* No built-in notification mechanism when an asynchronous operation completes
* No easy way to get back a result from a `ThreadPool` worker
* No built-in mechanism to propagate exceptions to the original thread
* No easy way to coordinate dependent asynchronous operations

To overcome these limitations, Microsoft introduced the notion of tasks with the TPL, accessible through the `System.Threading.Tasks` namespace. The task concept is the recommended approach for building task-based parallel systems in .NET.

## 7.2 The .NET Task Parallel Library

The .NET TPL implements a number of extra optimizations on top of `ThreadPool`, including a sophisticated `TaskScheduler` work-stealing algorithm ([mng.bz/j4K1](http://mng.bz/j4K1)) to dynamically scale the degree of concurrency, as shown in figure 7.3. This algorithm guarantees an effective use of the available system processor resources to maximize the overall performance of the concurrent code.

Figure 7.3 The TPL uses the work-stealing algorithm to optimize the scheduler. Initially, the TPL sends jobs to the main queue (step 1). Then it dispatches the work items to one of the worker threads, which has a private and dedicated queue of work items to process (step 2). If the main queue is empty, the workers look in the private queues of the other workers to "steal" work (step 3).

With the introduction of the task concept in place of the traditional and limited thread model, the Microsoft TPL eases the process of adding concurrency and parallelism to a program with a set of new types. Furthermore, the TPL provides support through the `Task` object to cancel and manage state, to handle and propagate exceptions, and to control the execution of working threads. The TPL abstracts away the implementation details from the developer, offering control over executing the code in parallel. When using a task-based programming model, it becomes almost effortless to introduce parallelism in a program and concurrently execute parts of the code by converting those parts into tasks. You have several ways to invoke parallel tasks. This chapter reviews the relevant techniques to implement task parallelism.

### 7.2.1 Running operations in parallel with TPL Parallel.Invoke

Using the .NET TPL, you can schedule a task in several ways, the `Parallel.Invoke` method being the simplest. This method accepts an arbitrary number of actions (delegates) as an argument in the form of a `ParamArray` and creates a task for each of the delegates passed.
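Before the image-processing example that follows, here’s a minimal C# sketch of the method in action, reusing the `downloadSite` delegate idea from listing 7.1 (a sketch, not the book’s sample code):

```
using System;
using System.Net;
using System.Threading.Tasks;

class ParallelInvokeSketch
{
    static void Main()
    {
        Action<string> downloadSite = url =>
        {
            var content = new WebClient().DownloadString(url);
            Console.WriteLine($"The size of the web site {url} is {content.Length}");
        };

        // Each action becomes a task; the call blocks until all complete
        Parallel.Invoke(
            () => downloadSite("http://www.nasdaq.com"),
            () => downloadSite("http://www.bbc.com"),
            () => Console.WriteLine("some other independent work"));

        // Control reaches this point only after all three actions finish
    }
}
```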
Unfortunately, the action-delegate signature has no input arguments, and it returns `void`, which is contrary to functional principles. In imperative programming languages, functions returning `void` are used for side effects. When all the tasks terminate, the `Parallel.Invoke` method returns control to the main thread to continue the execution flow. One important distinction of the `Parallel.Invoke` method is that exception handling, synchronous invocation, and scheduling are handled transparently for the developer.

Let’s imagine a scenario where you need to execute a set of independent, heterogeneous tasks in parallel as a whole, then continue the work after all tasks complete. Unfortunately, PLINQ and parallel loops (discussed in the previous chapters) cannot be used because they don’t support heterogeneous operations. This is the typical case for using the `Parallel.Invoke` method. Listing 7.2 runs functions in parallel against three given images and then saves the result to the filesystem. Each function creates a local defensive copy of the original image to avoid unwanted mutation. The code example is in F#; the same concept applies to all .NET programming languages.

Listing 7.2 `Parallel.Invoke` executing multiple heterogeneous tasks

```
let convertImageTo3D (sourceImage:string) (destinationImage:string) =    ①
    let bitmap = Bitmap.FromFile(sourceImage) :?> Bitmap    ②
    let w,h = bitmap.Width, bitmap.Height
    for x in 20 .. (w-1) do
        for y in 0 .. (h-1) do    ③
            let c1 = bitmap.GetPixel(x,y)
            let c2 = bitmap.GetPixel(x - 20,y)
            let color3D = Color.FromArgb(int c1.R, int c2.G, int c2.B)
            bitmap.SetPixel(x - 20 ,y,color3D)
    bitmap.Save(destinationImage, ImageFormat.Jpeg)    ④

let setGrayscale (sourceImage:string) (destinationImage:string) =    ⑤
    let bitmap = Bitmap.FromFile(sourceImage) :?> Bitmap    ②
    let w,h = bitmap.Width, bitmap.Height
    for x = 0 to (w-1) do
        for y = 0 to (h-1) do    ③
            let c = bitmap.GetPixel(x,y)
            let gray = int(0.299 * float c.R + 0.587 * float c.G +
                           0.114 * float c.B)
            bitmap.SetPixel(x,y, Color.FromArgb(gray, gray, gray))
    bitmap.Save(destinationImage, ImageFormat.Jpeg)    ④

let setRedscale (sourceImage:string) (destinationImage:string) =    ⑥
    let bitmap = Bitmap.FromFile(sourceImage) :?> Bitmap    ②
    let w,h = bitmap.Width, bitmap.Height
    for x = 0 to (w-1) do
        for y = 0 to (h-1) do    ③
            let c = bitmap.GetPixel(x,y)
            bitmap.SetPixel(x,y, Color.FromArgb(int c.R,
                abs(int c.G - 255), abs(int c.B - 255)))
    bitmap.Save(destinationImage, ImageFormat.Jpeg)    ④

System.Threading.Tasks.**Parallel.Invoke**(
    Action(fun ()-> **convertImageTo3D** "MonaLisa.jpg" "MonaLisa3D.jpg"),
    Action(fun ()-> **setGrayscale** "LadyErmine.jpg" "LadyErmineRed.jpg"),
    Action(fun ()-> **setRedscale** "GinevraBenci.jpg" "GinevraBenciGray.jpg"))
```

In the code, `Parallel.Invoke` creates and starts the three tasks independently, one for each function, and blocks the execution flow of the main thread until all tasks complete. Due to the parallelism achieved, the total execution time coincides with the time to compute the slowest method. It’s interesting to notice that the `Parallel.Invoke` method could be used to implement a Fork/Join pattern, where multiple operations run in parallel and then join when they’re all completed. Figure 7.4 shows the images before and after the image processing.

Figure 7.4 The resulting images from running the code in listing 7.2. You can find the full implementation in the downloadable source code.
Despite the convenience of executing multiple tasks in parallel, `Parallel.Invoke` limits the control of the parallel operation because of the `void` signature type. This method doesn’t expose any resources to provide details regarding the status and outcome, whether succeeded or failed, of each individual task. `Parallel.Invoke` can either complete successfully or throw an exception in the form of an `AggregateException` instance. In the latter case, any exception that occurs during the execution is postponed and rethrown when all tasks have completed. In FP, exceptions are side effects that should be avoided. Therefore, FP provides a better mechanism to handle errors, a subject that will be covered in chapter 11. Ultimately, there are two important limitations to consider when using the `Parallel.Invoke` method:

* The signature of the method returns `void`, which prevents compositionality.
* The order of task execution isn’t guaranteed, which constrains the design of computations that have dependencies.

## 7.3 The problem of void in C#

It’s common, in imperative programming languages such as C#, to define methods and delegates that don’t return values (`void`), such as the `Parallel.Invoke` method. This method’s signature prevents compositionality. Two functions can compose when the output of one function matches the input of the other function. In function-first programming languages such as F#, every function has a return value, including the case of the `unit` type, which is comparable to a `void` but is treated as a value, conceptually not much different from a Boolean or integer. `unit` is the type of any expression that lacks any other specific value. Think of functions used for printing to the screen. There’s nothing specific that needs to be returned, and therefore such functions may return `unit` so that the code is still valid. This is the F# equivalent of C#’s `void`. The reason F# doesn’t use `void` is that every valid piece of code has a return type, whereas `void` is the absence of a return. Rather than the concept of `void`, a functional programmer thinks of `unit`. In F#, the `unit` type is written as `()`. This design enables function composition.

In principle, it isn’t required for a programming language to support methods with return values. But a method without a defined output (`void`) suggests that the function performs some side effect, which makes it difficult to run tasks in parallel.

### 7.3.1 The solution for void in C#: the unit type

In functional programming, a function defines a relationship between its input and output values. This is similar to the way mathematical theorems are written. For example, in the case of a pure function, the return value is only determined by its input values. In mathematics, every function returns a value. In FP, a function is a mapping, and a mapping has to have a value to map. This concept is missing in mainstream imperative programming languages such as C#, C++, and Java, which treat `void`s as methods that don’t return anything, instead of as functions that can return something meaningful. In C#, you can implement a `Unit` type as a `struct` with a single value that can be used as a return type in place of a `void`-returning method. Alternatively, Rx, discussed in chapter 6, provides a `Unit` type as part of its library. This listing shows the implementation of the `Unit` type in C#, which was borrowed from the Microsoft Rx ([bit.ly/2vEzMeM](http://bit.ly/2vEzMeM)).
Listing 7.3 `Unit` type implementation in C#

```
public struct Unit : IEquatable<Unit>    ①
{
    public static readonly Unit Default = new Unit();    ②

    public override int GetHashCode() => 0;    ③
    public override bool Equals(object obj) => obj is Unit;    ③
    public override string ToString() => "()";

    public bool Equals(Unit other) => true;    ④
    public static bool operator ==(Unit lhs, Unit rhs) => true;    ④
    public static bool operator !=(Unit lhs, Unit rhs) => false;    ④
}
```

The `Unit` struct implements the `IEquatable` interface in such a way that it forces all values of the `Unit` type to be equal. But what’s the real benefit of having the `Unit` type as a value in a language type system? What is its practical use? Here are two main answers:

* The type `Unit` can be used to publish an acknowledgment that a function is completed.
* Having a `Unit` type is useful for writing generic code, including where a generic first-class function is required, which reduces code duplication.

Using the `Unit` type, for example, you can avoid repeating code to implement overloads for `Action<T>` or `Func<T, R>`, or for functions that return a `Task` or a `Task<T>`. Let’s consider a function that runs a `Task<TInput>` and transforms the result of the computation into a `TResult` type:

```
TResult Compute<TInput, TResult>(Task<TInput> task,
    Func<TInput, TResult> projection) => projection(task.Result);

Task<int> task = Task.Run<int>(() => 42);
bool isTheAnswerOfLife = Compute(task, n => n == 42);
```

This function has two arguments. The first is a `Task<TInput>` that evaluates an expression. The result is passed into the second argument, a `Func<TInput, TResult>` delegate, to apply a transformation and then return the final value. How would you convert the `Compute` function into a function that prints the result? You’re forced to write a new function that replaces the `Func<TInput, TResult>` projection with an `Action<TInput>` delegate. The new method has this signature:

```
void Compute<TInput>(Task<TInput> task, Action<TInput> action) =>
    action(task.Result);

Task<int> task = Task.Run<int>(() => 42);
Compute(task, n => Console.WriteLine($"Is {n} the answer of life? {n == 42}"));
```

It’s also important to point out that the `Action` delegate performs a side effect, in this case printing the result to the console, even though the function is conceptually similar to the previous one. It would be ideal to reuse the same function instead of having to duplicate code for the version with the `Action` delegate as an argument. To do so, you’d need to pass a `void` into the `Func` delegate, which isn’t possible in C#. This is the case where the `Unit` type removes code repetition. By using the `struct Unit` type definition, you can use the same function that takes a `Func` delegate to also produce the same behavior as the function with the `Action` delegate type:

```
Task<int> task = Task.Run<int>(() => 42);
Unit unit = Compute(task, n => {
    Console.WriteLine($"Is {n} the answer of life? {n == 42}");
    return Unit.Default; });
```

In this way, by introducing the `Unit` type into the C# language, you can write one `Compute` function to handle both cases: returning a value or computing a side effect. Ultimately, a function returning a `Unit` type indicates the presence of side effects, which is meaningful information for writing concurrent code. Moreover, there are FP languages, such as Haskell, where the `Unit` type notifies the compiler, which then distinguishes between pure and impure functions to apply more granular optimization.
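The same trick removes the `Task` versus `Task<T>` duplication mentioned earlier. Here’s a hedged sketch (the `AsUnitTask` helper is illustrative, not a BCL method; it assumes the `Unit` struct from listing 7.3 is in scope):

```
// A plain Task (no result) is wrapped as a Task<Unit>, so generic code
// written once for Task<T> also covers void-returning asynchronous work.
using System;
using System.Threading.Tasks;

public static class TaskUnitExtensions
{
    public static async Task<Unit> AsUnitTask(this Task task)
    {
        await task;            // propagate completion and any exception
        return Unit.Default;   // surface a value so Task<T> code applies
    }
}

class Demo
{
    static TResult Compute<TInput, TResult>(Task<TInput> task,
        Func<TInput, TResult> projection) => projection(task.Result);

    static void Main()
    {
        Task sideEffect = Task.Run(() => Console.WriteLine("done"));
        // The void-returning task now flows through the same generic function
        bool completed = Compute(sideEffect.AsUnitTask(), _ => true);
        Console.WriteLine(completed);
    }
}
```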
## 7.4 Continuation-passing style: a functional control flow

Task continuation is based on the functional idea of the CPS paradigm, discussed in chapter 3. This approach gives you execution control, in the form of a continuation, by passing the result of the current function to the next one. Essentially, a continuation is a delegate that represents “what happens next.” CPS is an alternative to the conventional control flow of the imperative programming style, where commands execute one after another. With CPS, a function is passed as an argument into a method, explicitly defining the next operation to execute after its own computation completes. This lets you design your own flow-of-control commands.

### 7.4.1 Why exploit CPS?

The main benefit of applying CPS in a concurrent environment is avoiding inconvenient thread blocking that negatively impacts the performance of the program. For example, it’s inefficient for a method to wait for one or more tasks to complete, blocking the main execution thread until its child tasks finish. Often the parent task, in this case the main thread, could continue with other work, but it cannot proceed because its thread is still tied up executing one of the other tasks. The solution is CPS, which allows the thread to return to the caller immediately, without waiting on its children, while guaranteeing that the continuation is invoked when the child work completes. One downside of explicit CPS is that code complexity can escalate quickly, because CPS makes programs longer and less readable. You’ll see later in this chapter how to combat this issue by combining the TPL and functional paradigms to abstract the complexity behind the code, making it flexible and simple to use. CPS enables several helpful task capabilities:

* Function continuations can be composed as a chain of operations.
* A continuation can specify the conditions under which it is called.
* A continuation function can invoke a set of other continuations.
* A continuation function can be canceled easily at any time during the computation or even before it starts.

In the .NET Framework, a task is an abstraction of the classic (traditional) .NET thread ([`mng.bz/DK6K`](http://mng.bz/DK6K)), representing an independent asynchronous unit of work. The `Task` object is part of the `System.Threading.Tasks` namespace. The higher level of abstraction provided by the `Task` type aims to simplify the implementation of concurrent code and to facilitate control of the life cycle of each task operation. It’s possible, for example, to verify the status of the computation and confirm whether the operation has terminated, failed, or been canceled. Moreover, tasks are composable into a chain of operations by using continuations, which permits a declarative and fluent programming style. The following listing shows how to create and run operations using the `Task` type. The code uses the functions from listing 7.2.

Listing 7.4 Creating and starting tasks

```
Task monaLisaTask = Task.Factory.StartNew(() =>
    convertImageTo3D("MonaLisa.jpg", "MonaLisa3D.jpg"));        ①

Task ladyErmineTask = new Task(() =>
    setGrayscale("LadyErmine.jpg", "LadyErmine3D.jpg"));
ladyErmineTask.Start();                                         ②

Task ginevraBenciTask = Task.Run(() =>
    setRedscale("GinevraBenci.jpg", "GinevraBenci3D.jpg"));     ③
```

The code shows three different ways to create and execute a task:

* The first technique creates and immediately starts a new task using the built-in `Task.Factory.StartNew` method.
* The second technique creates a new instance of a task, which takes a function as a constructor parameter to serve as the body of the task. Then, when the `Start` instance method is called, the task begins the computation. This technique provides the flexibility to delay task execution until `Start` is called; in this way, the `Task` object can be passed into another method that decides when to schedule the task for execution.
* The third approach uses the static `Task.Run` method, which creates the `Task` object and immediately schedules it for execution. This is a convenient way to create and work with tasks using the default option values. The first two options are a better choice if you need a particular option to instantiate a task, such as setting the `LongRunning` option.

In general, tasks promote a natural way to isolate units of work that communicate through their input and output values, as shown in the conceptual example in figure 7.5.



Figure 7.5 When two tasks are composed together, the output of the first task becomes the input for the second. This is the same as function composition.

### 7.4.2 Waiting for a task to complete: the continuation model

You’ve seen how to use tasks to parallelize independent units of work. But in common cases the structure of the code is more complex than launching operations in a fire-and-forget manner. The majority of task-based parallel computations require a more sophisticated level of coordination between concurrent operations, where the order of execution can be influenced by the underlying algorithms and control flow of the program. Fortunately, the .NET TPL provides mechanisms for coordinating tasks. Let’s start with an example of multiple operations running sequentially, and incrementally redesign and refactor the program to improve its compositionality and performance. You’ll start with the sequential implementation, and then apply different techniques incrementally to improve and maximize the overall computational performance. Listing 7.5 implements a face-detection program that can detect specific faces in a given image. For this example, you’ll use the images of the presidents of the United States on $20, $50, and $100 bills, using the side on which the president’s image is printed. The program detects the face of the president in each image and returns a new image with a square box surrounding the detected face. In this example, focus on the important code without being distracted by the details of the UI implementation. The full source code is downloadable from the book’s website.

Listing 7.5 Face-detection function in C#

```
Bitmap DetectFaces(string fileName)
{
    var imageFrame = new Image<Bgr, byte>(fileName);                      ①
    var cascadeClassifier = new CascadeClassifier();                      ②
    var grayframe = imageFrame.Convert<Gray, byte>();                     ①
    var faces = cascadeClassifier.DetectMultiScale(
        grayframe, 1.1, 3, System.Drawing.Size.Empty);                    ③
    foreach (var face in faces)
        imageFrame.Draw(face, new Bgr(System.Drawing.Color.BurlyWood), 3); ④
    return imageFrame.ToBitmap();
}

void StartFaceDetection(string imagesFolder)
{
    var filePaths = Directory.GetFiles(imagesFolder);
    foreach (string filePath in filePaths)
    {
        var bitmap = DetectFaces(filePath);
        var bitmapImage = bitmap.ToBitmapImage();
        Images.Add(bitmapImage);                                          ⑤
    }
}
```

The function `DetectFaces` loads an image from the filesystem using the given filename path and then detects the presence of any faces.
The library `Emgu.CV` is responsible for performing the face detection. `Emgu.CV` is a .NET wrapper that permits interoperability with programming languages such as C# and F#, both of which can call the functions of the underlying Intel OpenCV image-processing library.^(1) The function `StartFaceDetection` initiates the execution, getting the filesystem paths of the images to evaluate, and then sequentially performs the face detection in a `for-each` loop by calling the function `DetectFaces`. The result is a new `BitmapImage`, which is added to the observable collection `Images` to update the UI. Figure 7.6 shows the expected result: the detected faces are highlighted in a box.



Figure 7.6 Result of the face-detection process. The right side has the images with the detected face surrounded by a box frame.

The first step in improving the performance of the program is to run the face-detection function in parallel, creating a new task for each image to evaluate.

Listing 7.6 Parallel-task implementation of the face-detection program

```
void StartFaceDetection(string imagesFolder)
{
    var filePaths = Directory.GetFiles(imagesFolder);
    var bitmaps = from filePath in filePaths
                  select Task.Run<Bitmap>(() => DetectFaces(filePath)); ①
    foreach (var bitmap in bitmaps)
    {
        var bitmapImage = bitmap.Result;
        Images.Add(bitmapImage.ToBitmapImage());
    }
}
```

In this code, a LINQ expression creates an `IEnumerable` of `Task<Bitmap>`, constructed with the convenient `Task.Run` method. With a collection of tasks in place, the code starts an independent computation in the `for-each` loop; but the performance of the program isn’t improved. The problem is that the tasks still run sequentially, one by one: the loop processes one task at a time, awaiting its completion before continuing to the next task. The code isn’t running in parallel. You could argue that a different approach, such as using `Parallel.ForEach` or `Parallel.Invoke` to compute the `DetectFaces` function, could avoid the problem and guarantee parallelism; you’ll see shortly why this isn’t a good idea. Let’s adjust the design to fix the problem by analyzing the foundational issue. The `IEnumerable` of `Task<Bitmap>` generated by the LINQ expression is materialized during the execution of the `for-each` loop. During each iteration, a `Task<Bitmap>` is retrieved, but at this point the task isn’t completed; in fact, it hasn’t even started. The reason is that the `IEnumerable` collection is lazily evaluated, so each underlying task starts its computation at the last possible moment, during materialization. Consequently, when the result of the task inside the loop is accessed through the `Task<Bitmap>.Result` property, the task blocks the joining thread until it is done. Execution resumes only after the task terminates the computation and returns the result. To write scalable software, you can’t have blocked threads. In the previous code, when the task’s `Result` property is accessed before the task has finished running, the calling thread blocks, and the thread pool will most likely create a new thread to compensate. This increases resource consumption and hurts performance. After this analysis, it appears that two issues must be corrected to ensure parallelism (figure 7.7):

* Ensure that the tasks run in parallel.
* Avoid blocking the main working thread while waiting for each task to complete.



Figure 7.7 The images are sent to the task scheduler, becoming work items to be processed (step 1).
Work item 3 and work item 1 are then “stolen” by worker 1 and worker 2, respectively (step 2). Worker 1 completes the work and notifies the task scheduler, which schedules the rest of the work for continuation in the form of the new work item 4, the continuation of work item 3 (step 3). When work item 4 is processed, the result updates the UI (step 4).

Here is how to fix these issues so the code runs in parallel and reduces memory consumption.

Listing 7.7 Correct parallel-task implementation of the `DetectFaces` function

```
ThreadLocal<CascadeClassifier> CascadeClassifierThreadLocal =
    new ThreadLocal<CascadeClassifier>(() => new CascadeClassifier());   ①

Bitmap DetectFaces(string fileName)
{
    var imageFrame = new Image<Bgr, byte>(fileName);
    var cascadeClassifier = CascadeClassifierThreadLocal.Value;
    var grayframe = imageFrame.Convert<Gray, byte>();
    var faces = cascadeClassifier.DetectMultiScale(grayframe, 1.1, 3,
        System.Drawing.Size.Empty);
    foreach (var face in faces)
        imageFrame.Draw(face, new Bgr(System.Drawing.Color.BurlyWood), 3);
    return imageFrame.ToBitmap();
}

void StartFaceDetection(string imagesFolder)
{
    var filePaths = Directory.GetFiles(imagesFolder);
    var bitmapTasks =
        (from filePath in filePaths
         select Task.Run<Bitmap>(() => DetectFaces(filePath))).ToList(); ②

    foreach (var bitmapTask in bitmapTasks)
        bitmapTask.ContinueWith(bitmap => {                              ③
            var bitmapImage = bitmap.Result;
            Images.Add(bitmapImage.ToBitmapImage());
        }, TaskScheduler.FromCurrentSynchronizationContext());           ④
}
```

To keep the code structure simple, the example assumes that each computation completes successfully. Only a few code changes are needed, and the good news is that true parallel computation is achieved without blocking any threads (the work continues as a task continuation when each task completes). The main function `StartFaceDetection` guarantees that the tasks execute in parallel by materializing the LINQ expression immediately with a call to `ToList()` on the `IEnumerable` of `Task<Bitmap>`. Next, a `ThreadLocal` object is used to create a defensive copy of `CascadeClassifier` for each thread accessing the function `DetectFaces`. `CascadeClassifier` loads a local resource into memory and isn’t thread safe. To solve this thread-safety problem, a separate `CascadeClassifier` instance is created for each thread that runs the function; this is the purpose of the `ThreadLocal` object (discussed in detail in chapter 4). Then, in the function `StartFaceDetection`, the `for-each` loop iterates through the list of `Task<Bitmap>`, creating a continuation for each task instead of blocking execution until the task completes. Because `bitmapTask` is an asynchronous operation, there’s no guarantee that the task has finished executing before the `Result` property is accessed, so it’s good practice to access the result as part of a continuation created with `ContinueWith`. Defining a task continuation is similar to creating a regular task, but the function passed to `ContinueWith` takes an argument of type `Task<Bitmap>`. This argument represents the antecedent task, which can be used to inspect the status of the computation and branch accordingly. When the antecedent task completes, the continuation starts execution as a new task. The continuation runs in the captured synchronization context, `TaskScheduler.FromCurrentSynchronizationContext`, which automatically chooses the appropriate context to schedule the work on the relevant UI thread.
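To make the antecedent-inspection idea concrete, here’s a minimal hedged sketch (my own illustration, not from the book’s source; `LogError` is a hypothetical helper, and `bitmapTask`, `Images`, and `ToBitmapImage` come from the surrounding example) showing how a continuation can branch on the status of its antecedent, or run only under certain conditions via `TaskContinuationOptions`:

```
// Branch on the antecedent's status inside the continuation body.
bitmapTask.ContinueWith(antecedent =>
{
    if (antecedent.IsFaulted)
        Console.WriteLine($"Failed: {antecedent.Exception.InnerException.Message}");
    else if (antecedent.IsCanceled)
        Console.WriteLine("Canceled");
    else
        Images.Add(antecedent.Result.ToBitmapImage());
}, TaskScheduler.FromCurrentSynchronizationContext());

// Alternatively, schedule a continuation that runs only on failure.
bitmapTask.ContinueWith(
    antecedent => LogError(antecedent.Exception),
    TaskContinuationOptions.OnlyOnFaulted);
```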
As previously mentioned, you could have used `Parallel.ForEach`, but that approach waits until all the operations have finished before continuing, blocking the main thread. Moreover, it makes it more complex to update the UI directly, because the operations run on different threads.

## 7.5 Strategies for composing task operations

Continuations are the real power of the TPL. It’s possible, for example, to execute multiple continuations for a single task and to create a chain of task continuations that maintain dependencies on each other. Moreover, with task continuations, the underlying scheduler can take full advantage of the work-stealing mechanism and optimize scheduling based on the resources available at runtime. Let’s use task continuation in the face-detection example. The final code runs in parallel, providing a boost in performance, but the program can be further optimized in terms of scalability. The function `DetectFaces` performs its series of operations sequentially, as one chain of computations. To improve resource use and overall performance, a better design is to split the work into separate tasks and task continuations, so each step of a `DetectFaces` operation can run on a different thread. Using task continuation, this change is simple. The following listing shows a new `DetectFaces` function, with each step of the face-detection algorithm running in a dedicated and independent task.

Listing 7.8 `DetectFaces` function using task continuation

```
Task<Bitmap> DetectFaces(string fileName)
{
    var imageTask = Task.Run<Image<Bgr, byte>>(
        () => new Image<Bgr, byte>(fileName)
    );
    var grayframeTask = imageTask.ContinueWith(           ①
        image => image.Result.Convert<Gray, byte>()
    );
    var facesTask = grayframeTask.ContinueWith(grayFrame =>   ①
    {
        var cascadeClassifier = CascadeClassifierThreadLocal.Value;
        return cascadeClassifier.DetectMultiScale(
            grayFrame.Result, 1.1, 3, System.Drawing.Size.Empty);
    });
    var bitmapTask = facesTask.ContinueWith(faces =>      ①
    {
        foreach (var face in faces.Result)
            imageTask.Result.Draw(
                face, new Bgr(System.Drawing.Color.BurlyWood), 3);
        return imageTask.Result.ToBitmap();
    });
    return bitmapTask;
}
```

The code works as expected. The execution time isn’t improved, but the program can potentially handle a larger number of images while maintaining lower resource consumption, thanks to the smart `TaskScheduler` optimization. On the other hand, the code has become cumbersome and hard to change. If you add error handling or cancellation support, for example, the code degenerates into spaghetti: hard to understand and maintain. It can be better. Composition is the key to controlling complexity in software. The objective is to apply a LINQ-style semantic to compose the functions that run the face-detection program, as shown here:

```
from image in Task.Run<Emgu.CV.Image<Bgr, byte>>()
from imageFrame in Task.Run<Emgu.CV.Image<Gray, byte>>()
from faces in Task.Run<System.Drawing.Rectangle[]>()
select faces;
```

This is an example of how mathematical patterns can help to exploit declarative compositional semantics.

### 7.5.1 Using mathematical patterns for better composition

Task continuation provides the support needed to enable task composition. How do you combine tasks?
In general, function composition takes two functions and injects the result of the first function into the input of the second, thereby forming one function. In chapter 2, you implemented this `Compose` function in C#:

```
Func<A, C> Compose<A, B, C>(this Func<A, B> f, Func<B, C> g) =>
    (n) => g(f(n));
```

Can you use this function to combine two tasks? Not directly, no. First, the return type of the compositional function should expose the task’s elevated type as follows:

```
Func<A, Task<C>> Compose<A, B, C>(this Func<A, Task<B>> f,
    Func<B, Task<C>> g) => (n) => g(f(n));
```

But there’s a problem: the code doesn’t compile. The return type of the function `f` doesn’t match the input of the function `g`: `f(n)` returns a `Task<B>`, which isn’t compatible with the type `B` expected by `g`. The solution is to implement a function that accesses the underlying value of the elevated type (in this case, the task) and then passes that value into the next function. This is a common pattern in FP, called the Monad pattern; it’s another design pattern, like the Decorator and Adapter patterns. This concept was introduced in section 6.4.1, but let’s analyze the idea further so you can apply it to improve the face-detection code. Monads are mathematical patterns that control the execution of side effects by encapsulating program logic, maintaining functional purity, and providing a powerful compositional tool to combine computations that work with elevated types. According to the monad definition, a monadic constructor requires two functions, `Bind` and `Return`, to be implemented.

#### The monadic operators, Bind and Return

`Bind` takes an instance of an elevated type, unwraps the underlying value, and then invokes the given function over the extracted value, returning a new elevated type. The function is executed in the future, when the value is needed. Here is the `Bind` signature using the `Task` object as the elevated type:

```
Task<R> Bind<T, R>(this Task<T> m, Func<T, Task<R>> k)
```

`Return` is an operator that wraps any type `T` into an instance of the elevated type. Following the example of the `Task` type, here’s the signature:

```
Task<T> Return(T value)
```

#### The monad laws

Ultimately, to define a correct monad, the `Bind` and `Return` operations need to satisfy the monad laws:

1. *Left identity*—Applying the `Bind` operation to a value wrapped by the `Return` operation and then passed into a function is the same as passing the value straight into the function:

```
Bind(Return(value), function) = function(value)
```

2. *Right identity*—Binding an elevated value to the `Return` operation yields the elevated value itself:

```
Bind(elevated-value, Return) = elevated-value
```

3. *Associativity*—Binding an elevated value to a function `f` and then binding the result to a second function `g` is the same as binding the value to the composition of `f` and `g`:

```
Bind(Bind(elevated-value, f), g) = Bind(elevated-value, x => Bind(f(x), g))
```

Now, using these monadic operations, you can fix the error in the previous `Compose` function to combine `Task` elevated types as shown here:

```
Func<A, Task<C>> Compose<A, B, C>(this Func<A, Task<B>> f,
    Func<B, Task<C>> g) => (n) => Bind(f(n), g);
```

Monads are powerful because they can represent arbitrary operations against elevated types.
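Before looking at the F# implementation in the next listing, here’s a minimal C# sketch of the two monadic operators for `Task` (my own condensed version built on the framework’s `ContinueWith` and `Unwrap` combinators, not the book’s implementation, which appears in listing 7.9):

```
using System;
using System.Threading.Tasks;

public static class TaskMonad
{
    // Return: wraps a plain value T into the elevated type Task<T>.
    public static Task<T> Return<T>(T value) => Task.FromResult(value);

    // Bind: feeds the result of the antecedent task into the next
    // Task-producing function; Unwrap flattens the resulting
    // Task<Task<R>> into a Task<R>.
    public static Task<R> Bind<T, R>(this Task<T> task, Func<T, Task<R>> binder) =>
        task.ContinueWith(t => binder(t.Result)).Unwrap();
}

// Usage: composing two Task-returning computations.
// Task<int> answer = TaskMonad.Return(41).Bind(n => TaskMonad.Return(n + 1));
```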
In the case of the `Task` elevated type, monads let you implement function combinators to compose asynchronous operations in many ways, as shown in figure 7.8.



Figure 7.8 The monadic `Bind` operator takes the elevated value `Task<int>(41)` that acts as a container (wrapper) for the value 41, and then applies the function `x => Task<int>(x + 1)`, where `x` is the unwrapped number 41. Basically, the `Bind` operator unwraps an elevated value (`Task<int>(41)`) and then applies a function (`x + 1`) to return a new elevated value (`Task<int>(42)`).

Surprisingly, these monadic operators are already built into the .NET Framework in the form of LINQ operators: the LINQ `SelectMany` definition corresponds directly to the monadic `Bind` function. Listing 7.9 shows both the `Bind` and `Return` operators applied to the `Task` type. The functions are then used to implement a LINQ-style semantic for composing asynchronous operations in a monadic fashion. The code is written in F# and consumed in C#, demonstrating once more the easy interoperability between these programming languages.

Listing 7.9 Task extension in F# to enable LINQ-style operators for tasks

```
[<Sealed; Extension; CompiledName("Task")>]
type TaskExtensions =
    // 'T -> M<'T>
    static member Return value : Task<'T> = Task.FromResult<'T>(value)    ①

    // M<'T> * ('T -> M<'U>) -> M<'U>
    static member Bind (input : Task<'T>, binder : 'T -> Task<'U>) =      ②
        let tcs = new TaskCompletionSource<'U>()                           ③
        input.ContinueWith(fun (task:Task<'T>) ->
            if task.IsFaulted then
                tcs.SetException(task.Exception.InnerExceptions)
            elif task.IsCanceled then tcs.SetCanceled()
            else
                try
                    (binder(task.Result)).ContinueWith(fun (nextTask:Task<'U>) ->
                        tcs.SetResult(nextTask.Result)) |> ignore          ④
                with ex -> tcs.SetException(ex)) |> ignore
        tcs.Task

    static member Select (task : Task<'T>, selector : 'T -> 'U) : Task<'U> =
        task.ContinueWith(fun (t:Task<'T>) -> selector(t.Result))

    static member SelectMany (input:Task<'T>, binder:'T -> Task<'I>,
                              projection:'T -> 'I -> 'R) : Task<'R> =
        TaskExtensions.Bind(input, fun outer ->
            TaskExtensions.Bind(binder(outer), fun inner ->
                TaskExtensions.Return(projection outer inner)))            ⑤

    static member SelectMany (input:Task<'T>, binder:'T -> Task<'R>) : Task<'R> =
        TaskExtensions.Bind(input, fun outer ->
            TaskExtensions.Bind(binder(outer), fun inner ->
                TaskExtensions.Return(inner)))                             ⑤
```

The implementation of the `Return` operation is straightforward, but the `Bind` operation is a little more complex. The `Bind` definition can be reused to create other LINQ-style combinators for tasks, such as `Select` and the two variants of the `SelectMany` operator. In the body of `Bind`, the `ContinueWith` function of the underlying task instance is used to extract the result of the input task’s computation. Then, to continue the work, the binder function is applied to that result. Ultimately, the output of the `nextTask` continuation is set as the result of the `tcs` `TaskCompletionSource`. The returned task belongs to the underlying `TaskCompletionSource`, which is introduced to initialize a task from any operation that starts and finishes in the future. The idea of the `TaskCompletionSource` is to create a task that can be governed and updated manually, to indicate when and how a given operation completes. The power of the `TaskCompletionSource` type lies in its capability to create tasks that don’t tie up threads.
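As an illustration of that capability, here’s a small hedged sketch (my own example, not from the book) that uses `TaskCompletionSource` to expose a timer callback as a `Task`, without dedicating any thread to waiting:

```
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskUtils
{
    // Creates a Task that completes after the given delay.
    // No thread is blocked while waiting: the timer callback
    // manually signals completion through the TaskCompletionSource.
    public static Task Delay(int milliseconds)
    {
        var tcs = new TaskCompletionSource<bool>();
        Timer timer = null;
        timer = new Timer(_ =>
        {
            timer.Dispose();          // release the timer resource
            tcs.TrySetResult(true);   // complete the task manually
        }, null, milliseconds, Timeout.Infinite);
        return tcs.Task;
    }
}

// Usage: TaskUtils.Delay(1000).ContinueWith(_ => Console.WriteLine("done"));
```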
#### Applying the monad pattern to task operations

With the LINQ `SelectMany` operations for tasks in place, you can rewrite the `DetectFaces` function using an expressive comprehension query.

Listing 7.10 `DetectFaces` using task continuation based on a LINQ expression

```
Task<Bitmap> DetectFaces(string fileName)
{
    Func<System.Drawing.Rectangle[], Image<Bgr, byte>, Bitmap> drawBoundries =
        (faces, image) =>
        {
            faces.ForAll(face => image.Draw(face,
                new Bgr(System.Drawing.Color.BurlyWood), 3));       ①
            return image.ToBitmap();
        };

    return from image in Task.Run(() => new Image<Bgr, byte>(fileName))
           from imageFrame in Task.Run(() => image.Convert<Gray, byte>())
           from bitmap in Task.Run(() =>
                   CascadeClassifierThreadLocal.Value.DetectMultiScale(
                       imageFrame, 1.1, 3, System.Drawing.Size.Empty))
               .Select(faces => drawBoundries(faces, image))
           select bitmap;                                           ②
}
```

This code shows the power of the monadic pattern, providing composition semantics over elevated types such as tasks. Moreover, the logic of the monadic operations is concentrated in the two operators `Bind` and `Return`, making the code maintainable and easy to debug. To add logging functionality or special error handling, for example, you only need to change one place in the code, which is convenient. In listing 7.10, the `Return` and `Bind` operators were exposed in F# and consumed in C#, a demonstration of the simple interoperability between the two programming languages. The source code for this book contains the implementation in C# as well. A beautiful composition of elevated types requires monads; the *continuation monad* shows how monads can readily express complex computations.

#### Using the hidden fmap functor pattern to apply transformation

One important function in FP is `Map`, which transforms values of one type into values of a different type. The signature of the `Map` function is

```
Map : (T -> R) -> [T] -> [R]
```

An example in C# is the LINQ `Select` operator, which is a map function for `IEnumerable` types:

```
IEnumerable<R> Select<T, R>(IEnumerable<T> en, Func<T, R> projection)
```

In FP, this concept is captured by the *functor* pattern, and the map function is conventionally called `fmap`. Functors are basically types that can be mapped over. In F#, there are many:

```
Seq.map    : ('a -> 'b) -> 'a seq -> 'b seq
List.map   : ('a -> 'b) -> 'a list -> 'b list
Array.map  : ('a -> 'b) -> 'a [] -> 'b []
Option.map : ('a -> 'b) -> 'a option -> 'b option
```

This mapping idea seems simple, but the complexity starts when you have to map over elevated types. This is when the functor pattern becomes useful. Think of a functor as a container that wraps an elevated type and offers a way to transform a normal function into one that operates on the contained values. In the case of the `Task` type, this is the signature:

```
fmap : ('T -> 'R) -> Task<'T> -> Task<'R>
```

This function has already been implemented for the `Task` type in the form of the `Select` operator, part of the LINQ-style operator set for tasks built in F#.
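In C#, that `fmap` signature translates into a one-line extension method. Here’s a hedged sketch (equivalent in spirit to the F# `Select` from listing 7.9, though this condensed version is my own):

```
using System;
using System.Threading.Tasks;

public static class TaskFunctor
{
    // fmap for Task: lifts a plain function T -> R into a function
    // over the elevated type, Task<T> -> Task<R>, via a continuation.
    public static Task<R> Select<T, R>(this Task<T> task, Func<T, R> selector) =>
        task.ContinueWith(t => selector(t.Result));
}

// Usage: mapping a Task<int> into a Task<string>.
// Task<string> text = Task.Run(() => 42).Select(n => $"answer = {n}");
```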
In the last LINQ expression of the function `DetectFaces`, the `Select` operator projects (maps) the input `Task<Rectangle[]>` into a `Task<Bitmap>`:

```
from image in Task.Run(() => new Image<Bgr, byte>(fileName))
from imageFrame in Task.Run(() => image.Convert<Gray, byte>())
from bitmap in Task.Run(() =>
        CascadeClassifierThreadLocal.Value.DetectMultiScale(
            imageFrame, 1.1, 3, System.Drawing.Size.Empty))
    .Select(faces => drawBoundries(faces, image))
select bitmap;
```

The concept of functors becomes useful when working with another functional pattern, *applicative functors*, which will be covered in chapter 10.

#### The abilities behind monads

Monads provide an elegant solution for composing elevated types. Monads aim to control functions with side effects, such as those that perform I/O operations, providing a mechanism to operate directly on the result of the I/O without having values from impure functions floating around the rest of your pure program. For this reason, monads are useful in designing and implementing concurrent applications.

### 7.5.2 Guidelines for using tasks

Here are several guidelines for using tasks:

* It’s good practice to use immutable types for return values. This makes it easier to ensure that your code is correct.
* It’s good practice to avoid tasks that produce side effects; instead, tasks should communicate with the rest of the program only through their returned values.
* It’s recommended that you use the task-continuation model to continue a computation, which avoids unnecessary blocking.

## 7.6 The parallel functional Pipeline pattern

In this section, you’re going to implement one of the most common coordination techniques: the Pipeline pattern. In general, a pipeline is composed of a series of computational steps, organized as a chain of stages, where each stage depends on the output of its predecessor and usually performs a transformation on the input data. You can think of the Pipeline pattern as an assembly line in a factory, where each item is constructed in stages. The entire chain is expressed as a single function, and a message queue executes that function each time new input is received. The message queue is non-blocking because it runs in a separate thread, so even if the stages of the pipeline take a while to execute, the sender of the input isn’t blocked from pushing more data into the chain. This pattern is similar to the Producer/Consumer pattern, where a producer manages one or more worker threads to generate data, and one or more consumers consume the data being created by the producer. Pipelines allow these stages to run in parallel. The implementation of the pipeline in this section follows a slightly different design compared to the traditional one shown in figure 7.9. In the traditional Pipeline pattern with serial stages, the speedup, measured in throughput, is limited by the throughput of the slowest stage: every item pushed into the pipeline must pass through that stage. The traditional Pipeline pattern cannot scale automatically with the number of cores; it’s limited by the number of stages. Only a linear pipeline, where the number of stages matches the number of available logical cores, can take full advantage of the computer’s power. In a computer with eight cores, a pipeline composed of four stages can use only half of the resources, leaving 50% of the cores idle. FP promotes composition, which is the concept the Pipeline pattern is based on.
In listing 7.11, the pipeline embraces this tenet by composing each step into a single function and then distributing the work in parallel, fully using the available resources. In an abstract way, each function acts as the continuation of the previous one, in the style of continuation passing. The code listing implementing the pipeline is in F#, then consumed in C#; in the downloadable source code, you can find the full implementation in both programming languages. Here, the `IPipeline` interface defines the functionality of the pipeline.



Figure 7.9 The traditional pipeline creates a buffer between each stage that works as a parallel Producer/Consumer pattern. There are almost as many buffers as there are stages. With this design, each work item to process is sent to the initial stage; the result is passed into the first buffer, which coordinates the work in parallel to push it into the second stage. This process continues until the end of the pipeline, when all the stages have been computed. By contrast, the functional parallel pipeline combines all the stages into one, as if composing multiple functions. Then, using a `Task` object, each work item is pushed through the combined steps to be processed in parallel, exploiting the TPL and its optimized scheduler.

Listing 7.11 `IPipeline` interface

```
[<Interface>]
type IPipeline<'a,'b> =                                               ①
    abstract member Then : Func<'b, 'c> -> IPipeline<'a,'c>           ②
    abstract member Enqueue : 'a * Func<('a * 'b), unit> -> unit      ③
    abstract member Execute : int * CancellationToken -> IDisposable  ④
    abstract member Stop : unit -> unit                               ⑤
```

The function `Then` is the core of the pipeline: the input function is composed with the previous one, applying a transformation. This function returns a new instance of the pipeline, providing a convenient and fluent API for building the process. The `Enqueue` function is responsible for pushing work items into the pipeline for processing. It takes a callback as an argument, which is applied at the end of the pipeline to further process each final result. This design gives you the flexibility to apply any arbitrary function to each item pushed. The `Execute` function starts the computation. Its input arguments set the size of the internal buffer and a cancellation token to stop the pipeline on demand. This function returns an `IDisposable`, which stops the pipeline when disposed. Here is the full implementation of the pipeline.
Listing 7.12 Parallel functional pipeline pattern

```
[<Struct>]
type Continuation<'a, 'b>(input:'a, callback:Func<('a * 'b), unit>) =
    member this.Input with get() = input
    member this.Callback with get() = callback                        ①

type Pipeline<'a, 'b> private (func:Func<'a, 'b>) as this =
    let continuations =
        Array.init 3 (fun _ ->
            new BlockingCollection<Continuation<'a,'b>>(100))         ②

    let then' (nextFunction:Func<'b,'c>) =
        Pipeline(func.Compose(nextFunction)) :> IPipeline<_,_>        ③

    let enqueue (input:'a) (callback:Func<('a * 'b), unit>) =
        BlockingCollection<Continuation<_,_>>.AddToAny(
            continuations, Continuation(input, callback)) |> ignore   ④

    let stop() =
        for continuation in continuations do
            continuation.CompleteAdding()                             ⑤

    let execute blockingCollectionPoolSize
                (cancellationToken:CancellationToken) =
        cancellationToken.Register(Action(stop)) |> ignore            ⑥
        for i = 0 to blockingCollectionPoolSize - 1 do
            Task.Factory.StartNew((fun () ->                          ⑦
                while (not <| continuations.All(fun bc -> bc.IsCompleted)) &&
                      (not <| cancellationToken.IsCancellationRequested) do
                    let continuation = ref Unchecked.defaultof<Continuation<_,_>>
                    BlockingCollection.TakeFromAny(continuations, continuation)
                    |> ignore
                    let continuation = continuation.Value
                    continuation.Callback.Invoke(continuation.Input,
                        func.Invoke(continuation.Input))),
                cancellationToken,
                TaskCreationOptions.LongRunning,
                TaskScheduler.Default) |> ignore

    static member Create(func:Func<'a, 'b>) =
        Pipeline(func) :> IPipeline<_,_>                              ⑧

    interface IPipeline<'a, 'b> with
        member this.Then(nextFunction) = then' nextFunction
        member this.Enqueue(input, callback) = enqueue input callback
        member this.Stop() = stop()
        member this.Execute (blockingCollectionPoolSize, cancellationToken) =
            execute blockingCollectionPoolSize cancellationToken
            { new IDisposable with member self.Dispose() = stop() }
```

The `Continuation` structure is used internally to pass each item, together with its callback, through the pipeline functions. The implementation of the pipeline uses an internal buffer composed of an array of the concurrent collection `BlockingCollection<Continuation<'a,'b>>`, which ensures thread safety during the parallel computation of the items. The argument to this collection’s constructor specifies the maximum number of items to buffer at any given time; in this case, the value is `100` for each buffer. Each item pushed into the pipeline is added to this collection, to be processed in parallel in the future. The `Then` function composes the function argument `nextFunction` with the function `func` that was passed into the pipeline constructor. Note the use of the `Compose` function defined in chapter 2, listing 2.3, to combine the functions `func` and `nextFunction`:

```
Func<A, C> Compose<A, B, C>(this Func<A, B> f, Func<B, C> g) =>
    (n) => g(f(n));
```

When the pipeline starts the process, it applies the final composed function to each input value. The parallelism in the pipeline is achieved in the `Execute` function, which spawns one task for each `BlockingCollection` instantiated, guaranteeing a consumer for each buffer. The tasks are created with the `LongRunning` option to schedule a dedicated thread. The `BlockingCollection` concurrent collection allows thread-safe access to the stored items through the static methods `TakeFromAny` and `AddToAny`, which internally distribute the items and balance the workload among the running threads. This collection is used to manage the connection between the input and output of the pipeline, which behave as producer/consumer threads.
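As a standalone illustration of those two static methods, here’s a minimal hedged sketch (my own example, not from the book’s source) in which items are spread across several bounded collections and then taken from whichever one has an item available:

```
using System;
using System.Collections.Concurrent;

class BlockingCollectionDemo
{
    static void Main()
    {
        // Two bounded buffers, as in the pipeline's internal array.
        var buffers = new[] {
            new BlockingCollection<int>(100),
            new BlockingCollection<int>(100)
        };

        // AddToAny places each item into whichever collection can accept it.
        for (int i = 0; i < 10; i++)
            BlockingCollection<int>.AddToAny(buffers, i);

        // TakeFromAny removes an item from any collection that has one,
        // returning the index of the collection it was taken from.
        for (int i = 0; i < 10; i++)
        {
            int item;
            int index = BlockingCollection<int>.TakeFromAny(buffers, out item);
            Console.WriteLine($"took {item} from buffer {index}");
        }
    }
}
```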
The pipeline constructor is marked `private` to avoid direct instantiation; instead, the static method `Create` initializes a new instance of the pipeline. This facilitates a fluent API for building the pipeline. The design ultimately resembles a parallel Producer/Consumer pattern capable of managing concurrent communication between many producers and many consumers. The following listing uses the implemented pipeline to refactor the `DetectFaces` program from the previous section. In C#, a fluent API is a convenient way to express and compose the steps of the pipeline.

Listing 7.13 Refactored `DetectFaces` code using the parallel pipeline

```
var files = Directory.GetFiles(ImagesFolder);

var imagePipe =
    Pipeline<string, Image<Bgr, byte>>
    .Create(filePath => new Image<Bgr, byte>(filePath))
    .Then(image => Tuple.Create(image, image.Convert<Gray, byte>()))
    .Then(frames => Tuple.Create(frames.Item1,
        CascadeClassifierThreadLocal.Value.DetectMultiScale(
            frames.Item2, 1.1, 3, System.Drawing.Size.Empty)))
    .Then(faces =>
    {
        foreach (var face in faces.Item2)
            faces.Item1.Draw(face,
                new Bgr(System.Drawing.Color.BurlyWood), 3);
        return faces.Item1.ToBitmap();
    });                                                           ①

imagePipe.Execute(3, cancellationToken);                          ②

foreach (string fileName in files)
    imagePipe.Enqueue(fileName,
        (_, bitmapImage) => Images.Add(bitmapImage));             ③
```

By exploiting the pipeline you developed, the code structure changes considerably. The pipeline definition is elegant, and it can be used to construct the face-detection process with a nice, fluent API. Each function is composed step by step, and then the `Execute` function is called to start the pipeline (here with a pool size matching the three internal buffers). Because the underlying pipeline processing already runs in parallel, the loop that pushes the image file paths is sequential. The `Enqueue` function of the pipeline is non-blocking, so no performance penalties are involved. Later, when an image is returned from the computation, the callback passed into the `Enqueue` function receives the result and updates the UI. Table 7.1 shows a benchmark comparing the different approaches implemented.

Table 7.1 Benchmark processing 100 images on a computer with four logical cores and 16 GB of RAM. The results, expressed in seconds, represent the average of running each design three times.

| **Serial loop** | **Parallel** | **Parallel continuation** | **Parallel LINQ combination** | **Parallel pipeline** |
| --- | --- | --- | --- | --- |
| 68.57 | 22.89 | 19.73 | 20.43 | 17.59 |

The benchmark shows that, averaged over three runs of processing 100 images, the parallel pipeline design is the fastest. It’s also the most expressive and concise pattern.

## Summary

* Task-based parallel programs are designed with the functional paradigm in mind to guarantee more reliable and less vulnerable (or corrupt) code through functional properties such as immutability, isolation of side effects, and defensive copies. This makes it easier to ensure that your code is correct.
* The Microsoft TPL embraces functional paradigms in the form of a continuation-passing style, which allows a convenient way to chain a series of non-blocking operations.
* A method that returns `void` in C# is a strong signal that it may produce side effects, and a method with `void` as output doesn’t permit composing tasks using continuations.
* FP unmasks mathematical patterns to ease parallel task composition in a declarative and fluent programming style.
(The monad and functor patterns are hidden in LINQ.) The same patterns can be used to reveal monadic operations with tasks, exposing a LINQ-style semantic.

* A functional parallel pipeline is a pattern designed to compose a series of operations into one function, which is then applied concurrently to a sequence of input values queued for processing. Pipelines are often useful when the data elements are received from a real-time event stream.
* Task dependency is the Achilles’ heel of parallelism. Parallelism is restricted when two or more operations cannot run until other operations have completed. It’s essential to use tools and patterns that maximize parallelism as much as possible. A functional pipeline, CPS, and mathematical patterns such as the monad are the keys.

# 8 Task asynchronicity for the win

**This chapter covers**

* Understanding the Task-based Asynchronous Programming model (TAP)
* Performing numerous asynchronous operations in parallel
* Customizing asynchronous execution flow

Asynchronous programming has become a major topic of interest over the last several years. In the beginning, asynchronous programming was used primarily on the client side to deliver a responsive GUI and to convey a high-quality user experience to customers. To maintain a responsive GUI, asynchronous programming must keep communication with the backend consistent, and vice versa, or delays may be introduced into the response time. An example of this communication issue is when an application window appears to hang for a few seconds while background processing catches up with your commands. Companies must address increasing client demands and requests while analyzing data quickly. Using asynchronous programming on the application’s server side is the solution that allows the system to remain responsive, regardless of the number of requests. Moreover, from a business point of view, an Asynchronous Programming Model (APM) is beneficial. Companies have begun to realize that it’s less expensive to develop software designed with this model, because the number of servers required to satisfy requests is considerably reduced by using a non-blocking (asynchronous) I/O system compared to a system with blocking (synchronous) I/O operations. Keep in mind that scalability and asynchronicity are terms unrelated to speed or velocity. Don’t worry if these terms are unfamiliar; they’re covered in the following sections. Asynchronous programming is an essential addition to your skill set as a developer, because programming robust, responsive, and scalable programs is, and will continue to be, in high demand. This chapter will help you understand the performance semantics related to the APM and how to write scalable applications. By the end of this chapter, you’ll know how to use asynchronicity to process multiple I/O operations in parallel, regardless of the hardware resources available.

## 8.1 The Asynchronous Programming Model (APM)

The word *asynchronous* derives from the combination of the Greek words *asyn* (meaning “not with”) and *chronos* (meaning “time”), describing actions that don’t occur at the same time. In the context of running a program asynchronously, *asynchronous* refers to an operation that begins with a specific request, which may or may not succeed, and that completes at some point in the future. In general, asynchronous operations are executed independently from other processes, without waiting for the result, whereas synchronous operations wait to finish before moving on to another task.
Imagine yourself in a restaurant with only one server. The server comes to your table to take the order, goes to the kitchen to place the order, and then stays in the kitchen, waiting for the meal to be cooked and ready to serve! If the restaurant had only one table, this process would be fine, but what if there are numerous tables? In this case, the process would be slow, and you wouldn’t receive good service. One solution is to hire more servers, maybe one per table, which would increase the restaurant’s overhead due to the additional salaries and would be wildly inefficient. A more efficient and effective solution is to have the server deliver the order to the chef in the kitchen and then continue to serve other tables. When the chef has finished preparing the meal, the server receives a notification from the kitchen to pick up the food and deliver it to your table. In this way, the server can attend to multiple tables in a timely fashion. In computer programming, the same concept applies: an asynchronous operation starts executing, the caller continues to process other work while waiting for that operation to complete, and then execution resumes once the data has been received. Asynchronous programs don’t sit idly waiting for any one operation, such as requesting data from a web service or querying a database, to complete.

### 8.1.1 The value of asynchronous programming

Asynchronous programming is an excellent model to exploit every time you build a program that involves blocking I/O operations. In synchronous programming, when a method is called, the caller is blocked until the method completes its execution. With I/O operations, the time the caller must wait before control returns and the rest of the code can continue varies depending on the operation in progress. Applications often use a large number of external services, which perform operations that take a noticeable amount of time to execute. For this reason, it’s vital to program in an asynchronous way. In general, developers feel comfortable thinking sequentially: send a request or execute a method, wait for the response, and then process it. But a performant and scalable application cannot afford to wait synchronously for an action to complete. Furthermore, if an application joins the results of multiple operations, it’s necessary to perform all of these operations simultaneously for good performance. What happens if control never comes back to the caller because something went wrong during the I/O operation? If the caller never receives a return of control, the program could hang. Let’s consider a server-side, multiuser application: for example, a regular e-commerce website where, for each incoming request, the program has to make a database call. If the program is designed to run synchronously (figure 8.1), then one dedicated thread is committed to each incoming request. In this case, each additional database call blocks the thread that owns the incoming request while waiting for the database to respond with the result. During this time, the thread pool must create a new thread to satisfy each incoming request, which will also block while waiting for the database response. If the application receives a high volume of requests (hundreds or perhaps thousands) simultaneously, the system will become unresponsive while trying to create the many threads needed to handle the requests.
It will continue in this way until it reaches the thread-pool limit, now with the risk of running out of resources. These circumstances can lead to large memory consumption or, worse, failure of the system. When the thread-pool resources are exhausted, successive incoming requests are queued, waiting to be processed, which results in an unresponsive system. More importantly, when the database responses come back, the blocked threads are freed to continue processing the requests, which can provoke a high frequency of context switches, negatively impacting performance. Consequently, the client requests to the website slow down, the UI turns unresponsive, and ultimately your company loses potential customers and revenue.



Figure 8.1 Servers that handle incoming requests synchronously aren’t scalable.

Clearly, efficiency is a major reason to model operations asynchronously so that threads don’t need to wait for I/O operations to complete, allowing them to be reused by the scheduler to serve other incoming requests. When a thread that has been deployed for an asynchronous I/O operation is idle, perhaps waiting for a database response as in figure 8.1, the scheduler can send the thread back to the thread pool to engage in further work. When the database call completes, the scheduler notifies the thread pool to wake an available thread and send it on to continue the operation with the database result. In server-side programs, asynchronous programming lets you deal effectively with massive concurrent I/O operations by intelligently recycling resources during their idle time and by avoiding the creation of new resources (figure 8.2). This optimizes memory consumption and enhances performance. Users ask much from the modern applications they interact with. Modern applications must communicate with external resources, such as databases and web services, and work with disks or REST-based APIs to meet user demands. Also, today’s applications must retrieve and transform massive amounts of data, cooperate in cloud computations, and respond to notifications from parallel processes. To accommodate these complex interactions, the APM provides the ability to express computations without blocking executing threads, which improves availability (a reliable system) and throughput. The result is notably improved performance and scalability. This is particularly relevant on servers, where there can be a large amount of concurrent I/O-bound activity. In this case, the APM can handle many concurrent operations with low memory consumption, due to the small number of threads involved. Even when there aren’t many (thousands of) concurrent operations, the asynchronous approach is advantageous because it keeps the I/O-bound operations out of the .NET thread pool.



Figure 8.2 Asynchronous I/O operations can start several operations in parallel without constraints; each returns to the caller when complete, which keeps the system scalable.

By enabling asynchronous programming in your application, your code derives several benefits:

* Decoupled operations do a minimum amount of work in performance-critical paths.
* Increased thread-resource availability allows the system to reuse the same resources without the need to create new ones.
* Better employment of the thread-pool scheduler enables scalability in server-based programs.
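To make the contrast concrete, here’s a minimal hedged sketch (my own illustration; the method names are placeholders, and the URL stands for any remote resource) of a blocking query versus its non-blocking counterpart, where the thread is released back to the pool while the I/O is in flight:

```
using System;
using System.Net.Http;
using System.Threading.Tasks;

class QueryExample
{
    static readonly HttpClient client = new HttpClient();

    // Synchronous style: the calling thread is blocked for
    // the whole round-trip while the I/O is in flight.
    static string QueryBlocking(string url) =>
        client.GetStringAsync(url).Result;   // .Result blocks the thread

    // Asynchronous style: the thread returns to the pool while the
    // request is pending; a continuation resumes when the response arrives.
    static async Task<string> QueryNonBlockingAsync(string url) =>
        await client.GetStringAsync(url);
}
```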
### 8.1.2 Scalability and asynchronous programming

Scalability refers to a system’s ability to respond to an increased number of requests through the addition of resources, yielding a commensurate boost in parallel speedup. A system designed with this ability aims to continue performing well under a sustained, large number of incoming requests that can strain the application’s resources. Incremental scalability is achieved through different components: memory and CPU bandwidth, workload distribution, and quality of code, for example. If you design your application with the APM, it’s most likely scalable. Keep in mind that scalability isn’t about speed. In general, a scalable system doesn’t necessarily run faster than a non-scalable system. In fact, an asynchronous operation doesn’t perform any faster than the equivalent synchronous operation. The true benefit is in minimizing performance bottlenecks in the application and optimizing the consumption of resources, which allows other asynchronous operations to run in parallel and, ultimately, the application to perform faster. Scalability is vital in satisfying today’s increasing demands for instantaneous responsiveness. For example, in high-volume web applications, such as stock trading or social media, it’s essential that applications be both responsive and capable of concurrently managing a massive number of requests. Humans naturally think sequentially, evaluating one action at a time in consecutive order. For the sake of simplicity, programs have been written in this manner, one step following the other, which is clumsy and time-consuming. The need exists for a different model, the APM, that lets you write non-blocking applications that can run out of sequence, as required, with unbounded power.

### 8.1.3 CPU-bound and I/O-bound operations

In CPU-bound computations, methods require CPU cycles to do their work, typically with one thread running on each CPU core. In contrast, asynchronous I/O-bound computations are unrelated to the number of CPU cores. Figure 8.3 shows the comparison. As previously mentioned, when an asynchronous method is called, the execution thread returns immediately to the caller and continues execution of the current method, while the previously called function runs in the background, thereby preventing *blocking*. The terms *non-blocking* and *asynchronous* are commonly used interchangeably because both define similar concepts.



Figure 8.3 Comparison between CPU-bound and I/O-bound operations

*CPU-bound computations* are operations that spend time performing CPU-intensive work, using hardware resources to run all the operations. Therefore, as a ratio, it’s appropriate to have one thread for each CPU, where execution time is determined by the speed of each CPU. Conversely, with *I/O-bound computations*, the number of running threads is unrelated to the number of CPUs available, and execution time depends on the period spent waiting for the I/O operations to complete, bound only by the I/O drivers.

## 8.2 Unbounded parallelism with asynchronous programming

Asynchronous programming provides an effortless way to execute multiple tasks independently and, therefore, in parallel. You may be thinking of CPU-bound computations that can be parallelized using a task-based programming model (chapter 7). But what makes the APM special, as compared to CPU-bound computation, is its I/O-bound nature, which overcomes the hardware constraint of one working thread for each CPU core.
Asynchronous, non-CPU-bound computations can benefit from running a number of outstanding operations far larger than the number of threads on one CPU. It’s possible to perform hundreds or even thousands of I/O operations on a single-core machine, because it’s the nature of asynchronous programming to take advantage of parallelism to run I/O operations that can outnumber the available cores in a computer by an order of magnitude. You can do this because asynchronous I/O operations push the work to a different location without impacting local CPU resources, which are kept free, providing the opportunity to execute additional work on local threads. To demonstrate this unbounded power, listing 8.1 is an example of running 20 asynchronous operations. These operations can run in parallel, regardless of the number of available cores.

Listing 8.1 Parallel asynchronous computations

```
let httpAsync (url : string) = async {                     ①
    let req = WebRequest.Create(url)
    let! resp = req.AsyncGetResponse()
    use stream = resp.GetResponseStream()
    use reader = new StreamReader(stream)
    let! text = reader.ReadToEndAsync() |> Async.AwaitTask
    return text
}

let sites =                                                ②
    [ "http://www.live.com";      "http://www.fsharp.org";
      "http://news.live.com";     "http://www.digg.com";
      "http://www.yahoo.com";     "http://www.amazon.com";
      "http://news.yahoo.com";    "http://www.microsoft.com";
      "http://www.google.com";    "http://www.netflix.com";
      "http://news.google.com";   "http://www.maps.google.com";
      "http://www.bing.com";      "http://www.microsoft.com";
      "http://www.facebook.com";  "http://www.docs.google.com";
      "http://www.youtube.com";   "http://www.gmail.com";
      "http://www.reddit.com";    "http://www.twitter.com" ]

sites
|> Seq.map httpAsync                                       ③
|> Async.Parallel                                          ④
|> Async.RunSynchronously                                  ⑤
```

In this fully asynchronous implementation, the execution time is 1.546 seconds on a four-core machine. The same synchronous implementation runs in 11.230 seconds (the synchronous code is omitted, but you can find it in the source code companion of this book). Although the time varies according to network speed and bandwidth, the asynchronous code is about 7× faster than the synchronous code. In a CPU-bound operation running on a single-core device, there’s no performance improvement in simultaneously running two or more threads; doing so can actually decrease performance because of the extra overhead. This also applies to multicore processors when the number of running threads far exceeds the number of cores. Asynchronicity doesn’t increase CPU parallelism, but it does increase throughput and reduce the number of threads needed. Despite many attempts to make operating system threads cheap (low memory consumption and low overhead for their instantiation), their allocation produces a large memory stack, making them an unrealistic solution for problems that require numerous outstanding asynchronous operations. This was discussed in section 7.1.2.

## Asynchrony vs. parallelism

Parallelism is primarily about application performance, and it also facilitates CPU-intensive work on multiple threads, taking advantage of modern multicore computer architectures. Asynchrony is a superset of concurrency, focusing on I/O-bound rather than CPU-bound operations. Asynchronous programming addresses the issue of latency (anything that takes a long time to run).

## 8.3 Asynchronous support in .NET

The APM has been a part of the Microsoft .NET Framework since the beginning (v1.1).
## 8.3 Asynchronous support in .NET

The APM has been a part of the Microsoft .NET Framework since the beginning (v1.1). It offloads the work from the main execution thread to other working threads with the purpose of delivering better responsiveness and gaining scalability. The original asynchronous programming pattern consists of splitting a long-running function into two parts. One part is responsible for starting the asynchronous operation (`Begin`), and the other part is invoked when the operation completes (`End`). This code shows a synchronous (blocking) operation that reads from a file stream and then processes the generated byte array:

```
void ReadFileBlocking(string filePath, Action<byte[]> process)
{
    using (var fileStream = new FileStream(filePath, FileMode.Open,
                                FileAccess.Read, FileShare.Read))
    {
        byte[] buffer = new byte[fileStream.Length];
        int bytesRead = fileStream.Read(buffer, 0, buffer.Length);
        process(buffer);
    }
}
```

Transforming this code into an equivalent asynchronous (*non-blocking*) operation requires a notification in the form of a *callback* to continue the original call site (where the function is called) upon completion of the asynchronous I/O operation. In this case, the `Begin` function keeps the opportune state, as shown in the following listing. The state is then rehydrated (restored to its original representation) when the callback resumes.

Listing 8.2 Reading from the filesystem asynchronously

```
IAsyncResult ReadFileNoBlocking(string filePath, Action<byte[]> process)
{
    var fileStream = new FileStream(filePath, FileMode.Open,
                         FileAccess.Read, FileShare.Read, 0x1000,
                         FileOptions.Asynchronous);                      ①
    byte[] buffer = new byte[fileStream.Length];
    var state = Tuple.Create(buffer, fileStream, process);               ②
    return fileStream.BeginRead(buffer, 0, buffer.Length,
                                EndReadCallback, state);                 ③
}

void EndReadCallback(IAsyncResult ar)
{
    var state =
        ar.AsyncState as Tuple<byte[], FileStream, Action<byte[]>>;      ④
    using (state.Item2) state.Item2.EndRead(ar);                         ⑤
    state.Item3(state.Item1);                                            ⑤
}
```

Why is the asynchronous version of the operation that uses the Begin/End pattern not blocking? Because when the I/O operation starts, the thread in context is sent back to the thread pool to perform other useful work if needed. In .NET, the thread-pool scheduler is responsible for scheduling work to be executed on a pool of threads, managed by the CLR.

Writing APM programs is considered more difficult than writing the sequential version. An APM program requires more code, which is more complex and harder to read and write. The code can be even more convoluted if a series of asynchronous operations is chained together. In the next example, a series of asynchronous operations requires a notification to proceed with the work assigned. The notification is achieved through a callback. This chain of asynchronous operations produces a series of nested callbacks, also known as "callback hell" ([`callbackhell.com`](http://callbackhell.com)). Callback-based code is problematic because it forces the programmer to cede control, restricting expressiveness and, more importantly, eliminating compositionality.
This is a conceptual example of code that reads from a file stream, then compresses the data and sends it to the network:

```
IAsyncResult ReadFileNoBlocking(string filePath)
{
    // keep context and BeginRead
}
void EndReadCallback(IAsyncResult ar)
{
    // get Read and rehydrate state, then BeginWrite (compress)
}
void EndCompressCallback(IAsyncResult ar)
{
    // get Write and rehydrate state, then BeginWrite (send to the network)
}
void EndWriteCallback(IAsyncResult ar)
{
    // get Write and rehydrate state, completed process
}
```

How would you introduce more functionality to this process? The code isn't easy to maintain! How can you compose this series of asynchronous operations to avoid the callback hell? And where and how would you manage error handling and release resources? The solutions are complex! In general, the asynchronous Begin/End pattern is somewhat workable for a single call, but it fails miserably when composing a series of asynchronous operations. Later in this chapter I'll show how to conquer exceptions and cancellations such as these.

### 8.3.1 Asynchronous programming breaks the code structure

As you can see from the previous code, an issue originating from the traditional APM is the decoupled execution time between the start (`Begin`) of the operation and its callback notification (`End`). This broken-code design divides the operation in two, violating the imperative sequential structure of the program. Consequently, the operation continues and completes in a different scope and possibly in a different thread, making it hard to debug, difficult to handle exceptions, and impossible to manage transaction scopes.

In general, with the APM pattern, it's a challenge to maintain state between each asynchronous call. You're forced to pass a state into each continuation through the callback to continue the work. This requires a tailored state machine to handle the passing of state between each stage of the asynchronous pipeline. In the previous example, to maintain the state between `fileStream.BeginRead` and its callback `EndReadCallback`, a tailored `state` object was created to access the stream, the byte array buffer, and the process function:

```
var state = Tuple.Create(buffer, fileStream, process);
```

This `state` object was rehydrated when the operation completed to access the underlying objects and continue further work.

### 8.3.2 Event-based Asynchronous Programming

Microsoft recognized the intrinsic problems of the APM and consequently introduced (with .NET 2.0) an alternate pattern called Event-based Asynchronous Programming (EAP).^(1) The EAP model was the first attempt to address the issues with APM. The idea behind EAP is to register an event handler that notifies the asynchronous operation when a task completes. This event replaces the callback notification semantic. Because the event is raised on the correct thread and provides direct access to UI elements, EAP has several advantages. Additionally, it's built with support for progress reporting, canceling, and error handling—all occurring transparently for the developer. EAP provides a simpler model for asynchronous programming than APM, and it's based on the standard event mechanism in .NET, rather than requiring a custom class and callbacks. But it's still not ideal, because it continues to separate your code into method calls and event handlers, increasing the complexity of your program's logic.
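To illustrate the EAP style concretely, here's a minimal sketch (not one of the book's listings) using the classic `WebClient`, whose `DownloadStringAsync` method is paired with a `DownloadStringCompleted` event:

```csharp
using System;
using System.Net;

class EapExample
{
    static void DownloadHomePage()
    {
        var client = new WebClient();

        // The completion event replaces the APM callback; it's raised on
        // the captured context, with errors carried in the event args.
        client.DownloadStringCompleted += (sender, e) =>
        {
            if (e.Error != null)
                Console.WriteLine($"Failed: {e.Error.Message}");
            else
                Console.WriteLine($"Downloaded {e.Result.Length} characters");
        };

        client.DownloadStringAsync(new Uri("http://www.manning.com"));
    }
}
```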
## 8.4 C# Task-based Asynchronous Programming

Compared to its predecessor, the .NET APM, Task-based Asynchronous Programming (TAP) aims to simplify the implementation of asynchronous programs and to ease the composition of sequences of concurrent operations. The TAP model deprecates both APM and EAP, so if you're writing asynchronous code in C#, TAP is the recommended model. TAP presents a clean and declarative style for writing asynchronous code that looks similar to the F# asynchronous workflow by which it was inspired. The F# asynchronous workflow will be covered in detail in the next chapter.

In C# (since version 5.0), the objects `Task` and `Task<T>`, with the support of the keywords `async` and `await`, have become the main components for modeling asynchronous operations. The TAP model solves the callback problem at the syntactic level, bypassing the difficulties that arise in reasoning about the sequence of events expressed in callback-based code. Asynchronous functions in C# 5.0 address the issue of latency, which refers to anything that takes time to run. The idea is to compute an asynchronous method, returning a task (also called a *future*) that isolates and encapsulates a long-running operation that will complete at a point in the future, as shown in figure 8.4.

Figure 8.4 The task acts as a channel for the execution thread, which can continue working while the caller of the operation receives the handle to the task. When the operation completes, the task is notified and the underlying result can be accessed.

Here's the task flow from figure 8.4:

1. The I/O operation starts asynchronously in a separate execution thread. A new task instance is created to handle the operation.
2. The task created is returned to the caller. The task contains a callback, which acts as a channel between the caller and the asynchronous operation. This channel communicates when the operation completes.
3. The execution thread continues the operation while the main thread from the operation caller is available to process other work.
4. The operation completes asynchronously.
5. The task is notified, and the result is accessible by the caller of the operation.

The `Task` object returned from the `async/await` expression provides the details of the encapsulated computation and a reference to its result, which becomes available when the operation completes. These details include the status of the task, the result if completed, and exception information, if any. The .NET `Task` and `Task<T>` constructs were introduced in the previous chapter, specifically for CPU-bound computations. The same model, in combination with the `async/await` keywords, can be used for I/O-bound operations. In a nutshell, TAP consists of the following:

* The `Task` and `Task<T>` constructs to represent asynchronous operations
* The `await` keyword to wait for the task operation to complete asynchronously, while the current thread isn't blocked from performing other work

For example, given an operation to execute in a separate thread, you must wrap it into a `Task`:

```
Task<int[]> processDataTask = Task.Run(() => ProcessMyData(data));
// do other work
var result = processDataTask.Result;
```

The output of the computation is accessed through the `Result` property, which blocks the caller method until the task completes. For tasks that don't return a result, you could call the `Wait` method instead. But this isn't recommended.
To avoid blocking the caller thread, you can use the `async/await` keywords instead:

```
Task<int[]> processDataTask = Task.Run(() => ProcessMyData(data));
// do other work
var result = await processDataTask;
```

The `async` keyword notifies the compiler that the method runs asynchronously without blocking. By doing so, the calling thread is released to process other work. Once the task completes, an available worker thread resumes processing the work. Here's the previous code example converted to read a file stream asynchronously the TAP way:

```
async void ReadFileNoBlocking(string filePath, Action<byte[]> process)
{
    using (var fileStream = new FileStream(filePath, FileMode.Open,
                                FileAccess.Read, FileShare.Read, 0x1000,
                                FileOptions.Asynchronous))
    {
        byte[] buffer = new byte[fileStream.Length];
        int bytesRead = await fileStream.ReadAsync(buffer, 0, buffer.Length);
        await Task.Run(async () => process(buffer));
    }
}
```

The method `ReadFileNoBlocking` is marked `async`, the contextual keyword used to define an asynchronous function and to enable the use of the `await` keyword within a method. The purpose of the `await` construct is to inform the C# compiler to translate the code into a *continuation* of a task that won't block the current context thread, freeing the thread to do other work. Under the hood, the continuation is implemented using the `ContinueWith` function of the `Task` object, which is triggered when the asynchronous operation has completed. The advantage of having the compiler build the continuation is that it preserves the program structure and the asynchronous method calls, which are then executed without the need for callbacks or nested lambda expressions. This asynchronous code has clear semantics organized in a sequential flow.

In general, when a method marked as `async` is invoked, the execution flow runs synchronously until it reaches an `await`-able task, denoted with the `await` keyword, that hasn't yet completed. When the execution flow reaches the `await` keyword, it suspends the calling method and yields control back to its caller until the awaited task is complete; in this way, the execution thread isn't blocked. When the operation completes, its result is unwrapped and bound into a local variable, and then the flow continues with the remaining work. An interesting aspect of TAP is that it captures the current synchronization context and posts the continuation back to it, allowing direct UI updates without extra work.

### 8.4.1 Anonymous asynchronous lambdas

You may have noticed a curious occurrence in the previous code—an anonymous function was marked `async`:

```
await Task.Run(async () => process(buffer));
```

As you can see, in addition to ordinary named methods, anonymous methods can also be marked `async`. Here's an alternative syntax to create an asynchronous anonymous lambda:

```
Func<string, Task<byte[]>> downloadSiteIcon = async domain =>
{
    var response = await new HttpClient()
                             .GetAsync($"http://{domain}/favicon.ico");
    return await response.Content.ReadAsByteArrayAsync();
};
```

This is also called an *asynchronous lambda*,^(2) which is like any other lambda expression, only with the `async` modifier at the beginning to allow the use of the `await` keyword in its body. Asynchronous lambdas are useful when you want to pass a potentially long-running delegate into a method.
If the method accepts a `Func<Task>`, you can feed it an async lambda and get the benefits of asynchrony. Like any other lambda expression, it supports closures to capture variables, and the asynchronous operation starts only when the delegate is invoked. This feature provides an easy means for expressing asynchronous operations on the fly. Inside these asynchronous functions, the `await` expressions can wait for running tasks, which causes the rest of the asynchronous execution to be transparently enlisted as a continuation of the awaited task. In anonymous asynchronous lambdas, the same rules apply as in ordinary asynchronous methods. You can use them to keep code concise and to capture closures.

### 8.4.2 Task<T> is a monadic container

In the previous chapter, you saw that the `Task<T>` type can be thought of as a special wrapper, eventually delivering a value of type `T` if it succeeds. The `Task<T>` type is a monadic data structure, which means, among other things, that it can easily be composed with others. It's no surprise that the same concept also applies to the `Task<T>` type used in TAP. With this in mind, you can easily define the monadic operators `Bind` and `Return`. In particular, the `Bind` operator uses the continuation-passing approach of the underlying asynchronous operation to generate a flowing and compositional programming style. Here's their definition, including the *functor map* (or *fmap*) operator:

```
static Task<T> Return<T>(T value) => Task.FromResult(value);

static async Task<R> Bind<T, R>(this Task<T> task, Func<T, Task<R>> cont) =>
    await cont(await task.ConfigureAwait(false)).ConfigureAwait(false);

static async Task<R> Map<T, R>(this Task<T> task, Func<T, R> map) =>
    map(await task.ConfigureAwait(false));
```

The definitions of the functions `Map` and `Bind` are simple due to the use of the `await` keyword, as compared to the implementation for the CPU-bound `Task<T>` computations in the previous chapter. The `Return` function lifts a `T` into a `Task<T>` container. `ConfigureAwait`^(3) is a `Task` extension method that removes the current UI context. This is recommended to obtain better performance in cases where the code doesn't need to update or interact with the UI.

Now these operators can be exploited to compose a series of asynchronous computations as a chain of operations. The following listing downloads an icon image from a given domain and writes it asynchronously into the filesystem. The operators `Bind` and `Map` are applied to chain the asynchronous computations.

Listing 8.3 Downloading an image (icon) from the network asynchronously

```
async Task DownloadIconAsync(string domain, string fileDestination)
{
    using (FileStream stream = new FileStream(fileDestination,
               FileMode.Create, FileAccess.Write, FileShare.Write, 0x1000,
               FileOptions.Asynchronous))
        await new HttpClient()
            .GetAsync($"http://{domain}/favicon.ico")
            .Bind(async content =>                                    ①
                  await content.Content.ReadAsByteArrayAsync())
            .Map(bytes => Image.FromStream(new MemoryStream(bytes)))  ②
            .Tap(async image =>                                       ③
                  await SaveImageAsync(fileDestination,
                                       ImageFormat.Jpeg, image));
}
```

In this code, the method `DownloadIconAsync` uses an instance of the `HttpClient` object to asynchronously obtain the `HttpResponseMessage` by calling the `GetAsync` method. The purpose of the response message is to read the HTTP content (in this case, the image) as a byte array.
The data is read by the `Task.Bind` operator and then converted into an image using the `Task.Map` operator. The function `Task.Tap` (also known as the *k-combinator*) is used to facilitate a pipeline construct that causes a side effect with a given input and returns the original value. Here's the implementation of the `Task.Tap` function:

```
static async Task<T> Tap<T>(this Task<T> task, Func<T, Task> action)
{
    await action(await task);
    return await task;
}
```

The `Tap` operator is extremely useful for bridging void functions (such as logging or writing a file or an HTML page) into your composition without having to create additional code. It does this by passing itself into a function and returning itself. `Tap` unwraps the underlying elevated type, applies an action to produce a side effect, and then wraps the original value up again and returns it. Here, the side effect is to persist the image into the filesystem. The function `Tap` can be used for other side effects as well.

At this point, these monadic operators can be used to define the LINQ pattern implementing `Select` and `SelectMany`, similar to the `Task` type in the previous chapter, and to enable LINQ compositional semantics:

```
static async Task<R> SelectMany<T, R>(this Task<T> task,
    Func<T, Task<R>> then) => await task.Bind(then);

static async Task<R> SelectMany<T1, T2, R>(this Task<T1> task,
    Func<T1, Task<T2>> bind, Func<T1, T2, R> project)
{
    T1 taskResult = await task;
    return project(taskResult, await bind(taskResult));
}

static async Task<R> Select<T, R>(this Task<T> task, Func<T, R> project) =>
    await task.Map(project);

static Task<R> Return<R>(R value) => Task.FromResult(value);
```

The `SelectMany` operator is one of the many functions capable of extending the asynchronous LINQ-style semantic. The job of the `Return` function is to lift the value `R` into a `Task<R>`. The `async/await` programming model in C# is based on tasks and, as mentioned in the previous chapter, is close in nature to the monadic concept of the operators `Bind` and `Return`. Consequently, it's possible to define many of the LINQ query operators, which rely on the `SelectMany` operator. The important point is that using patterns such as monads provides the opportunity to create a series of reusable combinators, easing the application of techniques that improve the composability and readability of the code using LINQ-style semantics. Here's the previous `DownloadIconAsync` example refactored using the LINQ expression semantic:

```
async Task DownloadIconAsync(string domain, string fileDestination)
{
    using (FileStream stream = new FileStream(fileDestination,
               FileMode.Create, FileAccess.Write, FileShare.Write, 0x1000,
               FileOptions.Asynchronous))
        await (from response in new HttpClient()
                                   .GetAsync($"http://{domain}/favicon.ico")
               from bytes in response.Content.ReadAsByteArrayAsync()
               select Bitmap.FromStream(new MemoryStream(bytes)))
              .Tap(async image => image.Save(fileDestination));
}
```

Using the LINQ comprehension version, the `from` clause extracts the inner value of the `Task` from the async operation and binds it to the related value. In this way, the keywords `async/await` can be omitted because of the underlying implementation. TAP can be used to parallelize computations in C#, but as you saw, parallelization is only one aspect of TAP. An even more enticing proposition is writing asynchronous code that composes easily with the least amount of noise.
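As a quick illustration of what these operators enable, the following hypothetical snippet (the URL and variable names are illustrative, and it assumes the `Select`/`SelectMany` extensions above are in scope) composes two asynchronous operations with plain query syntax:

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

class TaskLinqExample
{
    static async Task Run()
    {
        // Each 'from' clause awaits the task on its right-hand side via
        // the SelectMany operators defined above; no explicit await needed
        // until the final result is consumed.
        Task<int> contentLength =
            from response in new HttpClient().GetAsync("http://www.manning.com")
            from body in response.Content.ReadAsStringAsync()
            select body.Length;

        Console.WriteLine($"Page length: {await contentLength}");
    }
}
```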
## 8.5 Task-based Asynchronous Programming: a case study

Programs that perform numerous time-consuming I/O operations are good candidates for demonstrating how asynchronous programming works and the powerful toolset that TAP provides to a developer. As an example, in this section TAP is examined in action by implementing a program that downloads the stock market history of a few companies from an HTTP server and analyzes it. The results are rendered in a chart that's hosted in a Windows Presentation Foundation (WPF) UI application. Next, the symbols are processed in parallel and program execution is optimized, timing the improvements.

In this scenario, it's logical to perform the operations in parallel asynchronously. Every time you want to read data from the network using any client application, you should call non-blocking methods, which have the advantage of keeping the UI responsive (figure 8.5).

Figure 8.5 Downloading historical stock prices asynchronously in parallel. The number of requests can exceed the number of available cores, and yet you can maximize parallelism.

Listing 8.4 shows the main part of the program. For the charting control, you'll use the Microsoft `Windows.Forms.DataVisualization` control.^(4) Let's examine the asynchronous programming model on .NET in action. First, let's define the data structure `StockData` to hold the daily stock history:

```
struct StockData
{
    public StockData(DateTime date, double open, double high,
                     double low, double close)
    {
        Date = date;
        Open = open;
        High = high;
        Low = low;
        Close = close;
    }

    public DateTime Date { get; }
    public double Open { get; }
    public double High { get; }
    public double Low { get; }
    public double Close { get; }
}
```

Several historical data points exist for each stock, so defining `StockData` as a value-type struct can increase performance due to memory optimization. The following listing downloads and analyzes the historical stock data asynchronously.
Listing 8.4 Analyzing the history of stock prices

```
async Task<StockData[]> ConvertStockHistory(string stockHistory)      ①
{
    return await Task.Run(() => {                                     ②
        string[] stockHistoryRows =
            stockHistory.Split(Environment.NewLine.ToCharArray(),
                               StringSplitOptions.RemoveEmptyEntries);
        return (from row in stockHistoryRows.Skip(1)
                let cells = row.Split(',')
                let date = DateTime.Parse(cells[0])
                let open = double.Parse(cells[1])
                let high = double.Parse(cells[2])
                let low = double.Parse(cells[3])
                let close = double.Parse(cells[4])
                select new StockData(date, open, high, low, close))
               .ToArray();
    });
}

async Task<string> DownloadStockHistory(string symbol)                ①
{
    string url =
      $"http://www.google.com/finance/historical?q={symbol}&output=csv";
    var request = WebRequest.Create(url);                             ③
    using (var response = await request.GetResponseAsync()
                                       .ConfigureAwait(false))        ④
    using (var reader = new StreamReader(response.GetResponseStream()))
        return await reader.ReadToEndAsync().ConfigureAwait(false);   ⑤
}

async Task<Tuple<string, StockData[]>> ProcessStockHistory(string symbol)
{
    string stockHistory = await DownloadStockHistory(symbol);         ⑥
    StockData[] stockData = await ConvertStockHistory(stockHistory);  ⑥
    return Tuple.Create(symbol, stockData);                           ⑦
}

async Task AnalyzeStockHistory(string[] stockSymbols)
{
    var sw = Stopwatch.StartNew();
    IEnumerable<Task<Tuple<string, StockData[]>>> stockHistoryTasks =
        stockSymbols.Select(stock => ProcessStockHistory(stock));     ⑧
    var stockHistories = new List<Tuple<string, StockData[]>>();
    foreach (var stockTask in stockHistoryTasks)
        stockHistories.Add(await stockTask);                          ⑨
    ShowChart(stockHistories, sw.ElapsedMilliseconds);                ⑩
}
```

The code starts by creating a web request to obtain an HTTP response from the server so you can retrieve the underlying `ResponseStream` to download the data. The code uses the instance methods `GetResponseAsync()` and `ReadToEndAsync()` to perform the I/O operations, which can take a long time. Therefore, they're run asynchronously using the TAP pattern. Next, the code instantiates a `StreamReader` to read the data in *comma-separated values* (CSV) format. The CSV data is then parsed into an understandable structure, the object `StockData`, using a LINQ expression and the function `ConvertStockHistory`. This function performs the data transformation using `Task.Run`,^(5) which runs the supplied lambda on the `ThreadPool`.

The function `ProcessStockHistory` downloads and converts the stock history asynchronously, then returns a `Tuple` object. Specifically, this return type is `Task<Tuple<string, StockData[]>>`. Interestingly, when the tuple is instantiated at the end of this method, there's no `Task` in sight. This is possible because the method is marked with the `async` keyword, and the compiler wraps the result automatically into a `Task` type to match the signature. In TAP, by denoting a method as `async`, all wrapping and unwrapping required to turn the result into a task (and vice versa) is handled by the compiler. The resulting data is sent to the method `ShowChart` to display the stock history and the elapsed time. (The implementation of `ShowChart` is online in the source code companion to this book.) The rest of the code is self-explanatory. The time to execute this program—downloading, processing, and rendering the stock historical data for seven companies—is 4.272 seconds.
Figure 8.6 shows the results of the stock price variations for Microsoft (MSFT), EMC, Yahoo (YHOO), eBay (EBAY), Intel (INTC), and Oracle (ORCL).

Figure 8.6 Chart of the stock price variations over time

As you can see, TAP returns tasks, allowing a natural compositional semantic with other methods that share the same return type of `Task`. Let's review what's happening throughout the process. You used the Google service in this example to download and analyze the stock market history (listing 8.4). A high-level architecture of a scalable service with similar behavior is shown in figure 8.7. Here's the flow of how the Stock Market service processes the requests:

1. The user sends several requests asynchronously in parallel to download stock history prices. The UI remains responsive.
2. The thread pool schedules the work. Because the operations are I/O-bound, the number of asynchronous requests that can run in parallel could exceed the available number of local cores.
3. The Stock Market service receives the HTTP requests, and the work is dispatched to the internal program, which notifies the thread-pool scheduler to asynchronously handle the incoming requests to query the database.
4. Because the code is asynchronous, the thread-pool scheduler can schedule the work by optimizing local hardware resources. In this way, the number of threads required to run the program is kept to a minimum, the system remains responsive, memory consumption is low, and the server is scalable.
5. The database queries are processed asynchronously without keeping threads blocked.
6. When the database completes the work, the result is sent back to the caller. At this point, the thread-pool scheduler is notified, and a thread is assigned to continue the rest of the work.
7. The responses are sent back to the Stock Market service caller as they complete.
8. The user starts receiving the responses back from the Stock Market service.
9. The UI is notified, and a thread is assigned to continue the rest of the work without blocking.
10. The data received is parsed, and the chart is rendered.

Figure 8.7 Asynchronous programming model for downloading data in parallel from the network

Using the asynchronous approach, all the operations run in parallel, but the overall response time is still correlated to the time of the slowest worker. Conversely, the response time for the synchronous approach increases with each added worker.

### 8.5.1 Asynchronous cancellation

When executing an asynchronous operation, it's useful to be able to terminate its execution on demand, before it completes. This works well for long-running, non-blocking operations, where making them cancellable is the appropriate practice to avoid tasks that could hang. You'll want to cancel the operation of downloading the historical stock prices, for example, if the download exceeds a certain period of time.

Starting with version 4.0, the .NET Framework introduced an extensive and convenient approach to cooperative support for canceling operations running in a different thread. This mechanism is an easy and useful tool for controlling task execution flow. The concept of cooperative cancellation allows a request to stop a submitted operation without forcibly aborting the executing code (figure 8.8). Aborting execution requires code that supports cancellation. It's recommended that you design a program to support cancellation as much as possible.
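Before looking at the types involved, here's a minimal sketch of the cooperative pattern (illustrative only; the loop and names aren't part of the case study). The worker polls the token and stops its work when cancellation is signaled:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class CancellationSketch
{
    static async Task Run()
    {
        var cts = new CancellationTokenSource();

        Task worker = Task.Run(() =>
        {
            for (int i = 0; i < 1000; i++)
            {
                // Cooperative check: throws OperationCanceledException
                // if cancellation has been requested.
                cts.Token.ThrowIfCancellationRequested();
                Thread.Sleep(100); // simulate a unit of work
            }
        }, cts.Token);

        cts.CancelAfter(TimeSpan.FromSeconds(1)); // request cancellation

        try { await worker; }
        catch (OperationCanceledException) { Console.WriteLine("Canceled"); }
    }
}
```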
These are the .NET types for canceling a `Task` or async operation:

* `CancellationTokenSource` is responsible for creating a cancellation token and sending cancellation requests to all copies of that token.
* `CancellationToken` is a structure utilized to monitor the state of the current token.

Cancellation is tracked and triggered using the cancellation model in the .NET Framework, `System.Threading.CancellationToken`.

Figure 8.8 After a request to start a process, a cancellation request is submitted that stops the rest of the execution, which returns to the caller in the form of an `OperationCanceledException`.

#### Cancellation support in the TAP model

TAP supports cancellation natively; in fact, every method that returns a task provides at least one overload with a cancellation token as a parameter. In this case, you can pass a cancellation token when creating the task; the asynchronous operation then checks the status of the token and cancels the computation if the request is triggered. To cancel the download of the historical stock prices, you pass an instance of `CancellationToken` as an argument to the `Task` method and then call the `Cancel` method. The following listing shows this technique.

Listing 8.5 Canceling an asynchronous task

```
CancellationTokenSource cts = new CancellationTokenSource();         ①

async Task<string> DownloadStockHistory(string symbol,
                                        CancellationToken token)     ②
{
    string stockUrl =
      $"http://www.google.com/finance/historical?q={symbol}&output=csv";
    var request = await new HttpClient().GetAsync(stockUrl, token);  ②
    return await request.Content.ReadAsStringAsync();
}

cts.Cancel();                                                        ③
```

Certain programming methods don't have intrinsic support for cancellation. In those cases, it's important to apply manual checking. This listing shows how to integrate cancellation support into the previous stock market example, where no asynchronous methods exist to terminate operations prematurely.

Listing 8.6 Canceling with manual checks in an asynchronous operation

```
List<Task<Tuple<string, StockData[]>>> stockHistoryTasks =
    stockSymbols.Select(async symbol =>
    {
        var url =
          $"http://www.google.com/finance/historical?q={symbol}&output=csv";
        var request = HttpWebRequest.Create(url);
        using (var response = await request.GetResponseAsync())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            token.ThrowIfCancellationRequested();
            var csvData = await reader.ReadToEndAsync();
            var prices = await ConvertStockHistory(csvData);
            token.ThrowIfCancellationRequested();
            return Tuple.Create(symbol, prices.ToArray());
        }
    }).ToList();
```

In cases like this, where the `Task` method doesn't provide built-in support for cancellation, the recommended pattern is to add a `CancellationToken` as a parameter of the asynchronous method and to check for cancellation regularly. Throwing with the method `ThrowIfCancellationRequested` is the most convenient option because the operation terminates without returning a result. Interestingly, the `CancellationToken` in the following listing supports the registration of a callback, which is executed right after cancellation is requested. In this listing, a `Task` downloads the content of the Manning website, and it's canceled immediately afterward using a cancellation token.
Listing 8.7 Cancellation token callback

```
CancellationTokenSource tokenSource = new CancellationTokenSource();
CancellationToken token = tokenSource.Token;

Task.Run(async () =>
{
    var webClient = new WebClient();
    token.Register(() => webClient.CancelAsync());                   ①
    var data = await webClient
                     .DownloadDataTaskAsync("http://www.manning.com");
}, token);

tokenSource.Cancel();
```

In the code, a callback is registered to stop the underlying asynchronous operation in case the `CancellationToken` is triggered. This pattern is useful and opens the possibility of logging the cancellation and firing an event to notify a listener that the operation has been canceled.

#### Cooperative cancellation support

Use of the `CancellationTokenSource` makes it simple to create a composite token that consists of several other tokens. This pattern is useful if there are multiple reasons to cancel an operation. Reasons could include a click of a button, a notification from the system, or a cancellation propagating from another operation. The `CancellationTokenSource.CreateLinkedTokenSource` method generates a cancellation source that will be canceled when any of the specified tokens is canceled.

Listing 8.8 Cooperative cancellation token

```
CancellationTokenSource ctsOne = new CancellationTokenSource();      ①
CancellationTokenSource ctsTwo = new CancellationTokenSource();      ①
CancellationTokenSource ctsComposite =
    CancellationTokenSource.CreateLinkedTokenSource(ctsOne.Token,
                                                    ctsTwo.Token);   ②
CancellationToken ctsCompositeToken = ctsComposite.Token;

Task.Factory.StartNew(async () =>
{
    var webClient = new WebClient();
    ctsCompositeToken.Register(() => webClient.CancelAsync());
    var data = await webClient
                     .DownloadDataTaskAsync("http://www.manning.com");
}, ctsComposite.Token);                                              ③
```

In this listing, a linked cancellation source is created based on the two cancellation tokens. Then the new composite token is employed. It will be canceled if any of the original `CancellationToken`s are canceled. A cancellation token is basically a thread-safe flag (Boolean value) that notifies its parent that the `CancellationTokenSource` has been canceled.

### 8.5.2 Task-based asynchronous composition with the monadic Bind operator

As mentioned previously, the async `Task<T>` is a monadic type, which means that it's a container to which you can apply the monadic operators `Bind` and `Return`. Let's analyze how these functions are useful in the context of writing a program. Listing 8.9 takes advantage of the `Bind` operator to combine a sequence of asynchronous operations into a chain of computations. The `Return` operator lifts a value into the monad (container or elevated type). In general, a `Task` asynchronous function takes an arbitrary argument of type `'T` and returns a computation of type `Task<'R>` (with signature `'T -> Task<'R>`), and it can be composed using the `Bind` operator. This operator says: "When the value `'R` from the function `g : 'T -> Task<'R>` is evaluated, pass the result into the function `f : 'R -> Task<'U>`." The function `Bind` is shown in figure 8.9 for demonstration purposes because it's already built into the system.

Figure 8.9 The `Bind` operator composes two functions that have their results wrapped in a `Task` type, where the value returned from the computation of the first `Task` matches the input of the second function.

With this `Bind` function, the structure of the stock analysis code can be simplified.
The idea is to glue together a series of functions.

Listing 8.9 `Bind` operator in action

```
async Task<Tuple<string, StockData[]>> ProcessStockHistory(string symbol)
{
    return await DownloadStockHistory(symbol)
        .Bind(stockHistory => ConvertStockHistory(stockHistory))     ①
        .Bind(stockData =>
              Task.FromResult(Tuple.Create(symbol, stockData)));     ①
}
```

The asynchronous `Task` computations are composed by invoking the `Bind` operator on the first async operation and then passing the result to the second async operation, and so forth. The result is an asynchronous function whose argument is the value returned by the first `Task` when it completes; it returns a second `Task` that uses the result of the first as input for its computation. The code is both declarative and expressive because it fully embraces the functional paradigm. You've now used a monadic operator: specifically, one based on the continuation monad.

### 8.5.3 Deferring asynchronous computation enables composition

In C# TAP, a function that returns a task begins execution immediately. This behavior of eagerly evaluating an asynchronous expression is called a *hot task*, and it unfortunately has a negative impact on composition. The functional way of handling asynchronous operations is to defer execution until it's needed, which has the benefit of enabling compositionality and provides finer control over the execution aspect. You have three options for implementing the APM:

* *Hot tasks*—The asynchronous method returns a task that represents an already running job that will eventually produce a value. This is the model used in C#.
* *Cold tasks*—The asynchronous method returns a task that requires an explicit start from the caller. This model is often used in the traditional thread-based approach.
* *Task generators*—The asynchronous method returns a task that will eventually generate a value, and that will start when a continuation is provided. This is the preferred way in functional paradigms because it avoids side effects and mutation. (This is the model used in F# to run asynchronous computations.)

How can you evaluate an asynchronous operation on demand using the C# TAP model? You could use a `Lazy<T>` type as the wrapper for a `Task<T>` computation (see chapter 2), but a simpler solution is to wrap the asynchronous computation into a `Func<Task<T>>` delegate, which runs the underlying operation only when invoked explicitly. In the following code snippet this concept is applied to the stock history example, which defines the `onDemand` function to lazily evaluate the `DownloadStockHistory` `Task` expression:

```
Func<Task<string>> onDemand = async () => await DownloadStockHistory("MSFT");

string stockHistory = await onDemand();
```

From the point of view of the code, to consume the underlying `Task` of the `DownloadStockHistory` asynchronous expression, you need to treat and run `onDemand` explicitly as a regular `Func`, invoking it with `()`. Notice that there's a small glitch in this code. The function `onDemand` runs the asynchronous expression, which has a prefixed argument (in this case, `"MSFT"`). How can you pass a different stock symbol to the function? The solution is currying and partial application, FP techniques that allow easier reuse of more abstract functions because you get to specialize them. (They're explained in appendix A.)
Here’s the curried version of the `onDemand` function, which takes a string (symbol) as an argument that is then passed to the inner `Task` expression and returns a function of type `Func<Task<string>>`: ``` **Func<string, Func<Task<string>>>** onDemandDownload = **symbol** => **async** () => await DownloadStockHistoryAsync(**symbol**); ``` Now, this curried function can be partially applied to create *specialized* functions over a given string *(*in this case, a stock symbol*)*, which will be passed and consumed by the wrapped `Task` when the `onDemand` function is executed. Here’s the partially applied function to create the specialized `onDemandDownloadMSFT`: ``` Func<Task<string>> onDemandDownloadMSFT = onDemandDownload("MSFT"); string stockHistoryMSFT = await onDemandDownloadMSFT(); ``` The technique of differing asynchronous operations shows that you can build arbitrarily complex logic without executing anything until you decide to fire things off. ### 8.5.4 Retry if something goes wrong A common concern when working with asynchronous I/O operations, and, in particular, with network requests, is the occurrence of an unexpected factor that jeopardizes the success of the operations. In these situations, you may want to retry an operation if a previous attempt fails. During the HTTP request made by the method `DownloadStockHistory`, for example, there could be issues such as bad internet connections or unavailable remote servers. But these problems could be only a temporary state, and the same operation that fails an attempt once, might succeed if retried a few moments later. The pattern of having multiple retries is a common practice to recover from temporary problems. In the context of asynchronous operations, this model is achieved by creating a wrapper function, implemented with TAP and returning tasks. This changes the evaluation of an asynchronous expression, as shown in the previous section. Then, if there are a few problems, this function applies the retry logic for a specified number of times with a specified delay between attempts. This listing shows the implementation of the asynchronous `Retry` function as an extension method. Listing 8.10 `Retry` async operation ``` async Task<T> **Retry**<T>(Func<Task<T>> task, int retries, TimeSpan delay, CancellationToken cts = default(CancellationToken)) => ① **await** task().ContinueWith(async innerTask => { cts.ThrowIfCancellationRequested(); ② if (innerTask.Status != TaskStatus.Faulted) return innerTask.Result; ③ if (retries == 0) throw innerTask.Exception ?? throw new Exception(); ④ **await** Task.Delay(delay, cts); ⑤ return await Retry(task, retries - 1, delay, cts); ⑥ }).Unwrap(); ``` The first argument is the async operation that will be re-executed. This function is specified lazily, wrapping the execution into a `Func<>`, because invoking the operation starts the task immediately. In case of exceptions, the operation `Task<T>` captures error handling via the `Status` and `Exception` properties. It’s possible to ascertain if the async operation failed by inspecting these properties. If the operation fails, the `Retry` helper function waits for the specified interval, then retries the same operation, decreasing the number of retries until zero. 
With this `Retry<T>` helper function in place, the function `ProcessStockHistory` can be refactored to perform the web request operation with the retry logic:

```
async Task<Tuple<string, StockData[]>> ProcessStockHistory(string symbol)
{
    string stockHistory = await Retry(() => DownloadStockHistory(symbol),
                                      5, TimeSpan.FromSeconds(2));
    StockData[] stockData = await ConvertStockHistory(stockHistory);
    return Tuple.Create(symbol, stockData);
}
```

In this case, the retry logic runs at most five times, with a delay of two seconds between attempts. The `Retry<T>` helper function should typically be attached to the end of a workflow.

### 8.5.5 Handling errors in asynchronous operations

As you recall, the majority of asynchronous operations are I/O-bound, so there's a high probability that something will go wrong during their execution. The previous section covered handling failure by applying retry logic. Another approach is declaring a function combinator that links an async operation to a fallback one. If the first operation fails, then the fallback kicks in. It's important to declare the fallback as a deferred, lazily evaluated task. The following listing shows the code that defines the `Otherwise` combinator, which takes two tasks and falls back to the second task if the first one completes unsuccessfully.

Listing 8.11 Fallback `Task` combinator

```
static Task<T> Otherwise<T>(this Task<T> task, Func<Task<T>> otherTask) ①
    => task.ContinueWith(async innerTask =>
    {
        if (innerTask.Status == TaskStatus.Faulted)
            return await otherTask();                                   ②
        return innerTask.Result;
    }).Unwrap();
```

When the task completes, the `Task` type has a concept of whether it finished successfully or failed. This is exposed by the `Status` property, which is equal to `TaskStatus.Faulted` when an exception is thrown during the execution of the `Task`. The stock history analysis example requires some FP refactoring to apply the `Otherwise` combinator. Next is the code that combines the retry behavior, the `Otherwise` combinator, and the monadic operators for composing the asynchronous operations.

Listing 8.12 `Otherwise` combinator applied to fallback behavior

```
Func<string, string> googleSourceUrl = symbol =>                     ①
    $"http://www.google.com/finance/historical?q={symbol}&output=csv";
Func<string, string> yahooSourceUrl = symbol =>                      ①
    $"http://ichart.finance.yahoo.com/table.csv?s={symbol}";

async Task<string> DownloadStockHistory(Func<string, string> sourceStock,
                                        string symbol)
{
    string stockUrl = sourceStock(symbol);                           ②
    var request = WebRequest.Create(stockUrl);
    using (var response = await request.GetResponseAsync())
    using (var reader = new StreamReader(response.GetResponseStream()))
        return await reader.ReadToEndAsync();
}

async Task<Tuple<string, StockData[]>> ProcessStockHistory(string symbol)
{
    Func<Func<string, string>, Func<string, Task<string>>> downloadStock =
        service => stock => DownloadStockHistory(service, stock);    ③

    Func<string, Task<string>> googleService =
        downloadStock(googleSourceUrl);                              ④
    Func<string, Task<string>> yahooService =
        downloadStock(yahooSourceUrl);                               ④

    return await Retry(() => googleService(symbol)                   ⑤
                 .Otherwise(() => yahooService(symbol)),             ⑥
                 5, TimeSpan.FromSeconds(2))
        .Bind(data => ConvertStockHistory(data))                     ⑦
        .Map(prices => Tuple.Create(symbol, prices));                ⑧
}
```

Note that the `ConfigureAwait` `Task` extension method has been omitted from the code.
The application of the `Otherwise` combinator runs the function `DownloadStockHistory` for both the primary and the fallback asynchronous operations. The fallback strategy uses the same functionality to download the stock prices, with the web request pointing to a different service endpoint (URL). If the first service isn't available, then the second one is used. The two endpoints are provided by the functions `googleSourceUrl` and `yahooSourceUrl`, which build the URL for the HTTP request. This approach requires a modification of the `DownloadStockHistory` function signature, which now takes the higher-order function `Func<string, string> sourceStock`. This function is partially applied against both `googleSourceUrl` and `yahooSourceUrl`. The result is two new functions, `googleService` and `yahooService`, which are passed as arguments to the `Otherwise` combinator, which ultimately is wrapped in the `Retry` logic. The `Bind` and `Map` operators are then used to compose the operations as a workflow without leaving the `async Task` elevated world. All the operations are guaranteed to be fully asynchronous.

### 8.5.6 Asynchronous parallel processing of the historical stock market

Because a `Task` represents an operation that takes time, it's logical that you'll want to execute such operations in parallel when possible. One interesting aspect exists in the stock history code example. When the LINQ expression materializes, the asynchronous method `ProcessStockHistory` runs inside the `foreach` loop, calling one task at a time and awaiting its result. These calls are non-blocking, but the execution flow is sequential: each task waits for the previous one to complete before starting. This isn't efficient. The following snippet shows the faulty behavior of running asynchronous operations sequentially using a `foreach` loop:

```
async Task AnalyzeStockHistory(string[] stockSymbols)
{
    var sw = Stopwatch.StartNew();
    IEnumerable<Task<Tuple<string, StockData[]>>> stockHistoryTasks =
        stockSymbols.Select(stock => ProcessStockHistory(stock));
    var stockHistories = new List<Tuple<string, StockData[]>>();
    foreach (var stockTask in stockHistoryTasks)
        stockHistories.Add(await stockTask);
    ShowChart(stockHistories, sw.ElapsedMilliseconds);
}
```

Suppose this time you want to launch these computations in parallel and then render the chart once all of them are complete. This design is similar to the Fork/Join pattern: multiple asynchronous executions are spawned in parallel, and the code waits for all of them to complete. Then the results are aggregated and further processed. The following listing processes the stocks in parallel correctly.

Listing 8.13 Running the stock history analysis in parallel

```
async Task AnalyzeStockHistory()
{
    var sw = Stopwatch.StartNew();
    string[] stocks = new[] { "MSFT", "FB", "AAPL", "YHOO",
                              "EBAY", "INTC", "GOOG", "ORCL" };
    List<Task<Tuple<string, StockData[]>>> stockHistoryTasks =
        stocks.Select(async stock =>
                      await ProcessStockHistory(stock)).ToList();    ①
    Tuple<string, StockData[]>[] stockHistories =
        await Task.WhenAll(stockHistoryTasks);                       ②
    ShowChart(stockHistories, sw.ElapsedMilliseconds);
}
```

In the listing, the stock collection is transformed into a list of tasks using an asynchronous lambda in the `Select` method of LINQ. It's important to materialize the LINQ expression by calling `ToList()`, which dispatches the tasks to run in parallel exactly once.
This is possible due to the hot-task property, which means that a task runs immediately after its definition. The method `Task.WhenAll` (similar to `Async.Parallel` in F#) is part of the TPL, and its purpose is to combine the results of a set of tasks into a single task array and wait asynchronously for all of them to complete:

```
Tuple<string, StockData[]>[] result = await Task.WhenAll(stockHistoryTasks);
```

In this instance, the execution time drops to 0.534 seconds from the previous 4.272 seconds.

### 8.5.7 Asynchronous stock market parallel processing as tasks complete

An alternative (and better) solution processes each stock history result as it arrives, instead of waiting for the download of all stocks to complete. This is a good pattern for performance improvement. In this case, it also reduces the payload for the UI thread by rendering the data in chunks. Consider the stock market analysis code, where multiple pieces of historical data are downloaded from the web and then used to produce an image to render to a UI control. If you wait for all the data to be analyzed before updating the UI, the program is forced to process sequentially on the UI thread. A more performant solution, shown next, is to process and update the chart as concurrently as possible. Technically, this pattern is called *interleaving*.

Listing 8.14 Stock history analysis processing as each `Task` completes

```
async Task AnalyzeStockHistory()
{
    var sw = Stopwatch.StartNew();
    string[] stocks = new[] { "MSFT", "FB", "AAPL", "YHOO",
                              "EBAY", "INTC", "GOOG", "ORCL" };
    List<Task<Tuple<string, StockData[]>>> stockHistoryTasks =
        stocks.Select(ProcessStockHistory).ToList();                 ①
    while (stockHistoryTasks.Count > 0)                              ②
    {
        Task<Tuple<string, StockData[]>> stockHistoryTask =
            await Task.WhenAny(stockHistoryTasks);                   ③
        stockHistoryTasks.Remove(stockHistoryTask);                  ④
        Tuple<string, StockData[]> stockHistory = await stockHistoryTask;
        ShowChartProgressive(stockHistory);                          ⑤
    }
}
```

The code makes two changes from the previous version:

* A `while` loop removes the tasks as they complete, until the last one.
* `Task.WhenAll` is replaced with `Task.WhenAny`. This method waits asynchronously for the first task that reaches a terminal state and returns its instance.

This implementation doesn't consider either exceptions or cancellations. Alternatively, you could check the status of the task `stockHistoryTask` before further processing to apply conditional logic.

## Summary

* You can write asynchronous programs in .NET with Task-based Asynchronous Programming (TAP) in C#, which is the preferred model to use.
* The asynchronous programming model lets you deal effectively with massive concurrent I/O operations by intelligently recycling resources during their idle time and by avoiding the creation of new resources, thereby optimizing memory consumption and enhancing performance.
* The `Task<T>` type is a monadic data structure, which means, among other things, that it can easily be composed with other tasks in a declarative and effortless way.
* Asynchronous tasks can be performed and composed using monadic operators, which leads to LINQ-style semantics. This has the advantage of providing a clear and fluid declarative programming style.
* Executing relatively long-lasting operations using asynchronous tasks can increase the performance and responsiveness of your application, especially if it relies on one or more remote services.
* The number of asynchronous computations that can run in parallel simultaneously is unrelated to the number of CPUs available, and execution time depends on the period spent waiting for the I/O operations to complete, bound only by the I/O drivers.
* TAP is based on the task type, enriched with the `async` and `await` keywords. This asynchronous programming model embraces the functional paradigm in the form of continuation-passing style (CPS).
* With TAP, you can easily implement efficient patterns, such as downloading multiple resources in parallel and processing them as soon as they become available, instead of waiting for all resources to download.

# 9 Asynchronous functional programming in F#

**This chapter covers**

* Making asynchronous computations cooperate
* Implementing asynchronous operations in a functional style
* Extending asynchronous workflow computational expressions
* Taming parallelism with asynchronous operations
* Coordinating cancellation of parallel asynchronous computations

In chapter 8, I introduced asynchronous programming as `Task`s executing independently from the main application thread, possibly in a separate environment or across the network on different CPUs. This model leads to parallelism, where applications can perform an inordinately high number of I/O operations on a single-core machine. This is a powerful idea in terms of program execution and data throughput speed, casting away the traditional step-by-step programming approach.

Both the F# and C# programming languages provide slightly different, yet elegant, abstractions for expressing asynchronous computations, making them ideal tools, well suited for modeling real-world problems. In chapter 8, you saw how to use the asynchronous programming model in C#. In this chapter, we look at how to do the same in F#.

This chapter helps you understand the performance semantics of the F# asynchronous workflow so you can write efficient and performant programs for processing I/O-bound operations. I'll discuss the F# approach, analyze its unique traits and how they impact code design, and explain how to easily implement and compose effective asynchronous operations in a functional style. I'll also teach you how to write non-blocking I/O operations to increase the overall execution, efficiency, and throughput of your applications when running multiple asynchronous operations concurrently, all without worrying about hardware constraints. You'll see firsthand how to apply functional concepts for writing asynchronous computations. Then you'll evaluate how to use these concepts to handle side effects and interact with the real world without compromising the benefits of the compositional semantics—keeping your code concise, clear, and maintainable. By the end of this chapter, you'll come away with an appreciation of how modern applications must exploit parallelism and harness the power of multicore CPUs to run efficiently and to handle a large number of operations in a functional way.

## 9.1 Asynchronous functional aspects

An *asynchronous function* is a design idiom where a normal F# function or method returns an asynchronous computation. Modern asynchronous programming models such as the F# asynchronous workflow and C# `async/await` are functional because applying functional programming enables the experienced programmer to write simple and declarative procedural code that runs asynchronously and in parallel.
From the start, F# has supported an asynchronous programming model whose semantics resemble those of synchronous code. It's not a coincidence that C#, which has introduced several functional features into its language, was inspired by the functional approach of the F# asynchronous workflow to implement the `async`/`await` asynchronous model, replacing the conventional imperative APM. Moreover, both the C# asynchronous task and the F# asynchronous workflow are monadic containers, which eases factoring out common functionality into generic, reusable components.

## 9.2 What's the F# asynchronous workflow?

The FP language F# provides full support for asynchronous programming:

* It integrates with the asynchronous programming model provided by .NET.
* It offers an idiomatic functional implementation of APM.
* It supports interoperability with the task-based programming model in C#.

The asynchronous workflow in F# is designed to satisfy the functional paradigm, promoting compositionality, simplicity, and the expression of non-blocking computations while keeping the sequential structure of the code. By definition, the asynchronous workflow is built on computation expressions, a generic component of the F# core language that provides monadic semantics to express a sequence of operations in continuation-passing style (CPS). A key feature of the asynchronous workflow is combining non-blocking computations with lightweight asynchronous semantics, which resembles a linear control flow.

### 9.2.1 The continuation passing style in computation expressions

Multithreaded code is notoriously difficult to write in an imperative style. But using CPS, you can embrace the functional paradigm to make your code remarkably concise and easy to write. Let's imagine that you're programming using an old version of the .NET Framework that doesn't have the `async`/`await` programming model available (see chapter 8). In this case you need to compute a series of `Task` operations, where the input of each operation depends on the output of the previous one; the code can become complex and convoluted. In the following code example, the code downloads an image from Azure Blob storage and saves the bytes into a file. For the sake of simplicity, the code that isn't relevant for the example is omitted intentionally; the code to note is in bold. You can find the full implementation in the downloadable source code:

```
let downloadCloudMediaBad destinationPath (imageReference : string) =
    log "Creating connection..."
    let taskContainer = **Task.Run**<CloudBlobContainer>(fun () ->
    ➥ getCloudBlobContainer())
    log "Get blob reference..."
    let container = taskContainer.**Result**
    let taskBlockBlob = **Task.Run**<CloudBlob>(fun () ->
    ➥ container.GetBlobReference(imageReference))
    log "Download data..."
    let blockBlob = taskBlockBlob.**Result**
    let bytes = Array.zeroCreate<byte> (int blockBlob.Properties.Length)
    let taskData = **Task.Run**<byte[]>(fun () ->
        blockBlob.DownloadToByteArray(bytes, 0) |> ignore
        bytes)
    log "Saving data..."
    let data = taskData.**Result**
    let taskComplete = **Task.Run**(fun () ->
    ➥ File.WriteAllBytes(Path.Combine(destinationPath, imageReference), data))
    taskComplete.**Wait**()
    log "Complete"
```

Granted, the code is an extreme example that aims to validate the point that using traditional tools (with the same obsolete mindset) to write concurrent code produces verbose and impractical programs. The inexperienced developer can write code in this way more easily, because it's easier to reason sequentially.
The result, however, is a program that doesn't scale, and each `Task` computation blocks by reading the instance property `Result`, which is bad practice. In this situation, and with a little study, CPS can solve the problem of scalability. First, you define a function used to combine the operations in a pipeline shape:

```
let **bind**(operation:unit -> 'a, continuation:'a -> unit) =
    Task.Run(fun () -> continuation(operation())) |> ignore
```

The `bind` function accepts the continuation (`'a -> unit`) function, which is called when the result of the operation (`unit -> 'a`) is ready. The key point is that you're not blocking the calling thread, which may then continue executing useful code. When the result is ready, the continuation is called, allowing the computation to continue. You can now use this `bind` function to rewrite the previous code in a fluent manner:

```
let downloadCloudMediaAsync destinationPath (imageReference : string) =
    bind( (fun () -> log "Creating connection..."; getCloudBlobContainer()),
      fun connection -> **bind**( (fun () ->
            log "Get blob reference..."
            connection.GetBlobReference(imageReference)),
        fun blockBlob -> **bind**( (fun () ->
              log "Download data..."
              let bytes = Array.zeroCreate<byte> (int blockBlob.Properties.
              ➥ Length)
              blockBlob.DownloadToByteArray(bytes, 0) |> ignore
              bytes),
          fun bytes -> **bind**( (fun () ->
                log "Saving data..."
                File.WriteAllBytes(Path.Combine(destinationPath, imageReference),
                ➥ bytes)),
            fun () -> log "Complete"))))

["Bugghina01.jpg"; "Bugghina02.jpg"; "Bugghina003.jpg"]
|> Seq.iter (downloadCloudMediaAsync "Images")
```

Running the code, you'll notice the `bind` function executes the underlying anonymous lambda in its own thread. Every time the `bind` function is called, a thread is pulled out of the thread pool; then, when the function completes, the thread is released back to the thread pool. The F# asynchronous workflow is based on this same concept of CPS, which is useful for modeling calculations that are difficult to capture sequentially. Figure 9.1 shows the comparison between incoming requests handled in a synchronous and an asynchronous way.

Figure 9.1 Comparison between synchronous (blocking) I/O and asynchronous (non-blocking) I/O operation systems. The synchronous version can send only one request at a time; after the request is processed, the result is sent back to the caller. The asynchronous version can send many concurrent requests simultaneously; after these requests are processed concurrently on the server side, they're sent back to the caller in the order that they complete.

The F# asynchronous workflow also includes cancellation and exception continuations. Before we dig into the asynchronous workflow details, let's look at an example.

### 9.2.2 The asynchronous workflow in action: Azure Blob storage parallel operations

Let's imagine that your boss has decided that the company's digital media assets should be stored in the cloud as well as locally. He asks you to create a simple uploader/downloader tool for that purpose and to synchronize and verify what's new in the cloud. To handle media files as binary data for this scenario, you design a program to download a set of images from the network Azure Blob storage and render these images in a client-side application that's based on WPF. Azure Blob storage ([`mng.bz/X1FB`](http://mng.bz/X1FB)) is a Microsoft cloud service that stores unstructured data in the form of blobs (binary large objects).
This service stores any type of data, which makes it a great fit to handle your company's media files as binary data (figure 9.2).

Figure 9.2 The synchronous versus asynchronous programming model. The synchronous program executes each operation sequentially, one at a time. The asynchronous version can run multiple requests in parallel, increasing the overall execution speed of the program. As a result, the asynchronous version of the program can download more images in the same period of time as compared to the synchronous version.

As mentioned earlier, to provide visual feedback, the program runs as a client WPF application. This application benefits from a `FileSystemWatcher` ([`mng.bz/DcRT`](http://mng.bz/DcRT)) that's listening for file-created events to pick up file changes in the local folder. When the images are downloaded and saved in this local folder, `FileSystemWatcher` triggers an event and synchronizes the updates of a local file collection with the path of the image, which is subsequently displayed in a WPF UI control. (The code implementation of the client WPF UI application isn't reviewed here because it's irrelevant to the main topic of this chapter.)

Let's compare the synchronous and asynchronous programs from figure 9.2. The synchronous version of the program executes each step sequentially and iterates, with a conventional `for` loop, over the collection of images to download from the Azure Blob storage. This design is straightforward but doesn't scale. Alternatively, the asynchronous version of the program is capable of processing multiple requests in parallel, which increases the number of images downloaded in the same period of time. Let's analyze the asynchronous version of the program in more depth. In figure 9.3, the program starts by sending a request to the Azure Blob storage to open the cloud blob container connection. When the connection is opened, the handle of the blob media stream is retrieved to begin downloading the image. The data is read from the stream and, ultimately, persisted to a local filesystem. Then the program repeats this operation for the next image, through to the last.

Figure 9.3 Downloading an image asynchronously from the network (Azure Blob storage)

Each download operation takes an average of 0.89 seconds over five runs, for a total time of 89.28 seconds to download 100 images. These values can vary according to the network bandwidth. Obviously, the time to perform multiple synchronous I/O operations sequentially is equal to the sum of the time elapsed for each individual operation, in comparison to the asynchronous approach, which, by running in parallel, has an overall response time equal to the slowest operation. The following listing is the asynchronous workflow implementation of the program to download the images asynchronously from Azure Blob storage (the code to note is in bold).

Listing 9.1 Asynchronous workflow implementation to download images

```
let getCloudBlobContainerAsync() : **Async<CloudBlobContainer>** = **async** {
    let storageAccount = CloudStorageAccount.Parse(azureConnection)   ①
    let blobClient = storageAccount.CreateCloudBlobClient()           ②
    let container = blobClient.GetContainerReference("media")         ③
    let! _ = container.CreateIfNotExists**Async**()                      ④
    return container }

let downloadMediaAsync(blobNameSource:string)
                      (fileNameDestination:string) = **async** {          ⑤
    let**!** container = getCloudBlobContainer**Async**()                    ⑥
    let blockBlob = container.GetBlockBlobReference(blobNameSource)
    let**!** (blobStream : Stream) = blockBlob.**OpenReadAsync**()           ⑥
    use fileStream = new FileStream(fileNameDestination, FileMode.Create,
    ➥ FileAccess.Write, FileShare.None, 0x1000, FileOptions.Asynchronous)
    let buffer = Array.zeroCreate<byte> (int blockBlob.Properties.Length)
    let rec copyStream bytesRead = async {
        match bytesRead with
        | 0 -> fileStream.Close(); blobStream.Close()
        | n -> do**!** fileStream.**AsyncWrite**(buffer, 0, n)               ⑥
               let**!** bytesRead = blobStream.**AsyncRead**(buffer, 0, buffer.
               ➥ Length)
               return! copyStream bytesRead }
    let**!** bytesRead = blobStream.**AsyncRead**(buffer, 0, buffer.Length)  ⑥
    do**!** copyStream bytesRead }
```

Note that this code looks almost exactly like sequential code. The parts in bold are the only changes necessary to switch the code from synchronous to asynchronous. The intentions of this code are direct and simple to interpret because of the sequential structure of the code. This code simplification is the result of the pattern-based approach that the F# compiler uses to detect a computation expression; in the case of an asynchronous workflow, it gives the developer the illusion that callbacks have disappeared. Without callbacks, the program isn't subject to inversion of control as in APM, which makes F# deliver a clean asynchronous code implementation with a focus on compositionality.

Both the `getCloudBlobContainerAsync` and `downloadMediaAsync` functions are wrapped inside an `async` expression (workflow declaration), which turns the code into a block that can be run asynchronously. The function `getCloudBlobContainerAsync` creates a reference to the container `media`. The underlying `CreateIfNotExistsAsync` operation returns a `Task`, which is converted to fit the `Async<CloudBlobContainer>` result handled by the enclosing asynchronous workflow expression (explained later in the chapter). The key feature of an asynchronous workflow is to combine non-blocking computations with lightweight asynchronous semantics, which resembles a linear control flow. It simplifies the program structure of traditional callback-based asynchronous programming through syntactic sugar. The methods that run asynchronously are bound with a different construct that uses the `!` (pronounced *bang*) operator, which is the essence of an asynchronous workflow because it notifies the F# compiler to interpret the function in a special way. A `let!` binding registers the rest of the asynchronous workflow as a callback to be evaluated when the operation completes, and it extracts the underlying result from `Async<'T>`. In the expression

```
let**!** bytesRead = blobStream.**AsyncRead**(buffer, 0, buffer.Length)
```

the return type of `blobStream.AsyncRead` is `Async<int>`, indicating the number of bytes read by the asynchronous operation, which is extracted into the value `bytesRead`. The recursive `copyStream` function asynchronously copies the `blobStream` into the `fileStream`. Note that the `copyStream` function is defined inside another async workflow to capture (close over) the stream values that are being copied. This code could be rewritten in an imperative style with identical behavior as follows:

```
let! bytesRead = blobStream.**AsyncRead**(buffer, 0, buffer.Length)
let mutable bytesRead = bytesRead
while bytesRead > 0 do
    do! fileStream.AsyncWrite(buffer, 0, bytesRead)
    let! bytesReadTemp = blobStream.AsyncRead(buffer, 0, buffer.Length)
    bytesRead <- bytesReadTemp
fileStream.Close(); blobStream.Close()
```

The mutation of the variable `bytesRead` is encapsulated and isolated inside the main function `downloadMediaAsync` and is thread safe. Besides `let!`, the other asynchronous workflow constructors are as follows:

* `use!`—Works like `let!` for disposable resources, which are cleaned up when they go out of scope
* `do!`—Binds an asynchronous workflow when the type is `Async<unit>`
* `return`—Returns a result from the expression
* `return!`—Executes the bound asynchronous workflow, returning the value of the expression

The F# asynchronous workflow is based on the polymorphic data type `Async<'a>`, which denotes an arbitrary asynchronous computation that will materialize in the future, returning a value of type `'a`. This concept is similar to the C# TAP model. The main difference is that the F# `Async<'a>` type isn't hot, which means that it requires an explicit command to start the operation. When the asynchronous workflow reaches the start primitive, a callback is scheduled in the system, and the execution thread is released. Then, when the asynchronous operation completes its evaluation, the underlying mechanisms notify the workflow, passing the result to the next step in the code flow. The real magic is that the asynchronous workflow will complete at a later time, but you don't have to worry about waiting for the result, because it will be passed as an argument to the continuation function when completed. The compiler takes care of all of this, organically converting the `Bind` member calls into the continuation constructs. This mechanism uses CPS for writing, implicitly, a structured callback-based program inside its body expression, which allows a linear style of coding over a sequence of operations. The asynchronous execution model is all about continuations, where the evaluation of the asynchronous expression preserves the capability of having a function registered as a callback (figure 9.4).

Figure 9.4 A comparison of the `Bind` function with the computation expression version.

The benefits of using an asynchronous workflow are as follows:

* Code that looks sequential but behaves asynchronously
* Simple code that's easy to reason about (because it looks like sequential code), which simplifies updates and modifications
* Asynchronous compositional semantics
* Built-in cancellation support
* Simple error handling
* Easy parallelization

## 9.3 Asynchronous computation expressions

*Computation expressions* are an F# feature that defines a polymorphic construct used to customize the specification and behavior of code, and they lead you toward a compositional programming style. The MSDN online documentation provides an excellent definition:

> Computation expressions in F# provide a convenient syntax for writing computations that can be sequenced and combined using control flow constructs and bindings. They can be used to provide a convenient syntax for monads, a functional programming feature that can be used to manage data, control, and side effects in functional programs.^(1)

Computation expressions are a helpful mechanism for writing computations that execute a controlled series of expressions as an evaluation of chained steps.
The first step serves as input to the second step, and that output serves as input for the third step, and so forth through the execution chain, unless an exception occurs, in which case the evaluation terminates prematurely, skipping the remaining steps. Think of a computation expression as an extension of the programming language, because it lets you customize a specialized computation to reduce redundant code and apply heavy lifting behind the scenes to reduce complexity. You can use a computation expression to inject extra code during each step of the computation to perform operations such as automatic logging, validation, control of state, and so on. The F# asynchronous programming model, the asynchronous workflow, relies on computation expressions, which are also used to define other implementations, such as sequence and query expressions. The F# asynchronous workflow pattern is syntactic sugar, interpreted by the compiler into a computation expression. In an asynchronous workflow, the compiler must be instructed to interpret the workflow expression as an asynchronous computation. The notification is passed semantically by wrapping the expression in an asynchronous block, which is written using curly braces and the `async` identifier right at the beginning of the block, like so:

`async { expression }`

When the F# compiler interprets a computation as an asynchronous workflow, it divides the whole expression into separate parts between the asynchronous calls. This transformation, referred to as *desugaring*, is based on the constituent primitives provided by the computation builder in context (in this case, the asynchronous workflow). F# supports computation expressions through a special builder type, associated with the conventional monadic syntax. As you remember, the two primary monadic operators used to define a computation builder are `Bind` and `Return`. In the case of an asynchronous workflow, the generic monadic type is replaced and defined with the specialized type `Async`:

```
async.**Bind**: Async<'T> → ('T → Async<'R>) → Async<'R>   ①
async.**Return**: 'T → Async<'T>                            ②
```

The asynchronous workflow hides nonstandard operations in the form of computation builder primitives and reconstructs the rest of the computation as a continuation. Nonstandard operations are bound in the body expression of the builder constructs with the `!` operator. It's not a coincidence that the computation expression definition, through the `Bind` and `Return` operators, is identical to the monadic definition, which shares the same monadic operators. You can think of a computation expression as a continuation monad pattern.

### 9.3.1 Difference between computation expressions and monads

You can also think of a computation expression as a general monadic syntax for F#, closely related to monads. The main difference between computation expressions and monads is found in their origin. Monads strictly represent mathematical abstractions, whereas the F# computation expression is a language feature that provides syntax for programming with computations that may or may not have a monadic structure. F# doesn't support type classes, so it isn't possible to write a computation expression that's polymorphic over the type of computation. In F# you select the computation expression with the most specialized behavior and convenient syntax (an example is coming).
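To make the `Bind`/`Return` mechanics concrete before returning to the asynchronous workflow, here's a minimal, hypothetical sketch (not from the book) of a builder over the `Option` type; the compiler needs only `Bind` and `Return` to accept the `maybe { ... }` syntax:

```
// A minimal computation builder over 'a option: Bind short-circuits
// on None, Return wraps a value in Some.
type MaybeBuilder() =
    member x.Bind(m : 'a option, f : 'a -> 'b option) : 'b option =
        match m with
        | Some value -> f value
        | None -> None
    member x.Return(value : 'a) : 'a option = Some value

let maybe = MaybeBuilder()

// let! unwraps each option; the whole block yields None if any step fails.
let tryDivide (x : int) (y : int) = if y = 0 then None else Some (x / y)

let result = maybe {
    let! a = tryDivide 100 5    // Some 20
    let! b = tryDivide a 2      // Some 10
    return a + b }              // Some 30
```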
The code written using the computation expression pattern is ultimately translated into an expression that uses the underlying primitives implemented by the computation builder in context. This concept will be clearer with an example. Listing 9.2 shows the desugared version of the function `downloadMediaAsync` from Listing 9.1, where the compiler translates the computation expression into a chain of method calls. This unwrapped code shows how the behavior of each single asynchronous part is encapsulated in the related primitive member of the computation builder. The keyword `async` tells the F# compiler to instantiate the `AsyncBuilder`, which implements the essential asynchronous workflow members `Bind`, `Return`, `Using`, `Combine`, and so on. (The code to note is in bold.)

Listing 9.2 Desugared `DownloadMediaAsync` computation expression

```
let downloadMediaAsync(blobName:string) (fileNameDestination:string) =
    **async.Delay**(fun() ->                                             ①
        **async.Bind**(getCloudBlobContainerAsync(), fun container ->    ②
            let blockBlob = container.GetBlockBlobReference(blobName)
            **async.Using**(blockBlob.OpenReadAsync(),
                        fun (blobStream:Stream) ->                    ③
                let sizeBlob = int blockBlob.Properties.Length
                **async.Bind**(blobStream.AsyncRead(sizeBlob), fun bytes ->
                    use fileStream = new FileStream(fileNameDestination,
                    ➥ FileMode.Create, FileAccess.Write, FileShare.None,
                    ➥ bufferSize, FileOptions.Asynchronous)
                    **async.Bind**(fileStream.AsyncWrite(bytes, 0, bytes.Length),
                        fun () ->
                            fileStream.Close()
                            blobStream.Close()
                            async.Return())))))                       ④
```

In the code, the compiler transforms the `let!` binding construct into a call to the `Bind` operation, which unwraps the value from the computation type and executes the rest of the computation, converted into a continuation. The `Using` operation handles a computation whose resulting value represents a resource that can be disposed. The first member in the chain, `Delay`, wraps the expression as a whole to manage the execution, which can run later on demand. Each step of the computation follows the same pattern: the computation builder member, like `Bind` or `Using`, starts the operation and provides the continuation that runs when the operation completes, so you don't wait for the result.

### 9.3.2 AsyncRetry: building your own computation expression

As mentioned, a computation expression is a pattern-based interpretation (like LINQ/PLINQ), which means that the compiler can infer from the implementation of the members `Bind` and `Return` that the type construct is a monadic expression. By following a few simple specifications, you can build your own computation expression, or even extend an existing one, to give an expression the special connotation and behavior you want. Computation expressions can contain numerous standard language constructs, as listed in table 9.1; but the majority of these member definitions are optional and can be used according to your implementation needs. The mandatory, basic members that represent a valid computation expression for the compiler are `Bind` and `Return`.

Table 9.1. Computation expression operators

| **Member** | **Description** |
| --- | --- |
| `Bind` `: M<'a> * ('a ➔ M<'b>) ➔ M<'b>` | Transformed `let!` and `do!` within computation expressions. |
| `Return` `: 'a ➔ M<'a>` | Transformed `return` within computation expressions. |
| `Delay` `: (unit ➔ M<'a>) ➔ M<'a>` | Used to ensure side effects within a computation expression are performed when expected. |
| `Yield` `: 'a ➔ M<'a>` | Transformed `yield` within computation expressions. |
| `For` `: seq<'a> * ('a ➔ M<'b>) ➔ M<'b>` | Transformed `for ... do ...` within computation expressions. `M<'b>` can optionally be `M<unit>`. |
| `While` `: (unit ➔ bool) * M<'a> ➔ M<'a>` | Transformed `while ... do ...` within computation expressions. `M<'a>` can optionally be `M<unit>`. |
| `Using` `: 'a * ('a ➔ M<'b>) ➔ M<'b> when 'a :> IDisposable` | Transformed `use` bindings within computation expressions. |
| `Combine` `: M<'a> ➔ M<'a> ➔ M<'a>` | Transformed sequencing within computation expressions. The first `M<'a>` can optionally be `M<unit>`. |
| `Zero` `: unit ➔ M<'a>` | Transformed empty `else` branches of `if`/`then` within computation expressions. |
| `TryWith` `: M<'a> ➔ M<'a> ➔ M<'a>` | Transformed `try`/`with` bindings within computation expressions. |
| `TryFinally` `: M<'a> ➔ M<'a> ➔ M<'a>` | Transformed `try`/`finally` bindings within computation expressions. |

Let's build a computation expression that can be used with the example in Listing 9.2. The first step of the function, `downloadMediaCompAsync`, connects asynchronously to the Azure Blob service, but what happens if the connection drops? An error is thrown and the computation stops. You could check whether the client is online before trying to connect; but it's a general rule of thumb when working with network operations to retry the connection a few times before aborting. In the following listing, you build a computation expression that retries an asynchronous operation a given number of times, with a delay in milliseconds between attempts, before giving up (the code to note is in bold).

Listing 9.3 `AsyncRetryBuilder` computation expression

```
type AsyncRetryBuilder(max, sleepMilliseconds : int) =
    let rec retry n (task:Async<'a>) (continuation:'a -> Async<'b>) = **async** {
        try
            let! result = task                                ①
            let! conResult = continuation result              ②
            return conResult
        with error ->
            if n = 0 then return raise error                  ③
            else
                do! Async.Sleep sleepMilliseconds             ④
                return! **retry (n - 1) task continuation** }

    member x.**ReturnFrom**(f) = f                                ⑤
    member x.**Return**(v) = async { return v }                   ⑥
    member x.**Delay**(f) = async { return! f() }                 ⑦
    member x.**Bind**(task:Async<'a>, continuation:'a -> Async<'b>) =
        retry max task continuation                           ⑧
    member x.Bind(t : Task, f : unit -> Async<'R>) : Async<'R> =
        async.Bind(Async.AwaitTask t, f)                      ⑨
```

`AsyncRetryBuilder` is the computation builder whose instance identifies the `retry { ... }` block used to construct the computation. The following code shows how to use the computation builder (the code to note is highlighted in bold).

Listing 9.4 Using the `AsyncRetryBuilder` computation expression

```
let retry = **AsyncRetryBuilder(3, 250)**                         ①

let downloadMediaCompAsync(blobNameSource:string)
                          (fileNameDestination:string) = **async** {
    let! container = **retry** {                                  ②
        return! getCloudBlobContainerAsync() }
    ... Rest of the code as before
```

The `AsyncRetryBuilder` instance `retry` re-attempts to run the code in case of an exception, up to three times, with a delay of 250 ms between retries. Now the `AsyncRetryBuilder` computation expression can be used in combination with the asynchronous workflow to run, and retry asynchronously in case of failure, the `downloadMediaCompAsync` operation.
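As a brief illustration of the builder's behavior (the failing operation below is invented for the example, not part of the book's Azure code), each `let!` inside a `retry { ... }` block is routed through the builder's `Bind` member, so a failing step is re-executed until it succeeds or the attempts are exhausted:

```
open System

let retry = AsyncRetryBuilder(3, 250)

// A hypothetical flaky operation that fails randomly to
// simulate a transient network error.
let random = Random()
let flakyFetch (name : string) = async {
    if random.Next(3) = 0 then failwith "transient network error"
    return sprintf "content of %s" name }

// The let! below goes through AsyncRetryBuilder.Bind, so a failing
// fetch is re-attempted up to three times with a 250 ms pause.
let fetchWithRetry name = retry {
    let! payload = flakyFetch name
    return payload.Length }
```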
It's common to create a global value identifier for a computation expression that can be reused in different parts of your program. For example, the asynchronous workflow and sequence expression can be accessed anywhere in the code without creating a new value.

### 9.3.3 Extending the asynchronous workflow

Besides creating custom computation expressions, the F# compiler lets you extend existing ones. The asynchronous workflow is a perfect example of a computation expression that can be enhanced. In Listing 9.4, the connection to the Azure Blob container is established through the asynchronous operation `getCloudBlobContainerAsync`, the implementation of which is shown here:

```
let getCloudBlobContainerAsync() : Async<CloudBlobContainer> = async {
    let storageAccount = CloudStorageAccount.Parse(azureConnection)
    let blobClient = storageAccount.CreateCloudBlobClient()
    let container = blobClient.GetContainerReference("media")
    let! _ = container.**CreateIfNotExistsAsync**()
    return container }
```

Inside the body of the `getCloudBlobContainerAsync` function, the `CreateIfNotExistsAsync` operation returns a `Task` type, which isn't friendly to use in the context of an asynchronous workflow. Fortunately, F# async provides the `Async.AwaitTask`^(2) operator, which allows a `Task` operation to be awaited and treated as an F# async computation. A vast number of asynchronous operations in .NET have return types of `Task` or the generic version `Task<'T>`. These operations, designed to work primarily with C#, aren't compatible out of the box with F# asynchronous computation expressions.

What's the solution? Extend the computation expression. Listing 9.5 generalizes the F# asynchronous workflow model so that it can be used not only with async operations, but also with the `Task` and `Observable` types. The async computation expression needs type constructs that can consume observables and tasks, as opposed to only asynchronous workflows. It's possible to await all kinds of events produced by `Event` or `IObservable` streams and tasks from `Task` operations. These extensions for the computation expression, as you can see, abstract away the use of the `Async.AwaitTask` operator (the related commands are in bold).

Listing 9.5 Extending the asynchronous workflow to support `Task<'a>`

```
type Microsoft.FSharp.Control.**AsyncBuilder** with
    member x.Bind(t:**Task**<'T>, f:'T -> Async<'R>) : Async<'R> =
    ➥ async.Bind(Async.AwaitTask t, f)                        ①
    member x.Bind(t:**Task**, f:unit -> Async<'R>) : Async<'R> =
    ➥ async.Bind(Async.AwaitTask t, f)                        ①
    member x.Bind (m:'a **IObservable**, f:'a -> 'b Async) =
    ➥ async.Bind(Async.AwaitObservable m, f)                  ①
    member x.ReturnFrom(computation:**Task**<'T>) =
    ➥ x.ReturnFrom(Async.AwaitTask computation)
```

The `AsyncBuilder` extension lets you inject functions to handle other wrapper types, such as `Task` and `Observable`, whereas the `Bind` function in the extension lets you fetch the inner value contained in the `Observable` (or `IEvent`) using the `let!` and `do!` operators. This technique removes the need for adjunctive functions like `Async.AwaitEvent` and `Async.AwaitTask`. In the first line of code, the compiler is notified to target the `AsyncBuilder`, which manages the asynchronous computation expression transformation. After this extension, the compiler can determine which `Bind` operation to use, according to the expression signature registered through the `let!` binding.
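As a small sketch of the effect (the Task-returning call is illustrative), with the extension from listing 9.5 in scope, a `Task<'T>` can be bound directly with `let!`, with no explicit `Async.AwaitTask` at the call site:

```
open System.Net.Http

// Assumes the AsyncBuilder extension from listing 9.5 is in scope.
let httpClient = new HttpClient()

let fetchPageAsync (url : string) = async {
    // GetStringAsync returns Task<string>; the extended Bind overload
    // converts it with Async.AwaitTask behind the scenes.
    let! page = httpClient.GetStringAsync(url)
    return page.Length }
```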
Now you can use asynchronous operations of type `Task` and `Observable` in an asynchronous workflow.

### 9.3.4 Mapping asynchronous operation: the Async.map functor

Let's continue extending the capabilities of the F# asynchronous workflow. The F# asynchronous workflow provides a rich set of operators; but currently, there's no built-in support for an `Async.map` function (also known as a *functor*) having type signature

```
('a ➔ 'b) ➔ Async<'a> ➔ Async<'b>
```

A functor is a pattern of mapping over structure, which is achieved by providing implementation support for a two-parameter function called `map` (better known as `fmap`). For example, the `Select` operator in LINQ/PLINQ is a functor for the `IEnumerable` elevated type. Mainly, functors are used in C# to implement LINQ-style fluent APIs that are used also for types (or contexts) other than collections. We discussed the functor type in chapter 7, where you learned how to implement a functor (in bold) for the `Task` elevated type:

```
Task<R> **fmap**<T, R>(this Task<T> input, Func<T, R> map) =>
    input.ContinueWith(t => map(t.Result));
```

This function has the signature `('T ➔ 'R) ➔ Task<T> ➔ Task<R>`: it takes a map function `'T ➔ 'R` as its first input (which means it goes from a value type `T` to a value type `R`; in C# code, `Func<T, R>`), takes the elevated type `Task<'T>` as its second input, and returns a `Task<'R>`. Applying this pattern to the F# asynchronous workflow, the signature of the `Async.map` function is

```
('a -> 'b) -> Async<'a> -> Async<'b>
```

The first argument is a function `'a -> 'b`, the second is an `Async<'a>`, and the output is an `Async<'b>`. Here's the implementation of `Async.map`:

```
module Async =
    let inline map (func:'a -> 'b) (operation:Async<'a>) = async {
        let! result = operation
        return func result }
```

`let! result = operation` runs the asynchronous operation and unwraps the `Async<'a>` type, returning the `'a` value. Then you can pass the value `'a` to the function `func:'a -> 'b`, which converts `'a` to `'b`. Ultimately, once the value `'b` is computed, the `return` operator wraps the result `'b` back into the `Async<>` type. The `map` function applies an operation to the object inside the `Async` container,^(3) returning a container of the same shape. The `Async.map` function is interpreted as a two-argument function where a value is wrapped in the F# `Async` context and a function is applied to it. The F# `Async` type appears in both its input and its output. The main purpose of the `Async.map` function is to operate on (project) the result of an `Async` computation without leaving the context. Back to the Azure Blob storage example: you can use the `Async.map` function to download and transform an image as follows (the code to note is in bold):

```
let downloadBitmapAsync(blobNameSource:string) = async {
    let! token = Async.CancellationToken
    let! container = getCloudBlobContainerAsync()
    let blockBlob = container.GetBlockBlobReference(blobNameSource)
    use! (blobStream : Stream) = blockBlob.OpenReadAsync()
    return Bitmap.FromStream(blobStream) }

let transformImage (blobNameSource:string) =
    downloadBitmapAsync(blobNameSource)
    |> **Async.map** ImageHelpers.setGrayscale
    |> **Async.map** ImageHelpers.createThumbnail
```

The `Async.map` function composes the async operation of downloading the image `blobNameSource` from Azure Blob storage with the transformation functions `setGrayscale` and `createThumbnail`.
In this snippet, the advantages of using the `Async.map` function are composability and continued encapsulation.

### 9.3.5 Parallelize asynchronous workflows: Async.Parallel

Let's return to the example of downloading 100 images from Azure Blob storage using the F# asynchronous workflow. In section 9.2 you built the function `downloadMediaAsync`, which downloads one cloud blob image using the asynchronous workflow. It's time to connect the dots and run the code. But instead of iterating through the list of images one operation at a time, the F# asynchronous workflow provides an elegant alternative: `Async.Parallel`. The idea is to compose all the asynchronous computations and execute them all at once. Parallel composition of asynchronous computations is efficient because of the scalability properties of the .NET thread pool and the controlled, overlapped execution of operations such as web requests by modern operating systems. Using the F# `Async.Parallel` function, it's possible to download hundreds of images in parallel (the code to note is in bold).

Listing 9.6 `Async.Parallel` downloading all images in parallel

```
let retry = **AsyncRetryBuilder**(3, 250)                           ①

let downloadMediaCompAsync (container:CloudBlobContainer)
                           (blobMedia:IListBlobItem) = **retry** {   ②
    let blobName = blobMedia.Uri.Segments.[blobMedia.Uri.Segments.Length - 1]
    let blockBlob = container.GetBlockBlobReference(blobName)
    let! (blobStream : Stream) = blockBlob.OpenReadAsync()
    return Bitmap.FromStream(blobStream) }                       ③

let transformAndSaveImage (container:CloudBlobContainer)
                          (blobMedia:IListBlobItem) =
    downloadMediaCompAsync container blobMedia
    |> Async.**map** ImageHelpers.setGrayscale                       ④
    |> Async.**map** ImageHelpers.createThumbnail                    ④
    |> Async.**tap** (fun image ->                                   ⑤
        let mediaName =
            blobMedia.Uri.Segments.[blobMedia.Uri.Segments.Length - 1]
        image.Save(mediaName))

let downloadMediaCompAsyncParallel() = **retry** {                   ②
    let! container = getCloudBlobContainerAsync()                ⑥
    let computations =
        container.ListBlobs()                                    ⑦
        |> Seq.map(transformAndSaveImage container)              ⑧
    return! Async.**Parallel** computations }                        ⑨

let cancelOperation() =
    downloadMediaCompAsyncParallel() |> Async.**StartCancelable**    ⑩
```

The `Async.Parallel` function takes an arbitrary collection of asynchronous operations and returns a single asynchronous workflow that starts all the computations in parallel and waits for all of them to complete. It coordinates the work with the thread pool scheduler to maximize resource use, following a Fork/Join pattern, which results in a performance boost. When all the operations complete, the function returns the results aggregated in a single array, which you can iterate over to retrieve the results for further processing. Notice the minimal code change and syntax required to convert a computation that executes one operation at a time into one that runs in parallel. Additionally, this conversion is achieved without the need to coordinate synchronization and memory locks. The `Async.tap` operator applies a function asynchronously to a value passed as input, ignores the result, and then returns the original value.
The `Tap` operator was introduced in Listing 8.3. Here is its implementation using the F# `Async` workflow (in bold):

```
let inline tap (fn:'a -> 'b) (x:Async<'a>) =
    **(Async.map fn x) |> Async.Ignore |> Async.Start; x**
```

You can find this and other useful `Async` functions in the book's source code, in the FunctionalConcurrencyLib library. The execution time to download the images in parallel using the F# asynchronous workflow in combination with `Async.Parallel` is 10.958 seconds. The result is ~5 seconds faster than APM, which makes it ~8× faster than the original synchronous implementation. The major gains here include code structure, readability, maintainability, and compositionality. Using an asynchronous workflow, you gained a simple asynchronous semantic to run a non-blocking computation, which provides code that's clear to understand, maintain, and update. Moreover, thanks to the `Async.Parallel` function, multiple asynchronous computations can easily be spawned in parallel with minimal code changes to dramatically improve performance.

Ultimately, the implementation of the `Async.StartCancelable` type extension starts an asynchronous workflow, without blocking the calling thread, using a new `CancellationToken`, and returns an `IDisposable` that cancels the workflow when disposed. You haven't used `Async.Start` because it doesn't provide continuation-passing semantics, which are useful in many cases for applying an operation to the result of the computation. In the example, you print a message when the computation completes; but the result type is accessible for further processing. Here's the implementation of the more sophisticated `Async.StartCancelable` operator compared to `Async.Start` (in bold):

```
type Microsoft.FSharp.Control.Async with
    static member **StartCancelable**(op:Async<'a>) (tap:'a -> unit) (?onCancel) =
        let ct = new System.Threading.CancellationTokenSource()
        let onCancel = defaultArg onCancel ignore
        Async.**StartWithContinuations**(op, tap, ignore, onCancel, ct.Token)
        { new IDisposable with
            member x.Dispose() = ct.Cancel() }
```

The underlying implementation of the `Async.StartCancelable` function uses the `Async.StartWithContinuations` operator, which provides built-in support for cancellation behavior. When the asynchronous operation `op:Async<'a>`, passed as the first argument, completes, the result is passed as a continuation into the second-argument function `tap:'a -> unit`. The optional parameter `onCancel` represents the function that's triggered when the main operation `op:Async<'a>` is canceled. The result of `Async.StartCancelable` is an anonymous object, created dynamically from the `IDisposable` interface, which cancels the operation if its `Dispose` method is called. The F# `Async` operators used here (`Async.StartWithContinuations`, `Async.Ignore`, and `Async.Start`) may require a bit more explanation.

#### Async.StartWithContinuations

`Async.StartWithContinuations` executes an asynchronous workflow, starting immediately on the current OS thread, and on completion passes the result, exception, or cancellation (`OperationCanceledException`) to the corresponding one of the specified functions. If the thread that initiates the execution has its own `SynchronizationContext` associated with it, then the final continuations are posted through this `SynchronizationContext`. This function is a good candidate for updating GUIs.
It accepts as arguments three functions to invoke when the asynchronous computation completes successfully, raises an exception, or is canceled. Its signature is `Async<'T> -> ('T -> unit) * (exn -> unit) * (OperationCanceledException -> unit) -> unit`. `Async.StartWithContinuations` doesn't support a return value, because the result of the computation is handled internally by the function targeting the successful output.

Listing 9.7 `Async.StartWithContinuations`

```
let computation() = async {
    use client = new WebClient()
    let! manningSite =
        client.AsyncDownloadString(Uri("http://www.manning.com"))
    return manningSite }                                        ①

Async.StartWithContinuations(computation(),                     ②
    (fun site -> printfn "Size %d" site.Length),                ③
    (fun exn -> printfn "exception - %s" <| exn.ToString()),    ④
    (fun exn -> printfn "cancelled - %s" <| exn.ToString()))    ⑤
```

#### Async.Ignore

The `Async.Ignore` operator takes a computation and returns a workflow that executes the source computation, ignores its result, and returns `unit`. Its signature is `Async.Ignore: Async<'T> -> Async<unit>`. These are two possible approaches that use `Async.Ignore`:

```
Async.Start(Async.Ignore (computationWithResult()))

let asyncIgnore = Async.Ignore >> Async.Start
```

The second option creates a function `asyncIgnore`, using function composition to combine the `Async.Ignore` and `Async.Start` operators. The next listing shows the complete example, where the result of the asynchronous operation is ignored using the `asyncIgnore` function (in bold).

Listing 9.8 `Async.Ignore`

```
let computation() = async {
    use client = new WebClient()
    let! manningSite =
        client.AsyncDownloadString(Uri("http://www.manning.com"))
    printfn "Size %d" manningSite.Length
    **return manningSite** }                                        ①

**asyncIgnore (computation())**                                    ②
```

If you need to evaluate the result of an asynchronous operation without blocking, in a pure CPS style, the operator `Async.StartWithContinuations` offers a better approach.

#### Async.Start

The `Async.Start` function in Listing 9.9 doesn't support a return value; its asynchronous computation is of type `Async<unit>`. The operator `Async.Start` executes computations asynchronously, so the computation process should itself define a way to communicate and return the final result. This function queues an asynchronous workflow for execution in the thread pool and returns control immediately to the caller without waiting for completion. Because of this, the operation can complete on another thread. Its signature is `Async.Start: Async<unit> -> unit`. As an optional argument, this function takes a `cancellationToken`.

Listing 9.9 `Async.Start`

```
let computationUnit() = async {                                 ①
    do! Async.Sleep 1000
    use client = new WebClient()
    let! manningSite =
        client.AsyncDownloadString(Uri("http://www.manning.com"))
    printfn "Size %d" manningSite.Length }                      ②

Async.Start(computationUnit())                                  ③
```

Because `Async.Start` doesn't support a return value, the size of the website is printed inside the expression, where the value is accessible. What if the computation does return a value, and you cannot modify the asynchronous workflow? It's possible to discard the result of an asynchronous computation by applying the `Async.Ignore` function before starting the operation.

### 9.3.6 Asynchronous workflow cancellation support

When executing an asynchronous operation, it's useful to be able to terminate execution prematurely, on demand, before it completes.
This works well for long-running, non-blocking operations, where making them cancelable is the appropriate practice to avoid tasks that can hang. For example, you may want to cancel the operation of downloading 100 images from Azure Blob storage if the download exceeds a certain period of time. The F# asynchronous workflow supports cancellation natively as an automatic mechanism, and when a workflow is canceled, it also cancels all the child computations. Most of the time you'll want to coordinate cancellation tokens and maintain control over them. In these cases, you can supply your own tokens; but in many other cases, you can achieve similar results with less code by using the built-in default token of the F# asynchronous module. When the asynchronous operation begins, the underlying system passes the provided `CancellationToken` (or assigns one if none is provided) to the workflow, and it keeps track of whether a cancellation request is received. The computation builder, `AsyncBuilder`, checks the status of the cancellation token at each binding construct (`let!`, `do!`, `return!`, `use!`). If the token is marked as canceled, the workflow terminates. This is a sophisticated mechanism that eases your work: you don't need to do anything complex to support cancellation. Moreover, the F# asynchronous workflow supports implicit generation and propagation of cancellation tokens through its execution, and any nested asynchronous operations are included automatically in the cancellation hierarchy during asynchronous computations.

F# supports cancellation in different forms. The first is through the function `Async.StartWithContinuations`, which observes the default token and cancels the workflow when the token is set as canceled. When the cancellation token triggers, the function that handles cancellation is called in place of the success function. The other options include passing a cancellation token manually or relying on the default `Async.DefaultCancellationToken` (in bold in Listing 9.10). Listing 9.10 shows how to introduce support for cancellation into the previous `Async.Parallel` image download (Listing 9.6). In this example, the cancellation token is passed manually; in the automatic version using `Async.DefaultCancellationToken`, there's no code change other than the function used to cancel the pending asynchronous operation.

Listing 9.10 Canceling an asynchronous computation

```
let tokenSource = new **CancellationTokenSource**()                 ①
let container = getCloudBlobContainer()

let parallelComp() =
    container.ListBlobs()
    |> Seq.map(fun blob -> downloadMediaCompAsync container blob)
    |> Async.Parallel

Async.Start(parallelComp() |> Async.Ignore, **tokenSource.Token**)  ①

**tokenSource.Cancel()**                                            ②
```

You created an instance of a `CancellationTokenSource` that passes a cancellation token to the asynchronous computation, starting the operation with the `Async.Start` function and passing the `CancellationToken` as the second argument. Then you cancel the operation, which terminates all nested operations. In Listing 9.11, `Async.TryCancelled` appends a function to an asynchronous workflow; it's this function that is invoked when the cancellation token is marked. This is an alternative way to inject extra code to run in case of cancellation. The following listing shows how to use the `Async.TryCancelled` function, which also has the advantage of returning a value, providing compositionality.
(The code to note is in bold.)

Listing 9.11 Canceling an asynchronous computation with notification

```
let onCancelled = fun (cnl:**OperationCanceledException**) ->        ①
    printfn "Operation cancelled!"

let tokenSource = new **CancellationTokenSource**()
let tryCancel = Async.**TryCancelled**(parallelComp(), onCancelled)  ②
Async.Start(tryCancel, **tokenSource**.**Token**)
```

`TryCancelled` returns an asynchronous workflow that can be combined with other computations. Its execution begins on demand with an explicit request, using a starting function such as `Async.Start` or `Async.RunSynchronously`.

#### Async.RunSynchronously

The `Async.RunSynchronously` function blocks the current thread during the workflow execution and continues with the current thread when the workflow completes. This approach is ideal for use in an F# interactive session for testing and in console applications, because it waits for the asynchronous computation to complete. It's not the recommended way to run an asynchronous computation in a GUI program, however, because it will block the UI. Its signature is `Async<'T> -> 'T`. As optional arguments, this function takes a timeout value and a `cancellationToken`. The following listing shows the simplest way to execute an asynchronous workflow (in bold).

Listing 9.12 `Async.RunSynchronously`

```
let computation() = async {                                       ①
    do! Async.Sleep 1000                                          ②
    use client = new WebClient()
    return! client.AsyncDownloadString(Uri("www.manning.com")) }  ③

let manningSite = Async.**RunSynchronously**(computation())          ④
printfn "Size %d" manningSite.Length                              ⑤
```

### 9.3.7 Taming parallel asynchronous operations

The `Async.Parallel` programming model is a great feature for enabling I/O parallelism based on the Fork/Join pattern. Fork/Join lets you execute a series of computations such that execution branches off in parallel at designated points in the code and merges at a subsequent point, resuming the execution. But because `Async.Parallel` relies on the thread pool, the maximum degree of parallelism isn't guaranteed, and consequently neither is a performance increase. Cases also exist where starting a large number of asynchronous workflows can negatively impact performance. Specifically, an asynchronous workflow is executed in a semi-preemptive manner: after many operations (more than 10,000 on a 4 GB RAM computer) begin execution, asynchronous workflows are enqueued, and even if they aren't blocking or waiting for a long-running operation, another workflow is dequeued for execution. This is an edge case that can damage parallel performance, because the memory consumption of the program is proportional to the number of ready-to-run workflows, which can be much larger than the number of CPU cores. Another case to pay attention to is when asynchronous operations that can run in parallel are constrained by external factors. For example, when running a console application that performs web requests, the default maximum number of concurrent HTTP connections allowed by a `ServicePoint`^(4) object is two. In the particular example of Azure Blob storage, you use `Async.Parallel` to execute multiple long-running operations in parallel; but ultimately, without changing the base configuration, there will be only two parallel web requests. To maximize the performance of your code, it's recommended that you tame the parallelism of the program by throttling the number of concurrent computations.
The following code listing shows the implementation of two functions, `parallelWithThrottle` and `parallelWithCatchThrottle`, which can be used to limit the number of concurrently running asynchronous operations.

Listing 9.13 `ParallelWithThrottle` and `ParallelWithCatchThrottle`

```
type Result<'a> = Result<'a, exn>                               ①

module Result =
    let ofChoice value =                                        ②
        match value with
        | Choice1Of2 value -> Ok value
        | Choice2Of2 e -> Error e

module Async =
    let parallelWithCatchThrottle (selector:Result<'a> -> 'b)   ③
                                  (throttle:int)                ④
                                  (computations:seq<Async<'a>>) = async {  ⑤
        use semaphore = new SemaphoreSlim(throttle)             ⑥
        let throttleAsync (operation:Async<'a>) = async {       ⑦
            try
                do! semaphore.WaitAsync()
                let! result = Async.Catch operation             ⑧
                return selector (result |> Result.ofChoice)     ⑨
            finally
                semaphore.Release() |> ignore }                 ⑩
        return! computations
                |> Seq.map throttleAsync
                |> Async.Parallel }

    let parallelWithThrottle throttle computations =
        parallelWithCatchThrottle id throttle computations
```

The function `parallelWithCatchThrottle` creates an asynchronous computation that executes all the given asynchronous operations, initially queuing each as a work item, using a Fork/Join pattern. The parallelism is throttled so that at most `throttle` computations run at one time. In Listing 9.13, the function `Async.Catch` is exploited to protect a parallel asynchronous computation from failure. The function `parallelWithCatchThrottle` doesn't throw exceptions; instead, it returns an array of F# `Result` types. The second function, `parallelWithThrottle`, is a variant of the former that passes `id` in place of the `selector` argument. The `id` function in F# is called the *identity function*, a shortcut for an operation that returns its input unchanged: `(fun x -> x)`. In the example, `id` is used to bypass the `selector` and return the result of the operation without applying any transformation.

The release of F# 4.1 introduced the `Result<'TSuccess, 'TError>` type, a convenient DU that supports consuming code that could generate an error without having to implement exception handling. The `Result` DU is typically used to represent and preserve an error that can occur during execution. The first line of code in the previous listing defines a `Result<'a>` type alias over `Result<'a, exn>`, which assumes that the second case is always an exception (`exn`). This `Result<'a>` type alias aims to simplify the pattern matching over the `result`:

```
let! **result** = Async.Catch operation
```

You can handle exceptions in F# asynchronous operations in different ways. The most idiomatic is to use `Async.Catch` as a wrapper that safeguards a computation by intercepting any exception raised within the source computation. `Async.Catch` takes a more functional approach because, instead of taking a function as an argument to handle an error, it returns a discriminated union of `Choice<'a, exn>`, where `'a` is the result type of the asynchronous workflow and `exn` is the exception thrown. The underlying values of the result `Choice<'a, exn>` can be extracted with pattern matching. I cover error handling in functional programming in chapter 10. `Choice<'T, exn>` is a DU^(5) with two union cases:

* `Choice1Of2 of 'T` contains the result for successful workflow completion.
* `Choice2Of2 of exn` represents the workflow failure and contains the thrown exception.

Handling exceptions with this functional design lets you construct the asynchronous code in a compositional and natural pipeline structure.
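As a brief sketch (the wrapping function is a stand-in, not the book's code), this is how a result protected by `Async.Catch` is typically unpacked with pattern matching:

```
// Hypothetical protected operation: any exception raised inside
// downloadMediaAsync is captured as Choice2Of2 instead of propagating.
let safeDownload (blob : string) = async {
    let! outcome = Async.Catch (downloadMediaAsync blob "local.jpg")
    match outcome with
    | Choice1Of2 _     -> printfn "downloaded %s" blob
    | Choice2Of2 error -> printfn "failed %s: %s" blob error.Message }
```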
`Choice<'T, 'U>` is a DU built into the F# core, which is helpful; but in this case, you can create a better representation of the asynchronous computation result by replacing the `Choice` DU with the more meaningful DU `Result<'a>`.^(6) (The code to note is in bold.)

Listing 9.14 `ParallelWithThrottle` with Azure Blob storage downloads

```
let maxConcurrentOperations = 100                                      ①
ServicePointManager.DefaultConnectionLimit <- maxConcurrentOperations  ②

let downloadMediaCompAsyncParallelThrottle() = async {
    let! container = getCloudBlobContainerAsync()
    let computations =
        container.ListBlobs()                                          ③
        |> Seq.map(fun blobMedia -> transformAndSaveImage container blobMedia)
    return! Async.**parallelWithThrottle**                                 ④
                maxConcurrentOperations computations }
```

The code sets the limit of concurrent requests, `maxConcurrentOperations`, to 100 using `ServicePointManager.DefaultConnectionLimit`. The same value is passed as an argument to `parallelWithThrottle` to throttle the concurrent requests. `maxConcurrentOperations` is an arbitrary number that can be large, but I recommend that you test and measure the execution time and memory consumption of your program to detect which value has the best performance impact.

## Summary

* With asynchronous programming, you can download multiple images in parallel, removing hardware dependencies and exploiting computational power that would otherwise sit idle.
* The FP language F# provides full support for asynchronous programming, integrating with the asynchronous programming model provided by .NET. It also offers an idiomatic functional implementation of the APM, called the asynchronous workflow, which can interoperate with the task-based programming model in C#.
* The F# asynchronous workflow is based on the `Async<'a>` type, which defines a computation that will complete sometime in the future. This provides great compositionality properties because it doesn't start immediately; an asynchronous computation requires an explicit request to start.
* The time to perform multiple synchronous I/O operations sequentially is equal to the sum of the time elapsed for each individual operation, whereas the asynchronous approach, which runs the operations in parallel, has an overall response time equal to the slowest operation.
* Using continuation-passing style, which embraces the functional paradigm, makes multithreaded code remarkably concise and easy to write.
* The F# computation expression, specifically in the form of the asynchronous workflow, performs and chains a series of computations asynchronously without blocking the execution of other work.
* Computation expressions can be extended to operate with different elevated types without the need to leave the current context, or you can create your own to extend the compiler's capabilities.
* It's possible to build tailored asynchronous combinators to handle special cases.

# 10 Functional combinators for fluent concurrent programming

**This chapter covers**

* Handling exceptions in a functional style
* Using built-in `Task` combinators
* Implementing custom asynchronous combinators and conditional operators
* Running parallel asynchronous heterogeneous computations

In the previous two chapters, you learned how to apply asynchronous programming to develop scalable and performant systems. You applied functional techniques to compose, control, and optimize the execution of multiple tasks in parallel. This chapter further raises the level of abstraction for expressing asynchronous computations in a functional style.
We'll start by looking at how to manage exceptions in a functional style, with a focus on asynchronous operations. Next, we'll explore *functional combinators*, a useful programming tool for building a set of utility functions that let you create complex functions by composing smaller and more concise operators. These combinators and techniques make your code more maintainable and performant, improving your ability to write concurrent computations and handle side effects. Toward the end of this chapter, we'll go through how to interoperate between C# and F# by calling and passing asynchronous functions from one to the other.

Of all the chapters in this book, this one is the most complex, because it covers FP theory whose lexicon might appear as jargon initially. With great effort comes great reward. The concepts explained in this chapter will provide exceptional tools for building sophisticated concurrent programs simply and easily. It isn't necessary for the average programmer to know exactly how the .NET garbage collector (GC) works, because it operates in the background; but the developer who understands the operational details of the GC can maximize a program's memory use and performance. The same applies here.

Throughout this chapter, we revisit the examples from chapter 9, with slightly more complex variations. The code examples are in C# or F#, using the programming language that best resonates with the idea in context. But all the concepts apply to both programming languages, and in most cases you'll find the alternate code example in the source code.

This chapter can help you to understand the compositional semantics of functional error handling and functional combinators so you can write efficient programs for processing concurrent (and parallel) asynchronous operations safely, with minimum effort and high-yield performance. By the end of this chapter, you'll see how to use built-in asynchronous combinators and how to design and implement efficient custom combinators that meet your applications' requirements. You can raise the level of abstraction in complex and slow-running parts of the code to simplify the design, improve the control flow, and reduce the execution time.

## 10.1 The execution flow isn't always on the happy path: error handling

Many unexpected issues can arise in software development. Enterprise applications, in general, are distributed and depend on a number of external systems, which can lead to a multitude of problems. Examples of these problems are:

* Losing network connectivity during a web request
* Applications that fail to communicate with the server
* Data that becomes inadvertently `null` while processing
* Thrown exceptions

As developers, our goal is to write robust code that accounts for these issues. But addressing potential issues can itself create complexity. In real-world applications, the execution flow isn't always on the "happy path" where the default behavior is error-free (figure 10.1). To prevent exceptions and to ease the debugging process, you must deal with validation logic, value checking, logging, and convoluted code. In general, computer programmers tend to overuse and even abuse exceptions. For example, it's common for an exception to be thrown and, absent a handler in that context, the caller of this piece of code is forced to handle that exception several levels up the call stack.

Figure 10.1 The user sends an update request, which can easily stray from the happy path.
In general, you write code thinking that nothing can go wrong. But producing quality code must account for exceptions or possible issues such as validation, failure, or errors that prevent the code from running correctly. In asynchronous programming, error handling is important to guarantee the safe execution of your application. It's assumed that an asynchronous operation will complete, but what if something goes wrong and the operation never terminates?

Functional and imperative paradigms approach error handling with different styles:

* The imperative programming approach to handling errors is based on side effects. Imperative languages use `try-catch` blocks and `throw` statements to generate side effects. These side effects disrupt the normal program flow, which can be hard to reason about. When using the traditional imperative programming style, the most common approach to handling an error is to guard the method from raising an error and return a `null` value if the payload is empty. This concept of error processing is widely used, but handling errors this way isn't a good fit for imperative languages because it introduces more opportunities for bugs.
* The FP approach focuses on minimizing and controlling side effects, so error handling is generally done while avoiding mutation of state and without throwing exceptions. If an operation fails, for example, it should return a structural representation of the output that includes the notification of success or failure.

### 10.1.1 The problem of error handling in imperative programming

In the .NET Framework, it's easy to capture and react to errors in an asynchronous operation. One way is to wrap all the code that belongs to the same asynchronous computation into a `try-catch` block. To illustrate the error-handling problem and how it can be addressed in a functional style, let's revisit the example of downloading images from Azure Blob storage (covered in chapter 9). Listing 10.1 shows how to make the method `DownloadImageAsync` safe from exceptions that could be raised during its execution.

Listing 10.1 `DownloadImageAsync` with traditional imperative error handling

```
static async Task<Image> DownloadImageAsync(string blobReference)
{
    try
    {
        var container = await Helpers.GetCloudBlobContainerAsync()
                                      .ConfigureAwait(false);
        CloudBlockBlob blockBlob = container.GetBlockBlobReference(blobReference);
        using (var memStream = new MemoryStream())
        {
            await blockBlob.DownloadToStreamAsync(memStream).ConfigureAwait(false);
            return Bitmap.FromStream(memStream);
        }
    }
    catch (StorageException ex)
    {
        Log.Error("Azure Storage error", ex);
        throw;
    }
    catch (Exception ex)
    {
        Log.Error("Some general error", ex);
        throw;
    }
}

async Task RunDownloadImageAsync()
{
    try
    {
        var image = await DownloadImageAsync("Bugghina0001.jpg");
        ProcessImage(image);
    }
    catch (Exception ex)
    {
        HandlingError(ex);
        throw;
    }
}
```

It seems easy and straightforward: `DownloadImageAsync` is called by the caller `RunDownloadImageAsync`, and the image returned is processed. This code example already assumes that something could go wrong and wraps the core execution into a `try-catch` block. Banking on the happy path (the path in which everything goes right) is a luxury that a programmer cannot afford when building robust applications. As you can see, when you start accounting for potential failures, input errors, and logging routines, the method starts turning into lengthy boilerplate code.
If you remove the error-handling lines of code, there are only 9 lines of meaningful core functionality, compared with 21 lines of boilerplate orchestration dedicated to error and log handling alone. A nonlinear program flow like this can quickly become messy because it's hard to trace all the existing connections between `throw` and `catch` statements. Furthermore, with exceptions it's unclear exactly where errors are being caught: the validation routine can be wrapped in a `try-catch` statement right where it's called, or the `try-catch` block can be inserted a couple of levels higher. It becomes difficult to know whether an error is thrown intentionally.

In listing 10.1, the body of the method `DownloadImageAsync` is wrapped inside a `try-catch` block to safeguard the program in case an exception occurs. But in this case, no real error handling is applied; the exception is rethrown after a log entry with the error details is written. The purpose of the `try-catch` block is to protect against an exception by surrounding a piece of code that could be unsafe; if an exception is thrown anyway, the runtime creates a stack trace of all the function calls leading up to the instruction that generated the error. `DownloadImageAsync` is executed, but what kind of precaution should be used to ensure that potential errors are handled? Should the caller be wrapped in a `try-catch` block, too, as a precaution?

```
Image image = await DownloadImageAsync("Bugghina001.jpg");
```

In general, the function caller is responsible for protecting the code by checking the state of the objects for validity before use. What would happen if the check of state is missing? Easy answer: more problems and bugs appear. In addition, the complexity of the program increases when the same `DownloadImageAsync` appears in multiple places throughout the code, because each caller could require different error handling, leading to error-handling logic leaking into domain models and adding unnecessary complexity.

## 10.2 Error combinators: Retry, Otherwise, and Task.Catch in C#

In chapter 8, we defined two extension methods for the `Task` type, `Retry` and `Otherwise` (fallback), for asynchronous `Task` operations that apply logic in case of an exception. Fortunately, because asynchronous operations have external factors that make them vulnerable to exceptions, the .NET `Task` type has built-in error handling via the `Status` and `Exception` properties, as shown here.

Listing 10.2 Refreshing the `Otherwise` and `Retry` functions

```
static Task<T> Otherwise<T>(this Task<T> task, Func<Task<T>> orTask) =>
    task.ContinueWith(async innerTask =>
    {
        if (innerTask.Status == TaskStatus.Faulted) return await orTask();
        return await Task.FromResult<T>(innerTask.Result);
    }).Unwrap();

static async Task<T> Retry<T>(Func<Task<T>> task, int retries, TimeSpan delay,
    CancellationToken cts = default(CancellationToken)) =>
    await task().ContinueWith(async innerTask =>
    {
        cts.ThrowIfCancellationRequested();
        if (innerTask.Status != TaskStatus.Faulted)
            return innerTask.Result;
        if (retries == 0)
            throw innerTask.Exception ?? throw new Exception();
        await Task.Delay(delay, cts);
        return await Retry(task, retries - 1, delay, cts);
    }).Unwrap();
```

It's good practice to use the functions `Retry` and `Otherwise` to manage errors in your code.
For example, you can rewrite the call to the method `DownloadImageAsync` using the helper functions:

```
Image image = await AsyncEx.Retry(async () =>
    await DownloadImageAsync("Bugghina001.jpg")
          .Otherwise(async () =>
              await DownloadImageAsync("Bugghina002.jpg")),
    5, TimeSpan.FromSeconds(2));
```

By applying the functions `Retry` and `Otherwise` in the previous code, the function `DownloadImageAsync` changes behavior and becomes safer to run. If something goes wrong when `DownloadImageAsync` is retrieving the image `Bugghina001`, its fallback operation is to download an alternative image. The `Retry` logic, which includes the `Otherwise` (fallback) behavior, is repeated up to five times with a delay of two seconds between each attempt, until it's successful (figure 10.2).

Figure 10.2 The client sends two requests to the server to apply the strategies of `Otherwise` (fallback) and `Retry` in case of failure. These requests (`DownloadImageAsync`) are safe to run because they both apply the `Retry` and `Otherwise` strategies to handle problems that may occur.

Additionally, you can define a further extension method such as the `Task.Catch` function, tailored specifically to handle exceptions generated during asynchronous operations.

Listing 10.3 `Task.Catch` function

```
static Task<T> Catch<T, TError>(this Task<T> task, Func<TError, T> onError)
    where TError : Exception
{
    var tcs = new TaskCompletionSource<T>();
    task.ContinueWith(innerTask =>
    {
        if (innerTask.IsFaulted && innerTask?.Exception?.InnerException is TError)
            tcs.SetResult(onError((TError)innerTask.Exception.InnerException));
        else if (innerTask.IsCanceled)
            tcs.SetCanceled();
        else if (innerTask.IsFaulted)
            tcs.SetException(innerTask?.Exception?.InnerException ??
                             throw new InvalidOperationException());
        else
            tcs.SetResult(innerTask.Result);
    });
    return tcs.Task;
}
```

The function `Task.Catch` has the advantage of expressing specific exception cases as type constructors. The following snippet shows an example of handling `StorageException` in the Azure Blob storage context (note that for this snippet to compile, `Log` must return an `Image`, for instance a placeholder):

```
static Task<Image> CatchStorageException(this Task<Image> task) =>
    task.Catch<Image, StorageException>(ex =>
        Log($"Azure Blob Storage Error {ex.Message}"));
```

The `CatchStorageException` extension method can be applied as shown in this code snippet:

```
Image image = await DownloadImageAsync("Bugghina001.jpg").CatchStorageException();
```

This design could violate the principle of nonlocality, because the code used to recover from the exception is separate from the originating function call. In addition, there's no support from the compiler to notify the developer that the caller of the `DownloadImageAsync` method must enforce error handling, because its return type is a regular `Task` primitive type, which doesn't require or convey validation. In this last case, when the error handling is omitted or forgotten, an exception could potentially arise, causing unanticipated side effects that might impact the entire system (beyond the function call), leading to disastrous consequences, such as crashing the application.

As you can see, exceptions ruin the ability to reason about the code. Furthermore, the structured mechanism of throwing and catching exceptions in imperative programming has drawbacks that run against functional design principles. As one example, functions that throw exceptions can't be composed or chained the way other functional artifacts can.
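For symmetry with the F# side of the book, here's a rough sketch (my own, not the chapter's implementation) of what `Otherwise`- and `Retry`-style combinators look like over `Async`, built on the same `Async.Catch` idea used throughout this chapter:

```
// A sketch of Otherwise/Retry-style combinators over Async, using
// Async.Catch to intercept failures (illustrative, not the book's code).
module AsyncCombinatorsSketch =
    let otherwise (orComp: unit -> Async<'a>) (comp: Async<'a>) = async {
        let! outcome = Async.Catch comp
        match outcome with
        | Choice1Of2 value -> return value           // success: keep the result
        | Choice2Of2 _ -> return! orComp () }        // failure: run the fallback

    let rec retry (retries: int) (delayMs: int) (comp: unit -> Async<'a>) = async {
        let! outcome = Async.Catch (comp ())
        match outcome with
        | Choice1Of2 value -> return value
        | Choice2Of2 _ when retries > 0 ->
            do! Async.Sleep delayMs                  // wait before the next attempt
            return! retry (retries - 1) delayMs comp
        | Choice2Of2 exn -> return raise exn }       // out of retries: rethrow
```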
Generally, code is read more often than written, so it makes sense that best practices aim at simplifying understanding of and reasoning about the code. The simpler the code, the fewer bugs it contains, and the easier the software is to maintain overall. The use of exceptions for program-flow control hides the programmer's intention, which is why it's considered a bad practice. Thankfully, you can avoid complex and cluttered code relatively easily. The solution is to explicitly return values indicating the success or failure of an operation instead of throwing exceptions. This brings clarity to potentially error-prone parts of the code. In the following sections, I show two possible approaches that embrace the functional paradigm to ease the error-handling semantic structure.

### 10.2.1 Error handling in FP: exceptions for flow control

Let's revisit the `DownloadImageAsync` method, but this time handling the error in a functional style. First, look at the code example, followed by the details in the following listing. The new method `DownloadOptionImage` catches the exception in a `try-catch` block as in the previous version of the code, but here the result is the `Option` type.

Listing 10.4 `Option` type for error handling in a functional style

```
async Task<Option<Image>> DownloadOptionImage(string blobReference)
{
    try
    {
        var container = await Helpers.GetCloudBlobContainerAsync()
                                      .ConfigureAwait(false);
        CloudBlockBlob blockBlob = container.GetBlockBlobReference(blobReference);
        using (var memStream = new MemoryStream())
        {
            await blockBlob.DownloadToStreamAsync(memStream).ConfigureAwait(false);
            return Option.Some(Bitmap.FromStream(memStream));
        }
    }
    catch (Exception)
    {
        return Option.None;
    }
}
```

The `Option` type notifies the function caller that the operation `DownloadOptionImage` has a particular output, which must be specifically managed. In fact, the `Option` type can have as a result either `Some` or `None`. Consequently, the caller of the function `DownloadOptionImage` is forced to check the result for a value. If it contains a value, then the operation succeeded; if it doesn't, then it failed. This validation requires the programmer to write code to handle both possible outcomes. This design makes the code predictable, avoids side effects, and permits `DownloadOptionImage` to be composable.

#### Controlling side effects with the Option type

In FP, the notion of `null` values doesn't exist. Functional languages such as Haskell, Scala, and F# resolve this problem by wrapping nullable values in an `Option` type. In F#, the `Option` type is the solution to the null-pointer exception; it's a two-state discriminated union (DU), which is used to wrap a value (`Some`) or no value (`None`). Consider it a box that might contain something or could be empty. Conceptually, you can think of the `Option` type as something that's either present or absent. The symbolic definition of the `Option` type is

```
type Option<'T> =
    | Some of value:'T
    | None
```

The `Some` case means that data is stored in the associated inner value `'T`. The `None` case means there's no data. `Option<Image>`, for example, may or may not contain an image. Figure 10.3 shows the comparison between a nullable primitive type and the equivalent `Option` type.

Figure 10.3 Comparing regular nullable primitives (first row) and `Option` types (second row).
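A minimal illustration (not from the chapter) of how the two cases force the caller's hand:

```
// Division that can't return null: the absence of a result is itself a value.
let tryDivide (x:int) (y:int) : int option =
    if y = 0 then None else Some (x / y)

match tryDivide 10 0 with
| Some quotient -> printfn "quotient: %d" quotient
| None -> printfn "cannot divide by zero"   // the compiler insists on this branch
```

The compiler warns about any `match` that omits one of the two cases, which is exactly the safety net that a `null`-returning method can't offer.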
The main difference is that the regular primitive type can hold either a valid value or an invalid (`null`) one without informing the caller, whereas the `Option` type wraps a primitive type, prompting the caller to check whether the underlying value is valid. An instance of the `Option` type is created by calling either `Some(value)`, which represents a positive response, or `None`, which is the equivalent of returning an empty value. With F#, you don't need to define the `Option` type yourself. It's part of the standard F# library, and there's a rich set of helper functions that go with it. C# has the `Nullable<T>` type, but it's limited to value types.

The initial solution is to create a generic `struct` that wraps a value. Using a value type (`struct`) is important for reducing memory allocation and rules out `null` reference exceptions caused by assigning `null` to the `Option` type itself. To make the `Option` type reusable, we use a generic C# `struct Option<T>`, which wraps any arbitrary type that may or may not contain a value. The basic structure of `Option<T>` has a field `value` of type `T` and a flag `hasValue` that indicates whether the value is set. The implementation of the `Option` type in C# is straightforward and isn't illustrated here; you can check the source code of this book if you're interested in the details.

The higher level of abstraction achieved using the `Option<T>` type allows the implementation of higher-order functions (HOFs), such as `Match` and `Map`, which simplify the compositional structure of the code. In this case, the function `Match` enables pattern matching and a deconstructive semantic:

```
R Match<R>(Func<R> none, Func<T, R> some) => hasValue ? some(value) : none();
```

The `Match` function belongs to the `Option` type instance and offers a convenient construct, eliminating unnecessary casts and improving code readability.

### 10.2.2 Handling errors with Task<Option<T>> in C#

In listing 10.4, I illustrated how the `Option` type protects the code from bugs, making the program safer from `null`-pointer exceptions, and noted that the compiler helps to avoid accidental mistakes. Unlike `null` values, an `Option` type forces the developer to write logic to check whether a value is present, thereby mitigating many of the problems of `null` and error values. Back in the Azure Blob storage example, with the `Option` type and the `Match` HOF in place, you can execute the `DownloadOptionImage` function, whose return type is a `Task<Option<Image>>`:

```
Option<Image> imageOpt = await DownloadOptionImage("Bugghina001.jpg");
```

By using the compositional nature of the `Task` and `Option` types and their extended HOFs, the FP style looks like this code snippet:

```
DownloadOptionImage("Bugghina001.jpg")
    .Map(opt => opt.Match(
        some: image => image.Save(@"ImageFolder\Bugghina.jpg"),
        none: () => Log("There was a problem downloading the image")));
```

This final code is fluent and expressive and, more importantly, it reduces bugs because the compiler forces the caller to cover both possible outcomes: success and failure.

### 10.2.3 The F# AsyncOption type: combining Async and Option

The same approach of handling exceptions using the `Task<Option<T>>` type is applicable to F#, where the technique can be exploited in the F# asynchronous workflow for an even more idiomatic result.
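Before combining `Async` and `Option`, it's worth recalling that the built-in F# `Option` module already supplies the helpers the C# struct had to re-create; a quick sketch with illustrative values:

```
// The F# Option module provides Match/Map-style helpers out of the box.
let imageName : string option = Some "Bugghina001.jpg"

imageName
|> Option.map (fun name -> name.ToUpper())   // transform the inner value, like Map
|> Option.defaultValue "PLACEHOLDER.JPG"     // unwrap with a fallback
|> printfn "%s"                              // prints BUGGHINA001.JPG
```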
The improvement that F# brings, compared to C#, is support for *type aliases*, also called *type abbreviations*. A type alias is used to avoid writing a signature repeatedly, simplifying the coding experience. Here's the type alias for `Async<Option<'T>>`:

```
type AsyncOption<'T> = Async<Option<'T>>
```

You can use this `AsyncOption<'T>` definition directly in the code in place of `Async<Option<'T>>` for the same behavior. Another purpose of the type alias is to provide a degree of decoupling between the use of a type and its implementation. This listing shows the equivalent F# implementation of the `DownloadOptionImage` method previously implemented in C#.

Listing 10.5 F# implementation of the `AsyncOption` type alias in action

```
let downloadOptionImage(blobReference:string) : AsyncOption<Image> = async {
    try
        let! container = Helpers.getCloudBlobContainerAsync()
        let blockBlob = container.GetBlockBlobReference(blobReference)
        use memStream = new MemoryStream()
        do! blockBlob.DownloadToStreamAsync(memStream)
        return Some(Bitmap.FromStream(memStream))
    with
    | _ -> return None }

downloadOptionImage "Bugghina001.jpg"
|> Async.map(fun imageOpt ->
    match imageOpt with
    | Some(image) -> image.Save(@"ImageFolder\Bugghina.jpg")
    | None -> log "There was a problem downloading the image")
```

The function `downloadOptionImage` asynchronously downloads an image from Azure Blob storage. The `Async.map` function, with signature `('a -> 'b) -> Async<'a> -> Async<'b>`, wraps the output of the function and allows access to the underlying value. In this case, the generic type `'a` is an `Option<Image>`. Conveniently, the functions that belong to the F# `Async` module can be applied to the alias `AsyncOption`, because it's an `Async` type that wraps an `Option`. The function inside the `Async.map` operator extracts the `Option` value, which is pattern matched to select the behavior to run according to whether it has the value `Some` or `None`.

### 10.2.4 Idiomatic F# functional asynchronous error handling

At this point, the F# `downloadOptionImage` function safely downloads an image, catching the exception if a problem occurs without jeopardizing the application's stability. But the presence of the `try-with` block, the equivalent of `try-catch` in C#, should be avoided when possible, because it encourages an impure (side-effecting) programming style. In the context of asynchronous computation, the F# `Async` module provides an idiomatic and functional alternative: the `Async.Catch` function, a wrapper that protects a computation. You can use `Async.Catch` to safely run asynchronous operations and map them into a `Choice<'a, exn>` type.

To reduce the amount of boilerplate required, and generally simplify your code, you can create a helper function that wraps an `Async<'T>` and returns an `AsyncOption<'T>` by using the `Async.Catch` operator. The following code snippet shows the implementation. The helper function `ofChoice` is supplementary to the F# `Option` module; its purpose is to map and convert a `Choice` type into an `Option` type:

```
module Option =
    let ofChoice choice =
        match choice with
        | Choice1Of2 value -> Some value
        | Choice2Of2 _ -> None

module AsyncOption =
    let handler (operation:Async<'a>) : AsyncOption<'a> = async {
        let! result = Async.Catch operation
        return (Option.ofChoice result) }
```
`Async.Catch` converts an `Async<'T>` into an `Async<Choice<'T, exn>>`; the resulting `Choice` is then converted to an `Option<'T>` using the simple `ofChoice` conversion function. The `AsyncOption.handler` function can therefore safely run any `Async<'T>` operation and map it into an `AsyncOption` type. Listing 10.6 shows the download implementation without the need to protect the code with a `try-with` block. The function `AsyncOption.handler` manages the output, regardless of whether it succeeds or fails; if an error arises, `Async.Catch` captures it and `Option.ofChoice` transforms it into an `Option` value.

Listing 10.6 `AsyncOption` type alias in action

```
let downloadAsyncImage(blobReference:string) : Async<Image> = async {
    let! container = Helpers.getCloudBlobContainerAsync()
    let blockBlob = container.GetBlockBlobReference(blobReference)
    use memStream = new MemoryStream()
    do! blockBlob.DownloadToStreamAsync(memStream)
    return Bitmap.FromStream(memStream) }

downloadAsyncImage "Bugghina001.jpg"
|> AsyncOption.handler
|> Async.map(fun imageOpt ->
    match imageOpt with
    | Some(image) -> image.Save(@"ImageFolder\Bugghina.jpg")
    | None -> log "There was a problem downloading the image")
|> Async.Start
```

The function `AsyncOption.handler` is a reusable and composable operator that can be applied to any asynchronous operation.

### 10.2.5 Preserving the exception semantic with the Result type

In section 10.2.2, you saw how the functional paradigm uses the `Option` type to handle errors and control side effects. In the context of error handling, `Option` acts as a container, a box where side effects fade and dissolve without creating unwanted behaviors in your program. In FP, the notion of boxing dangerous code, which could throw errors, isn't limited to the `Option` type. In this section, you'll preserve the error semantic by using the `Result` type, which allows different behaviors in your program to dispatch and branch based upon the type of error.

Let's say that, as part of the implementation of an application, you want to ease the debugging experience or communicate the exception details to the caller of a function if something goes wrong. In this case, the `Option` type approach doesn't fit the goal, because it delivers `None` (nothing) as far as information about what went wrong. While it's unambiguous what a `Some` result means, `None` doesn't convey any information other than the obvious; by discarding the exception, it makes it impossible to diagnose what went wrong. Going back to our example of downloading an image from Azure Blob storage, if something goes wrong during the retrieval of the data, different errors are generated by different causes, such as loss of network connectivity or a file/image not found. In any event, you need to know the error details to correctly apply a strategy to recover from an exception. In the next listing, the `DownloadOptionImage` method from the previous example retrieves an image from Azure Blob storage, and the `Option` type is exploited to handle the output in a safer manner, managing the event of errors.
Listing 10.7 `Option` type, which doesn't preserve error details

```
async Task<Option<Image>> DownloadOptionImage(string blobReference)
{
    try
    {
        CloudStorageAccount storageAccount =
            CloudStorageAccount.Parse("<Azure Connection>");
        CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
        CloudBlobContainer container = blobClient.GetContainerReference("Media");
        await container.CreateIfNotExistsAsync();
        CloudBlockBlob blockBlob = container.GetBlockBlobReference(blobReference);
        using (var memStream = new MemoryStream())
        {
            await blockBlob.DownloadToStreamAsync(memStream).ConfigureAwait(false);
            return Some(Bitmap.FromStream(memStream));
        }
    }
    catch (StorageException) { return None; }
    catch (Exception) { return None; }
}
```

Regardless of the exception type raised, either a `StorageException` or a generic `Exception`, the limitation of this implementation is that the caller of the method `DownloadOptionImage` doesn't have any information regarding the exception, so a tailored recovery strategy cannot be chosen. Is there a better way? How can the method provide details of a potential error and still avoid side effects? The solution is to use the polymorphic `Result<'TSuccess, 'TError>` type in place of the `Option<'T>` type. `Result<'TSuccess, 'TError>` can be used to handle errors in a functional style while also carrying the cause of the potential failure. Figure 10.4 compares a nullable primitive, the equivalent `Option` type, and the `Result` type.

Figure 10.4 Comparing a regular nullable primitive (top row), the `Option` type (second row), and the `Result` type (bottom row). The `Result` `Failure` case is generally used to wrap an error if something goes wrong.

In certain programming languages, such as Haskell, the `Result` structure is called `Either`, which represents a logical separation between two values that would never occur at the same time. For example, `Result<int, string>` models two cases and can hold either an `int` or a `string`. The `Result<'TSuccess, 'TError>` structure can also be used to guard your code against unpredictable errors, which makes the code more type safe and side effect free by handling the exception early instead of propagating it.

In listing 10.8, the C# implementation of the `Result` type is polymorphic in only one type parameter, forcing the `Exception` type as the alternative value used to handle errors. Consequently, the type system is forced to acknowledge error cases, and the error-handling logic becomes more explicit and predictable. Certain implementation details are omitted in this listing for brevity but are provided in full as part of the downloadable source code.

Listing 10.8 Generic `Result<T>` type in C#

```
struct Result<T>
{
    public T Ok { get; }
    public Exception Error { get; }
    public bool IsFailed { get => Error != null; }
    public bool IsOk => !IsFailed;

    public Result(T ok)
    {
        Ok = ok;
        Error = default(Exception);
    }

    public Result(Exception error)
    {
        Error = error;
        Ok = default(T);
    }

    public R Match<R>(Func<T, R> okMap, Func<Exception, R> failureMap) =>
        IsOk ? okMap(Ok) : failureMap(Error);

    public void Match(Action<T> okAction, Action<Exception> errorAction)
    {
        if (IsOk) okAction(Ok); else errorAction(Error);
    }

    public static implicit operator Result<T>(T ok) => new Result<T>(ok);
    public static implicit operator Result<T>(Exception error) =>
        new Result<T>(error);
    public static implicit operator Result<T>(Result.Ok<T> ok) =>
        new Result<T>(ok.Value);
    public static implicit operator Result<T>(Result.Failure error) =>
        new Result<T>(error.Error);
}
```
The interesting part of this code is in the final lines, where the implicit operators simplify the conversion to `Result` during the assignment of primitives. This auto-conversion to the `Result` type can be used by any function that potentially returns an error. Here, for example, is a simple synchronous function that loads the bytes of a given file. If the file doesn't exist, then a `FileNotFoundException` exception is returned:

```
static Result<byte[]> ReadFile(string path)
{
    if (File.Exists(path))
        return File.ReadAllBytes(path);
    else
        return new FileNotFoundException(path);
}
```

As you can see, the output of the function `ReadFile` is a `Result<byte[]>`, which wraps either the successful outcome of the function, a byte array, or the failure case, a `FileNotFoundException` exception. Both return values are implicitly converted, without an explicit type annotation.

## 10.3 Taming exceptions in asynchronous operations

The polymorphic `Result` class in C# is a reusable component that's recommended for taming side effects in functions that could generate exceptions. To indicate that a function can fail, its output is wrapped in a `Result` type. The following listing shows the previous `DownloadOptionImage` function refactored to follow the `Result` type model. The new function is named `DownloadResultImage`.

Listing 10.9 `DownloadResultImage`: handling errors and preserving semantics

```
async Task<Result<Image>> DownloadResultImage(string blobReference)
{
    try
    {
        CloudStorageAccount storageAccount =
            CloudStorageAccount.Parse("<Azure Connection>");
        CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
        CloudBlobContainer container = blobClient.GetContainerReference("Media");
        await container.CreateIfNotExistsAsync();
        CloudBlockBlob blockBlob = container.GetBlockBlobReference(blobReference);
        using (var memStream = new MemoryStream())
        {
            await blockBlob.DownloadToStreamAsync(memStream).ConfigureAwait(false);
            return Image.FromStream(memStream);
        }
    }
    catch (StorageException exn) { return exn; }
    catch (Exception exn) { return exn; }
}
```

Importantly, the `Result` type provides the caller of the `DownloadResultImage` function the information necessary to handle each possible outcome in a tailored manner, including the different error cases. In this example, because `DownloadResultImage` calls a remote service, it also has the `Task` effect (for asynchronous operations) as well as the `Result` effect. In the Azure storage example (from listing 10.9), retrieving the current state of an image hits the online media storage, so, as mentioned, the operation should be asynchronous, and the `Result` type should be wrapped in a `Task`. The `Task` and `Result` effects are generally combined in FP to implement asynchronous operations with error handling.
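Before combining the two effects, note for comparison that the `ReadFile` guard shown above translates almost one-to-one into F#, where `Result` is built in; a sketch (the `readFile` function here is illustrative, not part of the chapter's sample code):

```
open System.IO

// The F# counterpart of the C# ReadFile: Ok wraps the bytes, Error the exception.
let readFile (path:string) : Result<byte[], exn> =
    if File.Exists path
    then Ok (File.ReadAllBytes path)
    else Error (FileNotFoundException(path) :> exn)

match readFile "bugghina.jpg" with
| Ok bytes -> printfn "read %d bytes" bytes.Length
| Error e -> printfn "failed: %s" e.Message
```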
Before diving into how to use the `Result` and `Task` types in combination, let's define a few helper functions to simplify the code. The static class `ResultExtensions` defines a series of useful HOFs for the `Result` type, such as `Bind` and `Map`, which enable a convenient, fluent semantic for encoding common error-handling flows. For brevity, the following listing shows only the helper functions that treat the `Task` and `Result` types together. The other overloads are omitted, with the full implementation available in the code samples.

Listing 10.10 `Task<Result<T>>` helper functions for compositional semantics

```
static class ResultExtensions
{
    public static async Task<Result<T>> TryCatch<T>(Func<Task<T>> func)
    {
        try
        {
            return await func();
        }
        catch (Exception ex)
        {
            return ex;
        }
    }

    public static async Task<Result<R>> SelectMany<T, R>(
        this Task<Result<T>> resultTask, Func<T, Task<Result<R>>> func)
    {
        Result<T> result = await resultTask.ConfigureAwait(false);
        if (result.IsFailed)
            return result.Error;
        return await func(result.Ok);
    }

    public static async Task<Result<R>> Select<T, R>(
        this Task<Result<T>> resultTask, Func<T, Task<R>> func)
    {
        Result<T> result = await resultTask.ConfigureAwait(false);
        if (result.IsFailed)
            return result.Error;
        return await func(result.Ok).ConfigureAwait(false);
    }

    public static async Task<Result<R>> Match<T, R>(
        this Task<Result<T>> resultTask,
        Func<T, Task<R>> actionOk, Func<Exception, Task<R>> actionError)
    {
        Result<T> result = await resultTask.ConfigureAwait(false);
        if (result.IsFailed)
            return await actionError(result.Error);
        return await actionOk(result.Ok);
    }
}
```

The `TryCatch` function wraps a given operation into a `try-catch` block to safeguard the code if a problem arises. This function is useful for lifting any `Task` computation into a `Result` type. In the following code snippet, the function `ToByteArrayAsync` asynchronously converts a given image into a byte array:

```
Task<Result<byte[]>> ToByteArrayAsync(Image image)
{
    return TryCatch(async () =>
    {
        using (var memStream = new MemoryStream())
        {
            await image.SaveImageAsync(memStream, image.RawFormat);
            return memStream.ToArray();
        }
    });
}
```

The underlying `TryCatch` function ensures that, regardless of how the operation behaves, a `Result` type is returned that wraps either a success (`Ok`, the byte array) or a failure (`Error`, the exception). The extension methods `Select` and `SelectMany`, part of the `ResultExtensions` class, are generally known in functional programming as, respectively, `Map` and `Bind` (or `flatMap`). But in the context of .NET, and specifically in C#, the names `Select` and `SelectMany` are the recommended terms because they follow the LINQ convention, which lets the compiler treat these methods as LINQ query operators and eases their composition. Now, with the higher-order operators from the `ResultExtensions` class, it's easy to fluently chain a series of actions that operate on the underlying `Result` value without leaving the context. The following listing shows how the caller of `DownloadResultImage` can handle the execution flow in the case of success or failure, as well as chain a sequence of operations.
Listing 10.11 Composing `Task<Result<T>>` operations in functional style

```
async Task<Result<byte[]>> ProcessImage(string nameImage,
                                        string destinationImage)
{
    return await DownloadResultImage(nameImage)
        .Map(async image => await ToThumbnail(image))
        .Bind(async image => await ToByteArrayAsync(image))
        .Tap(async bytes =>
            await File.WriteAllBytesAsync(destinationImage, bytes));
}
```

As you can see from the `ProcessImage` function signature, documenting that a function might have error effects is one of the advantages of using the `Result` type. `ProcessImage` first downloads a given image from Azure Blob storage, then converts it into a thumbnail using the `Map` operator, which verifies the previous `Result` value and acts accordingly. The `Bind` operator likewise checks the previous `Result` instance and, if it's successful, executes the delegate passed in, here extracting the byte array from the image; otherwise, `Bind` returns the previous result. The chain continues until one of the operations fails; if a failure occurs, the remaining operations are skipped. Ultimately, the resulting byte array is saved in the specified destination path (`destinationImage`), or a log entry is written if an error occurred.

Rather than handling failure individually on each call, you should add the failure handling at the end of the computation chain. This way, the failure-handling logic sits at a predictable place in the code, making the code easier to read and maintain. You should understand that if any of these operations fails, the rest of the tasks are bypassed, and none execute until the first function that handles the error (figure 10.5). In this example, the error is handled by the function `Match` (with the lambda `actionError`). It's important to perform compensation logic in case the call to a function isn't successful.

Figure 10.5 The `Result` type handles the operations in such a way that, if any step fails, the rest of the tasks are bypassed and not executed until the first function that handles the error. In this figure, if any of the validations throws an error, the rest of the computation is skipped until the `Failure` handler (the `Error` circle).

Because it's both hard and inconvenient to extract the inner value of a `Result` type, use the composition mechanisms of functional error handling. These mechanisms force the caller to always handle both the success and the failure cases. With this design of the `Result` type, the program flow is declarative and easy to follow. Exposing your intent is crucial if you want to increase the readability of your code. Introducing the `Result` class (and the composite type `Task<Result<T>>`) shows, without side effects, whether the method can fail, and signals when something is wrong with your system. Furthermore, the type system becomes a helpful assistant for building software by specifying how you should handle both successful and failure outcomes. The `Result` type provides a conditional flow in a high-level functional style, where you pick a strategy for dealing with the error and register that strategy as a handler. When the lower-level code hits the error, it can then pick a handler without unwinding the call stack. This gives you more options: you can choose to cope with the problem and continue.
### 10.3.1 Modeling error handling in F# with Async and Result

The previous section discussed combining the `Task` and `Result` types to provide safe and declarative error handling in a functional style. Beyond the TPL, the asynchronous workflow computation expression in F# offers a more idiomatic functional approach. This section covers the recipe for taming exceptions by combining the F# `Async` type with the `Result` structure.

Before looking in depth at the F# error-handling model for asynchronous operations, we should define the necessary type structure. First, to fit the context of error handling (specifically), as explained in chapter 9, you should define a `Result<'a>` type alias over `Result<'a, exn>`, which assumes that the second case is always an exception (`exn`). This alias `Result<'a>` simplifies pattern matching and deconstruction over the `Result<'a, exn>` type:

```
type Result<'TSuccess> = Result<'TSuccess, exn>
```

Second, the `Async` type construct has to wrap this `Result<'a>` structure to define a new type that's used in concurrent operations to signal when an operation is completed. You need to treat `Async<'a>` and `Result<'a>` as a single type, which can be done easily using a type alias that acts as a combinatorial structure:

```
type AsyncResult<'a> = Async<Result<'a>>
```

The `AsyncResult<'a>` type carries the value of an asynchronous computation with either a success or failure outcome; in the case of an exception, the error information is preserved. Conceptually, `AsyncResult` can be treated as a separate type. Now, taking inspiration from the `AsyncOption` type in section 10.2.3, define a helper function `AsyncResult.handler` to run a computation, lifting the output into a `Result` type. For this purpose, the F# `Async.Catch` function is a perfect fit. The following listing shows a custom alternative representation of `Async.Catch`, called `AsyncResult.handler`.

Listing 10.12 `AsyncResult` handler to catch and wrap asynchronous computations

```
module Result =
    let ofChoice value =
        match value with
        | Choice1Of2 value -> Ok value
        | Choice2Of2 e -> Error e

module AsyncResult =
    let handler (operation:Async<'a>) : AsyncResult<'a> = async {
        let! result = Async.Catch operation
        return (Result.ofChoice result) }
```

The F# `AsyncResult.handler` is a powerful operator that dispatches the execution flow in case of error. In a nutshell, `AsyncResult.handler` runs the underlying operation through `Async.Catch` and uses the `ofChoice` function (introduced in chapter 9) to map the product of the computation, a `Choice<'a, exn>` DU with the cases `Choice1Of2` and `Choice2Of2`, onto the `Result<'a>` DU cases, branching the result of the computation to either the `Ok` or the `Error` case.

### 10.3.2 Extending the F# AsyncResult type with monadic bind operators

Before we go further, let's define the monadic helper functions for the `AsyncResult` type.

Listing 10.13 HOFs extending the `AsyncResult` type

```
module AsyncResult =
    let retn (value:'a) : AsyncResult<'a> =
        value |> Ok |> async.Return

    let map (selector : 'a -> Async<'b>) (asyncResult : AsyncResult<'a>)
            : AsyncResult<'b> = async {
        let! result = asyncResult
        match result with
        | Ok x -> return! selector x |> handler
        | Error err -> return (Error err) }

    let bind (selector : 'a -> AsyncResult<'b>)
             (asyncResult : AsyncResult<'a>) = async {
        let! result = asyncResult
        match result with
        | Ok x -> return! selector x
        | Error err -> return Error err }

    let bimap success failure operation = async {
        let! result = operation
        match result with
        | Ok v -> return! success v |> handler
        | Error x -> return! failure x |> handler }
```
The `map` and `bind` higher-order operators are the general functions used for composition. Their implementations are straightforward:

* The `retn` function lifts an arbitrary value `'a` into an `AsyncResult<'a>` elevated type.
* The `let!` syntax in the `map` operator extracts the content from the `Async` (runs it and awaits the result), which is the `Result<'a>` type. Then the `selector` function is applied to the value contained in the `Ok` case, guarded by the `AsyncResult.handler` function, because the outcome of the computation can be success or failure. Ultimately, the result is returned wrapped in the `AsyncResult` type.
* The function `bind` uses continuation-passing style (CPS) to pass a function that will run after a successful computation to further process the result. The continuation function `selector` crosses the two types `Async` and `Result` and has the signature `'a -> AsyncResult<'b>`.
* If the inner `Result` is successful, then the continuation function `selector` is evaluated with the result. If the inner `Result` is a failure, then the failure is lifted back into the async operation.
* The `return` syntax in `map`, `retn`, and `bind` lifts the `Result` value into an `Async` type, whereas the `return!` syntax in `bind` means that the value is already lifted, so `return` isn't called on it.
* The `bimap` function executes the asynchronous operation `AsyncResult` and then branches the execution flow to one of the continuation functions, either `success` or `failure`, according to the result.

Alternatively, to make the code more succinct, you can use the built-in function `Result.map` to turn a function on plain values into a function that works on a `Result` type. Then, if you pass the output to `Async.map`, the resulting function works on an asynchronous value. Using this compositional programming style, the `AsyncResult` `map` function (here with the simpler selector signature `'a -> 'b`) can be rewritten as follows:

```
module AsyncResult =
    let map (selector : 'a -> 'b) (asyncResult : AsyncResult<'a>) =
        asyncResult |> Async.map (Result.map selector)
```

This programming style is a personal choice, so you should weigh the tradeoff between succinctness and readability.

#### The F# AsyncResult higher-order functions in action

Let's see the `AsyncResult` type and its HOFs `bind`, `map`, and `retn` in action by converting the C# code in listing 10.7, which downloads an image from Azure Blob storage, into an idiomatic F# way to handle errors in an asynchronous operation context. We stay with the Azure Blob storage example to allow a direct comparison of the two approaches, converting a function that you're already familiar with (figure 10.6).

Figure 10.6 The validation logic can be composed fluently with minimum effort by applying the higher-order operators `bind` and `bimap`. Furthermore, at the end of the pipeline, the `bimap` function pattern-matches the `Result` type to dispatch the continuation logic to either the `success` or `failure` branch in a convenient and declarative style.

This listing shows the `processImage` function implemented using the F# `AsyncResult` type with its higher-order compositional operators.
Listing 10.14 Using `AsyncResult` HOFs for fluent composition

```
let processImage (blobReference:string) (destinationImage:string)
    : AsyncResult<unit> =
    async {
        let storageAccount = CloudStorageAccount.Parse("<Azure Connection>")
        let blobClient = storageAccount.CreateCloudBlobClient()
        let container = blobClient.GetContainerReference("Media")
        let! _ = container.CreateIfNotExistsAsync()
        let blockBlob = container.GetBlockBlobReference(blobReference)
        use memStream = new MemoryStream()
        do! blockBlob.DownloadToStreamAsync(memStream)
        return Bitmap.FromStream(memStream) }
    |> AsyncResult.handler
    |> AsyncResult.bind(fun image -> toThumbnail(image))
    |> AsyncResult.map(fun image -> toByteArrayAsync(image))
    |> AsyncResult.bimap
        (fun bytes -> FileEx.WriteAllBytesAsync(destinationImage, bytes))
        (fun ex -> logger.Error(ex) |> async.Return)
```

The behavior of `processImage` is similar to that of the related C# method `ProcessImage` from listing 10.11; the main difference is the `AsyncResult` result type. Semantically, thanks to the intrinsic F# pipe (`|>`) operator, the `AsyncResult` functions `handler`, `bind`, `map`, and `bimap` are chained in a fluent style, which is the nearest equivalent to the concept of fluent interfaces (or method chaining) used in the C# version of the code.

#### Raising the abstraction of the F# AsyncResult with computation expressions

Imagine that you want to further abstract the syntax from listing 10.12 so you can write `AsyncResult` computations in a way that can be sequenced and combined using control-flow constructs. In chapter 9, you built custom F# *computation expressions* (CEs) to retry asynchronous operations in case of errors. CEs in F# are a safe way of managing complexity and the mutation of state. They provide a convenient syntax to manage data, control, and side effects in functional programs. In the context of asynchronous operations wrapped in an `AsyncResult` type, you can use CEs to handle errors elegantly and focus on the happy path. With the `AsyncResult` monadic operators `bind` and `retn` in place, implementing the related computation expression requires minimal effort to achieve a convenient and fluid programming semantic. Here, the code defines the monadic operators for the computation builder that combines the `Result` and `Async` types:

```
type AsyncResultBuilder() =
    member x.Return m = AsyncResult.retn m
    member x.Bind (m, f:'a -> AsyncResult<'b>) = AsyncResult.bind f m
    member x.Bind (m:Task<'a>, f:'a -> AsyncResult<'b>) =
        AsyncResult.bind f (m |> Async.AwaitTask |> AsyncResult.handler)
    member x.ReturnFrom m = m

let asyncResult = AsyncResultBuilder()
```

You can add more members to the `AsyncResultBuilder` CE if you need support for more advanced syntax; this is the minimal implementation required for the example. The only line of code that requires clarification is the `Bind` overload for the `Task<'a>` type:

```
member x.Bind (m:Task<'a>, f) =
    AsyncResult.bind f (m |> Async.AwaitTask |> AsyncResult.handler)
```

In this case, as explained in section 9.3.3, an F# CE lets you inject functions to extend the manipulation to other wrapper types, in this case `Task`, where the `Bind` overload lets you fetch the inner value contained in the elevated type using the `let!` and `do!` operators. This technique removes the need for adjunctive functions such as `Async.AwaitTask` at the call site.
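As a quick illustration of what that buys you, here's a sketch (with a hypothetical `fetchLengthAsync`) where a `Task`-returning .NET call is bound directly with `let!` inside the `asyncResult` block:

```
// A sketch using the minimal builder above: GetStringAsync returns Task<string>,
// and the Task overload of Bind lets let! consume it directly, with any
// exception routed into the Error case by AsyncResult.handler.
let client = new System.Net.Http.HttpClient()

let fetchLengthAsync (url:string) : AsyncResult<int> =
    asyncResult {
        let! content = client.GetStringAsync(url)   // Task<string> bound directly
        return content.Length }
```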
The downloadable source code of this book contains a more complete implementation of the `AsyncResultBuilder` CE, but the extra CE implementation details aren't relevant to this book's scope. Even a simple CE that deals with asynchronous calls returning a `Result` type is useful for performing computations that may fail and then chaining the results together. Let's transform, once again, the `processImage` function, but this time with the computation running inside the `asyncResult` CE, as shown in this listing.

Listing 10.15 Using `AsyncResultBuilder`

```
let processImage (blobReference:string) (destinationImage:string)
    : AsyncResult<unit> =
    asyncResult {
        let storageAccount = CloudStorageAccount.Parse("<Azure Connection>")
        let blobClient = storageAccount.CreateCloudBlobClient()
        let container = blobClient.GetContainerReference("Media")
        let! _ = container.CreateIfNotExistsAsync()
        let blockBlob = container.GetBlockBlobReference(blobReference)
        use memStream = new MemoryStream()
        do! blockBlob.DownloadToStreamAsync(memStream)
        let image = Bitmap.FromStream(memStream)
        let! thumbnail = toThumbnail(image)
        return! toByteArrayAsyncResult thumbnail }
    |> AsyncResult.bimap
        (fun bytes -> FileEx.WriteAllBytesAsync(destinationImage, bytes))
        (fun ex -> logger.Error(ex) |> async.Return)
```

Now, all you need to do is wrap the operations inside an `asyncResult` CE block. The compiler recognizes the monadic (CE) pattern and treats the computations in a special way: when the `let!` bind operator is detected, the compiler automatically translates it into the `Bind` and `Return` operations of the CE builder in context.

## 10.4 Abstracting operations with functional combinators

Let's say you need to download and analyze the history of a stock ticker symbol, or you decide you need to analyze the history of more than one stock to compare and contrast the best ones to buy. It's a given that downloading data from the internet is an I/O-bound operation that should be executed asynchronously. But suppose you want to build a more sophisticated program, where downloading the stock data depends on other asynchronous operations (figure 10.7). Here are several examples:

* If either the NASDAQ or the NYSE index is positive
* If the last six months of the stock show a positive trend
* If the volume of the stock satisfies any number of positive buy criteria

Figure 10.7 This diagram represents a sequential decision tree for buying stock. Each step likely involves an I/O operation to asynchronously interrogate an external service. You must be thoughtful in your approach to maintain this sequential flow while performing the whole decision tree asynchronously.

What about running the flow in figure 10.7 for each stock symbol that you're interested in? How would you combine the conditional logic of these operations while keeping the asynchronous semantic to parallelize the execution? How would you design the program? The solution is *functional asynchronous combinators*. The following sections cover the characteristics of functional combinators, with a focus on asynchronous combinators. We'll cover how to use the built-in support in the .NET Framework and how to build and tailor your own asynchronous combinators to maximize the performance of your program using a fluid and declarative functional programming style.
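To preview where this is heading, conditional combinators of roughly this shape (a sketch; the operators developed later may differ in detail) can lift `if`-style logic over `Async<bool>` predicates, so each branch of the decision tree in figure 10.7 stays asynchronous:

```
// Lifting Boolean control flow over asynchronous predicates (a sketch).
module AsyncIf =
    let ifAsync (predicate: Async<bool>) (whenTrue: Async<'a>) (whenFalse: Async<'a>) =
        async { let! p = predicate
                return! if p then whenTrue else whenFalse }

    let AND (a: Async<bool>) (b: Async<bool>) = ifAsync a b (async.Return false)
    let OR  (a: Async<bool>) (b: Async<bool>) = ifAsync a (async.Return true) b
```

With these, a guard like "NASDAQ positive OR NYSE positive" becomes `OR nasdaqUpAsync nyseUpAsync`, itself an `Async<bool>` that composes further.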
## 10.5 Functional combinators in a nutshell

The imperative paradigm uses procedural control mechanisms such as `if-else` statements and `for`/`while` loops to drive a program's flow, which is contrary to the FP style. As you leave the imperative world behind, you'll learn to find alternatives to fill that gap. A good solution is to use function combinators that orchestrate the flow of the program. FP mechanisms make it easy to combine two or more solutions to smaller problems into a single abstraction that solves a larger problem. Abstraction is a pillar of FP: it allows you to develop an application without worrying about implementation details, letting you focus on the more important high-level semantics of the program. Essentially, abstraction captures the core of what a function or a whole program does, making it easier to get things done.

In FP, a *combinator* refers either to a function with no free variables ([wiki.haskell.org/Pointfree](https://wiki.haskell.org/Pointfree)) or to a pattern for composing and combining any types. This second definition is the central topic of this section. From a practical viewpoint, *functional combinators* are programming constructs that let you merge and link primitive artifacts, such as other functions (or other combinators), which behave as pieces of control logic working together to generate more advanced behaviors. In addition, functional combinators encourage modularity, which supports the objective of abstracting functions into components that can be understood and reused independently, with codified meaning derived from the rules governing their composition. You were introduced to this concept previously with the definition of asynchronous functions (combinators) such as `Otherwise` and `Retry` for the C# Task-based Asynchronous Programming (TAP) model and the F# `AsyncResult.handler`.

In the context of concurrent programming, the main reason to use combinators is to implement a program that can handle side effects without compromising a declarative and compositional semantic. This is possible because combinators abstract away the implementation details that handle side effects underneath, with the purpose of offering functions that compose effortlessly. Specifically, this section covers combinators that compose asynchronous operations. If the side effects are limited to the scope of a single function, then the behavior of calling that function is idempotent. *Idempotent* means the operation can be applied multiple times without changing the result beyond the initial application; the effect doesn't change. It's possible to chain these idempotent functions to produce complex behaviors where the side effects are isolated and controlled.

### 10.5.1 The TPL built-in asynchronous combinators

The F# asynchronous workflow and the .NET TPL provide a set of built-in combinators, such as `Task.Run`, `Async.StartWithContinuations`, `Task.WhenAll`, and `Task.WhenAny`. These can easily be extended into useful combinators for composing and building more sophisticated task-based patterns. For example, both the `Task.WhenAll` and the F# `Async.Parallel` operators asynchronously wait on multiple asynchronous operations; the underlying results of those operations are grouped for the continuation. This continuation is the key that provides opportunities for composing the flow of a program into more complex structures, such as implementations of the Fork/Join and Divide and Conquer patterns.
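`Async.StartWithContinuations`, mentioned in the list above, is itself a small combinator: it starts a computation and dispatches the outcome to one of three continuations. A minimal sketch with a dummy computation:

```
// Start a computation and branch on its outcome without blocking the caller.
let computation = async { return 40 + 2 }

Async.StartWithContinuations(
    computation,
    (fun result -> printfn "completed: %d" result),   // success continuation
    (fun exn    -> printfn "failed: %s" exn.Message), // exception continuation
    (fun _      -> printfn "cancelled"))              // cancellation continuation
```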
Let’s start with a simple case in C# to understand the benefits of combinators. Imagine you must run three asynchronous operations and calculate the sum of their outputs, awaiting each in turn. Note that each operation takes one second to compute:

```
async Task<int> A() { await Task.Delay(1000); return 1; }
async Task<int> B() { await Task.Delay(1000); return 3; }
async Task<int> C() { await Task.Delay(1000); return 5; }

int a = await A();
int b = await B();
int c = await C();
int result = a + b + c;
```

The result (9) is computed in three seconds, one second for each operation. But what if you want to run those three methods in parallel? To run more than one background task, there are methods available to help you coordinate them. The simplest solution to run multiple tasks concurrently is to start them consecutively and collect references to them. The TPL `Task.WhenAll` operator accepts a `params` array of tasks and returns a task that is signaled when all the others are complete. You can eliminate the intermediate variables from that last example to make the code less verbose:

```
var results = (await Task.WhenAll(A(), B(), C())).Sum();
```

The results come back in an array, and then the `Sum()` LINQ operator is applied. With this change, the result is computed in only one second. Now a task can completely represent an asynchronous operation and provide synchronous and asynchronous capabilities for joining with the operation, retrieving its results, and so on. This lets you build useful libraries of combinators that compose tasks to build larger patterns.

### 10.5.2 Exploiting the Task.WhenAny combinator for redundancy and interleaving

A benefit of using tasks is that they enable powerful composition. Once you have a single type capable of representing any arbitrary asynchronous operation, you can write combinators over the type that allow you to combine and compose asynchronous operations in myriad ways. For example, the TPL `Task.WhenAny` operator allows you to develop parallel programs where only one of multiple asynchronous operations must complete before the main thread can continue processing. This behavior of asynchronously waiting for the first operation over a given set of tasks to complete, before notifying the main thread for further processing, facilitates the design of sophisticated combinators. Redundancy, interleaving, and throttling are examples of properties derived from these combinators.

Consider the case where you want to buy an airplane ticket as soon as possible. You have a few airline web services to contact, but depending on web traffic, each service can have a different response time. In this case, you can use the `Task.WhenAny` operator to contact multiple web services to produce a single result, selected from the one that completes the fastest.
Listing 10.16 Redundancy with `Task.WhenAny`

```
var cts = new CancellationTokenSource();    ①

Func<string, string, string, CancellationToken, Task<string>>
    GetBestFlightAsync = async (from, to, carrier, token) =>    ②
{
    string url = $"flight provider{carrier}";
    using (var client = new HttpClient())
    {
        HttpResponseMessage response = await client.GetAsync(url, token);
        return await response.Content.ReadAsStringAsync();
    }
};

var recommendationFlights = new List<Task<string>>()    ③
{
    GetBestFlightAsync("WAS", "SF", "United", cts.Token),
    GetBestFlightAsync("WAS", "SF", "Delta", cts.Token),
    GetBestFlightAsync("WAS", "SF", "AirFrance", cts.Token),
};

while (recommendationFlights.Count > 0)
{
    Task<string> recommendationFlight =
        await Task.WhenAny(recommendationFlights);    ④
    try
    {
        var recommendedFlight = await recommendationFlight;    ⑤
        cts.Cancel();    ⑥
        BuyFlightTicket("WAS", "SF", recommendedFlight);
        break;
    }
    catch (WebException)    ⑤
    {
        recommendationFlights.Remove(recommendationFlight);    ⑤
    }
}
```

In the code, `Task.WhenAny` returns the task that completed first. It’s important to know whether the operation completed successfully, because if there’s an error, you want to discard the result and wait for the next computation to complete. The code must handle exceptions using a `try-catch`, where the computation that failed is removed from the list of pending recommendation operations. When the first task completes successfully, you want to be sure to cancel the others still running.

### 10.5.3 Exploiting the Task.WhenAll combinator for asynchronous for-each

The `Task.WhenAll` operator waits asynchronously on multiple asynchronous computations that are represented as tasks. Consider that you want to send an email message to all your contacts. To speed up the process, you want to send the email to all recipients in parallel, without waiting for each separate message to complete before sending the next. In such a scenario, it would be convenient to process the list of emails in a `for-each` loop. How would you maintain the asynchronous semantics of the operation while sending the emails in parallel? The solution is to implement a `ForEachAsync` operator based on the `Task.WhenAll` method.

Listing 10.17 Asynchronous for-each loop with `Task.WhenAll`

```
static Task ForEachAsync<T>(this IEnumerable<T> source,
    int maxDegreeOfParallelism, Func<T, Task> body)
{
    return Task.WhenAll(
        from partition in
            Partitioner.Create(source).GetPartitions(maxDegreeOfParallelism)
        select Task.Run(async () =>
        {
            using (partition)
                while (partition.MoveNext())
                    await body(partition.Current);
        }));
}
```

For each partition of the enumerable, the `ForEachAsync` operator runs a function that returns a `Task` representing the completion of processing that group of elements. Once the work starts asynchronously, you achieve concurrency and parallelism, invoking the body for each element and waiting on them all at the end, rather than waiting for each in turn. The `Partitioner` limits the number of operations that can run in parallel, to avoid creating more tasks than necessary. The maximum degree of parallelism is managed by partitioning the input data set into `maxDegreeOfParallelism` chunks and scheduling a separate task for each partition. `ForEachAsync` batches work to create fewer tasks than total work items. This can provide significantly better overall performance, especially if the loop body has a small amount of work per item.
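As an aside (this variant isn’t from the book’s listing), the same throttled asynchronous for-each is often written with a `SemaphoreSlim` instead of a `Partitioner`; a hedged sketch:

```
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class EnumerableEx
{
    // Throttle with a SemaphoreSlim: every item gets its own task, but at
    // most maxDegreeOfParallelism bodies run at any one time.
    public static Task ForEachAsyncThrottled<T>(this IEnumerable<T> source,
        int maxDegreeOfParallelism, Func<T, Task> body)
    {
        var throttler = new SemaphoreSlim(maxDegreeOfParallelism);
        IEnumerable<Task> tasks = source.Select(async item =>
        {
            await throttler.WaitAsync().ConfigureAwait(false);
            try { await body(item).ConfigureAwait(false); }
            finally { throttler.Release(); }
        });
        return Task.WhenAll(tasks);
    }
}
```

The trade-off is that this version creates one task per item, whereas the `Partitioner`-based version in listing 10.17 creates only one task per partition, which matters when the per-item work is small.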
Now you can use the `ForEachAsync` operator to send multiple emails asynchronously.

Listing 10.18 Using the asynchronous for-each loop

```
async Task SendEmailsAsync(List<string> emails)
{
    SmtpClient client = new SmtpClient();
    Func<string, Task> sendEmailAsync = async emailTo =>
    {
        MailMessage message = new MailMessage("me@me.com", emailTo);
        await client.SendMailAsync(message);
    };
    await emails.ForEachAsync(Environment.ProcessorCount, sendEmailAsync);
}
```

These are a few simple examples that show how to use the built-in TPL combinators `Task.WhenAll` and `Task.WhenAny`. In section 10.6, you’ll focus on constructing custom combinators and composing existing ones, applying both F# and C# principles. You’ll see that there’s a limitless number of combinators. We’ll look at several of the most common ones that are used to implement an asynchronous logical flow in a program: `ifAsync`, `AND` (async), and `OR` (async). Before jumping into building asynchronous combinators, let’s review the functional patterns that have been discussed so far. This refresher will lead to a new functional pattern, which is used to compose heterogeneous concurrent functions. Don’t worry if you aren’t familiar with this term; you will be shortly.

### 10.5.4 Mathematical pattern review: what you’ve seen so far

In the previous chapters, I introduced the concepts of monoids, monads, and functors, which come from a branch of mathematics called *category theory*. Additionally, I discussed their important relationship to functional programming and functional concurrency. In programming, these mathematical patterns are adopted to control the execution of side effects and to maintain functional purity. These patterns are interesting because of their properties of abstraction and compositionality. Abstraction favors composability, and together they’re the pillars of functional and concurrent programming. The following sections rehash the definitions of these mathematical concepts.

#### Monoids for data parallelism

A monoid, as explained earlier, is a binary associative operation with an identity; it provides a way to mash values of the same type together. The associative property allows you to run a computation in parallel effortlessly by providing the ability to divide a problem into chunks that can be computed independently. Then, when each block of computation completes, the results are recomposed. A variety of interesting parallel operations turn out to be both associative and commutative, expressed using monoids: `Map-Reduce` and `Aggregation` in various forms such as `sum`, `variance`, `average`, `concatenation`, and more. The .NET PLINQ, for example, uses monoidal operations that are both associative and commutative to parallelize work correctly. The following code example, based on content from chapter 4, shows how to use PLINQ to parallelize the `sum` of the squares of an array segment. The data set is partitioned into subarrays that are accumulated separately on their own threads, using the accumulator initialized to the seed.
Ultimately, all accumulators will be combined using the final reduce function (the `AsParallel` function is in bold):

```
var random = new Random();
var size = 1024 * Environment.ProcessorCount;
int[] array = Enumerable.Range(0, size)
                        .Select(_ => random.Next(0, size)).ToArray();

long parallelSumOfSquares =
    array.**AsParallel**()
         .Aggregate(
             seed: 0,    ①
             updateAccumulatorFunc: (partition, value) =>
                 partition + (int)Math.Pow(value, 2),
             combineAccumulatorsFunc: (partitions, partition) =>
                 partitions + partition,
             resultSelector: result => result);
```

Despite the unpredictable order of the computation compared to the sequential version of the code, the result is deterministic because of the associativity and commutativity properties of the `+` operator.

#### Functors to map elevated types

The functor is a pattern for mapping over elevated structures, achieved by providing support for a two-parameter function called `Map` (also known as `fmap`). The `Map` function takes as a first argument the function `(T -> R)`, which in C# is translated into `Func<T, R>`. When given an input type `T`, it applies a transformation and returns a type `R`. A functor elevates functions with only one input. The LINQ/PLINQ `Select` operator can be considered a functor for the `IEnumerable` elevated type. Mainly, functors are used in C# to implement LINQ-style fluent APIs for types other than collections. In chapter 7, you implemented a functor for the `Task` elevated type (the `Map` function is in bold):

```
static Task<R> **Map**<T, R>(this Task<T> input, Func<T, R> map) =>
    input.ContinueWith(t => map(t.Result));
```

The function `Map` takes a map function (`T -> R`) and a functor (wrapped context) `Task<T>`, and returns a new functor `Task<R>` containing the result of applying the function to the value and closing it once more. The following code, from chapter 8, downloads an icon image from a given website and converts it into a bitmap. The operator `Map` is applied to chain the asynchronous computations (the code to note is in bold).

```
Bitmap icon = **await** new HttpClient()
    .GetAsync($"http://{domain}/favicon.ico")
    .Bind(**async** content => **await** content.Content.ReadAsByteArrayAsync())
    .**Map**(bytes => Bitmap.FromStream(new MemoryStream(bytes)));
```

This function has the signature `(T -> R) -> Task<T> -> Task<R>`, which means that it takes a `map` function `T -> R` as the first input, going from a value type `T` to a value type `R`; it then takes the elevated type `Task<T>` as a second input and returns a `Task<R>`. A functor is nothing more than a data structure that you can use to map functions, with the purpose of lifting values into a wrapper (elevated type), modifying them, and then putting them back into a wrapper. The reason for having `fmap` return the same elevated type is to continue chaining operations. Essentially, functors create a context, or an abstraction, that allows you to securely manipulate and apply operations to values without changing any original values.

#### Monads to compose without side effects

Monads are a powerful compositional tool used in functional programming to avoid dangerous and unwanted behaviors (side effects). They allow you to take a value and apply a series of transformations in an independent manner, encapsulating side effects. The type signature of a monadic function calls out potential side effects, providing a representation of both the result of the computation and the actual side effects that occurred as a result.
A monadic computation is represented by a generic type `M<'a>`, where the type parameter specifies the type of value (or values) produced as the result of the monadic computation (internally, the type may be a `Task` or a `List`, for example). When writing code using monadic computations, you don’t use the underlying type directly. Instead, you use the two operations that every monadic computation must provide: `Bind` and `Return`. These operations define the behavior of the monad and have the following type signatures (for a given monad of type `M<'a>`, which could be replaced with `Task<'a>`):

```
Bind   : ('a -> M<'b>) -> M<'a> -> M<'b>
Return : 'a -> M<'a>
```

The `Bind` operator takes an instance of an elevated type, extracts the underlying value from it, and runs the function over that value, returning a new elevated value:

```
Task<R> **Bind**<R, T>(this Task<T> task, Func<T, Task<R>> continuation)
```

You can see an implementation of this operator in `SelectMany`, which is built into the LINQ/PLINQ library. `Return` is an operator that lifts (wraps) any type into an elevated context (a monadic type, like `Task`), usually converting a non-monadic value into a monadic value. For example, `Task.FromResult` produces a `Task<T>` from any given type `T` (in bold):

```
Task<T> **Return**<T>(T value) => Task.FromResult(value);
```

These monadic operators are essential to LINQ/PLINQ and open the opportunity for many other operators. For example, the previous code that downloads and converts an icon from a given website into a bitmap format can be rewritten using the monadic operators (in bold) in the following manner:

```
Bitmap icon = **await** (**from** content in
                         new HttpClient().GetAsync($"http://{domain}/favicon.ico")
                     **from** bytes in content.Content.ReadAsByteArrayAsync()
                     **select** Bitmap.FromStream(new MemoryStream(bytes)));
```

The monad pattern is an amazingly versatile pattern for doing function composition with amplified types while maintaining the ability to apply functions to instances of the underlying types. Monads also provide techniques for removing repetitive and awkward code and can significantly simplify many programming problems.

## What is the importance of laws?

As you’ve seen, each of the mathematical patterns mentioned must satisfy specific laws to expose its properties, but why? The reason is that laws help you to reason about your program, providing information about the expected behavior of the type in context. Specifically, a concurrent program must be deterministic; therefore, a deterministic and predictable way to reason about the code helps to prove its correctness. If an operation is applied to combine two monoids, then you can assume, due to the monoid laws, that the computation is associative and that the result type is also a monoid. To write concurrent combinators, it’s important to trust the laws that are derived from the abstract interface, such as monads and functors.

## 10.6 The ultimate parallel composition applicative functor

At this point, I’ve discussed how a functor (`fmap`) can be used to upgrade functions with one argument to work with elevated types. You’ve also learned how the monadic `Bind` and `Return` operators are used to compose elevated types in a controlled and fluent manner. But there’s more! Let’s assume that you have a function from the *normal world*: for example, a method that processes an image to create a `Thumbnail` from a given `Bitmap` object.
How would you apply such functionality to values from the *elevated world* `Task<Bitmap>`? Here’s the function `ToThumbnail` to process a given image (the code to note is in bold):

```
Image **ToThumbnail**(Image bitmap, int maxPixels)
{
    var scaling = (bitmap.Width > bitmap.Height)
        ? maxPixels / Convert.ToDouble(bitmap.Width)
        : maxPixels / Convert.ToDouble(bitmap.Height);
    var width = Convert.ToInt32(Convert.ToDouble(bitmap.Width) * scaling);
    var height = Convert.ToInt32(Convert.ToDouble(bitmap.Height) * scaling);
    return new Bitmap(bitmap.GetThumbnailImage(width, height, null,
                                               IntPtr.Zero));
}
```

Although you can obtain a substantial number of different compositional shapes using core functions such as `map` and `bind`, there’s the limitation that these functions take only a single argument as an input. How can you integrate multiple-argument functions in your workflows, given that `map` and `bind` both take a unary function as input? The solution is *applicative functors*.

Let’s start with a problem to understand the reasons why you should apply the Applicative Functor pattern (technique). The functor has the `map` operator to upgrade functions with one and only one argument. But functions that you want to map over elevated types commonly take more than one argument, such as the previous `ToThumbnail` method, which takes an image as the first argument and the maximum size in pixels for the image transformation as the second. The problem with such functions is that they aren’t easy to elevate into other contexts. If you load an image, for simplicity using the Azure Blob storage function `DownloadImageAsync` as earlier, and later you want to apply the `ToThumbnail` transformation, then the functor `map` cannot be used, because the type signatures don’t match. `ToThumbnail` (in bold in the following listing) takes two arguments, while the `map` function takes a single-argument function as input.

Listing 10.19 Compositional limitation of the `Task` functor map

```
Task<R> map<T, R>(this Task<T> task, Func<T, R> map) =>
    task.ContinueWith(t => map(t.Result));

static async Task<Image> DownloadImageAsync(string blobReference)
{
    var container =
        await Helpers.GetCloudBlobContainerAsync().ConfigureAwait(false);
    CloudBlockBlob blockBlob = container.GetBlockBlobReference(blobReference);
    using (var memStream = new MemoryStream())
    {
        await blockBlob.DownloadToStreamAsync(memStream).ConfigureAwait(false);
        return Bitmap.FromStream(memStream);
    }
}

static async Task<Image> CreateThumbnail(string blobReference, int maxPixels)
{
    Image thumbnail = await DownloadImageAsync(blobReference)
        **.map(ToThumbnail);**    ①
    return thumbnail;
}
```

The problem with this code is that it doesn’t compile when you try to apply `ToThumbnail` through the `Task` map extension method `map(ToThumbnail)`. The compiler reports an error due to the signature mismatch. How can you apply a function to several contexts at once? How can a function that takes more than one argument be upgraded? This is where applicative functors come into play to apply a multi-parameter function over an elevated type. The following listing exploits applicative functors to compose the `ToThumbnail` and `DownloadImageAsync` functions, matching the type signatures and maintaining the asynchronous semantics (in bold).
Listing 10.20 Better composition of the asynchronous operation

```
static Func<T1, Func<T2, TR>> Curry<T1, T2, TR>(this Func<T1, T2, TR> func) =>
    p1 => p2 => func(p1, p2);

static async Task<Image> CreateThumbnail(string blobReference, int maxPixels)
{
    Func<Image, Func<int, Image>> ToThumbnailCurried =
        Curry<Image, int, Image>(ToThumbnail);    ①
    Image thumbnail =
        await **TaskEx.Pure**(ToThumbnailCurried)    ②
            .**Apply**(DownloadImageAsync(blobReference))    ③
            .**Apply**(TaskEx.Pure(maxPixels));    ③
    return thumbnail;
}
```

Let’s explore this listing for clarity. The `Curry` function is part of a helper static class used to facilitate FP in C#. In this case, the curried version of the method `ToThumbnail` is a function that takes an image as input and returns a function that takes an integer (`int`) as input, for the maximum size in pixels allowed, and produces an `Image` type as output: `Func<Image, Func<int, Image>> ToThumbnailCurried`. Then, this unary function is wrapped in the container `Task` type; overloads for greater arities can be defined by currying the function. In practice, the function that takes more than one argument, in this case `ToThumbnail`, is curried and lifted into the `Task` type using the `Pure` extension method. Then, the resulting `Task<Func<Image, Func<int, Image>>>` is passed to the applicative functor `Apply`, which applies the lifted function over the output of `DownloadImageAsync`, a `Task<Image>`. Ultimately, the last applicative functor operator `Apply` handles the remaining parameter, `maxPixels`, elevated using the `Pure` extension method.

From the perspective of the functor `map` operator, the curried function `ToThumbnailCurried` is partially applied and exercised against an `image` argument, and then wrapped into the task. Therefore, conceptually, the signature is

```
Task<ToThumbnailCurried(Image)>
```

The function `ToThumbnailCurried` takes an `image` as input and then returns the partially applied function in the form of a `Func<int, Image>` delegate, whose signature correctly matches the input of the applicative functor: `Task<Func<int, Image>>`. The `Apply` function can be viewed as partial application for elevated functions, where the next value is provided for every call in the form of an elevated (boxed) value. In this way, you can turn every argument of a function into a boxed value. The Applicative Functor pattern aims to lift a function into an elevated context and then apply a computation (transformation) to a specific elevated type. Because both the value and the function are in the same elevated context, they can be smashed together.

Let’s analyze the functions `Pure` and `Apply`. An applicative functor is a pattern implemented by two operations, defined here, where `AF` represents any elevated type (in bold):

```
**Pure**  : T -> AF<T>
**Apply** : AF<T -> R> -> AF<T> -> AF<R>
```

Intuitively, the `Pure` operator lifts a value into an elevated domain, and it’s equivalent to the `Return` monadic operator. The name `Pure` is a convention for an applicative functor definition; in the applicative case, this operator elevates a function. The `Apply` operator is a two-parameter function whose arguments are both part of the same elevated domain. From the code example in the section “Functors to map elevated types,” you can see that an applicative functor is any container (elevated type) that offers a way to transform a normal function into one that operates on contained values.
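The `TaskEx.Pure` helper used in listing 10.20 isn’t shown in the chapter; given the `Pure : T -> AF<T>` signature above and the `Return` operator defined earlier with `Task.FromResult`, a minimal sketch is presumably along these lines (the class name `TaskEx` is taken from the listing):

```
using System.Threading.Tasks;

static class TaskEx
{
    // Pure lifts a plain value (or a curried function treated as a value)
    // into the Task elevated type; it's the applicative counterpart of the
    // monadic Return operator shown earlier.
    public static Task<T> Pure<T>(T value) => Task.FromResult(value);
}
```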
Applicative functors are useful when sequencing a set of actions to run in parallel, without the need for any intermediate results. In fact, if the tasks are independent, their execution can be composed and parallelized using an applicative. An example is running a set of concurrent actions that read and transform parts of a data structure in order and then combine their results, as shown in figure 10.8.

Figure 10.8 The `Apply` operator applies the function wrapped inside an elevated type to a value in the same context. The process triggers the unwrapping of both values; then, because the first value is a function, it’s applied automatically to the second value. Finally, the output is wrapped back inside the context of the elevated type.

In the context of the `Task` elevated type, `Apply` takes a value `Task<T>` and a wrapped function `Task<(T -> R)>` (translated in C# as `Task<Func<T, R>>`), and then returns a new value `Task<R>`, generated by applying the underlying function to the value of `Task<T>`:

```
static Task<R> **Apply**<T, R>(this Task<Func<T, R>> liftedFn, Task<T> task)
{
    var tcs = new TaskCompletionSource<R>();
    liftedFn.ContinueWith(innerLiftTask =>
        task.ContinueWith(innerTask =>
            tcs.SetResult(innerLiftTask.Result(innerTask.Result))));
    return tcs.Task;
}
```

Here’s a variant of the `Apply` operator defined for `async Task` in the TAP world, implemented instead in terms of `async/await`:

```
static async Task<R> **Apply**<T, R>(this Task<Func<T, R>> f, Task<T> arg) =>
    (await f.ConfigureAwait(false))(await arg.ConfigureAwait(false));
```

Both `Apply` functions have the same behavior despite their different implementations. The first input value of `Apply` is a function wrapped into a `Task`: `Task<Func<T, R>>`. This signature may look strange initially, but remember that in FP, functions are treated as values and can be passed around in the same way as strings or integers. Now, extending the `Apply` operator to a signature that accepts more inputs becomes effortless. This function is an example:

```
static Task<Func<b, c>> **Apply**<a, b, c>(this Task<Func<a, b, c>> liftedFn,
        Task<a> input) =>
    **Apply**(liftedFn.**map**(**Curry**), input);
```

Notice that this implementation is clever: it applies the `Curry` function to the `Task<Func<a, b, c>>` `liftedFn` using the functor `map`, and then applies the result over the elevated input value using the `Apply` operator of smaller arity, as previously defined. With this technique, you can continue to expand the `Apply` operator to take as input a lifted function with any number of parameters. It turns out that functors and applicative functors work well together to facilitate composition, including the composition of expressions running in parallel. When passing a function with more than one argument to the functor `map`, the result type matches the input of the `Apply` function. Alternatively, you can implement an applicative functor in terms of the monadic operators `bind` and `return`. But this approach prevents the code from running in parallel, because the execution of an operation would depend on the outcome of the previous one. With the applicative functor in place, it’s effortless to compose a series of computations with no limit on the number of arguments each expression takes.

Let’s imagine that you need to blend two images to create a third new image, which overlays the given images in a frame of a specific size. This listing shows you how (the `Apply` function is in bold).
Listing 10.21 Parallelizing the chain of computation with applicative functors

```
static Image BlendImages(Image imageOne, Image imageTwo, Size size)
{
    var bitmap = new Bitmap(size.Width, size.Height);
    using (var graphic = Graphics.FromImage(bitmap))
    {
        graphic.InterpolationMode = InterpolationMode.HighQualityBicubic;
        graphic.DrawImage(imageOne,
            new Rectangle(0, 0, size.Width, size.Height),
            new Rectangle(0, 0, imageOne.Width, imageOne.Height),
            GraphicsUnit.Pixel);
        graphic.DrawImage(imageTwo,
            new Rectangle(0, 0, size.Width, size.Height),
            new Rectangle(0, 0, imageTwo.Width, imageTwo.Height),
            GraphicsUnit.Pixel);
        graphic.Save();
    }
    return bitmap;
}

async Task<Image> BlendImagesFromBlobStorageAsync(string blobReferenceOne,
    string blobReferenceTwo, Size size)
{
    Func<Image, Func<Image, Func<Size, Image>>> BlendImagesCurried =
        Curry<Image, Image, Size, Image>(BlendImages);
    Task<Image> imageBlended =
        TaskEx.**Pure**(BlendImagesCurried)
            .**Apply**(DownloadImageAsync(blobReferenceOne))
            .**Apply**(DownloadImageAsync(blobReferenceTwo))
            .**Apply**(TaskEx.**Pure**(size));
    return await imageBlended;
}
```

When you call `Apply` the first time, with the `DownloadImageAsync(blobReferenceOne)` task, it immediately returns a new `Task` without waiting for the `DownloadImageAsync` task to complete; consequently, the program immediately goes on to create the second `DownloadImageAsync(blobReferenceTwo)` task. As a result, both tasks run in parallel. The code assumes that all the functions have the same input and output, but this is not a constraint: as long as the output type of an expression matches the input of the next expression, the composition remains valid. Notice that in listing 10.21, each call starts independently, so the calls run in parallel, and the total execution time for `BlendImagesFromBlobStorageAsync` to complete is determined by the longest-running of the `Apply` calls. This example highlights the compositional nature of concurrent functions. Alternatively, you could implement custom methods that blend the images directly, but in the larger scheme this approach provides the flexibility to combine more sophisticated behaviors.

### 10.6.1 Extending the F# async workflow with applicative functor operators

Continuing your introduction to applicative functors, in this section you’ll apply the same concepts to extend the F# asynchronous workflow. Note that F# supports the TPL because it’s part of the .NET ecosystem, so the applicative functor concepts previously applied to the `Task` type carry over. The following listing implements the two applicative functor operators `pure` and `apply`, which are purposely defined inside an `Async` module to extend that type. Note that because `pure` is a reserved keyword for future use in F#, the compiler will give a warning.

Listing 10.22 F# async applicative functor

```
module Async =
    let **pure** value = async.Return value    ①
    let **apply** funAsync opAsync = async {
        let! funAsyncChild = Async.StartChild funAsync    ②
        let! opAsyncChild = Async.StartChild opAsync
        let! funAsyncRes = funAsyncChild
        let! opAsyncRes = opAsyncChild    ③
        return funAsyncRes opAsyncRes }
```

The `apply` function executes its two parameters, `funAsync` and `opAsync`, in parallel using the Fork/Join pattern, and then returns the result of applying the output of the first computation (a function) to the output of the second. Notice that the `apply` operator runs its arguments in parallel because each asynchronous function starts its evaluation using the `Async.StartChild` operator.
Let’s see the capabilities that these functions provide. The same applicative functor concepts introduced in C# apply here, but the compositional semantic style provided in F# is nicer. Using the F# pipe (`|>`) operator to pass the intermediate result of a function on to the next one produces more readable code. The following listing implements the same chain of functions using the applicative functor in F# to blend two images asynchronously, as shown in C# in listing 10.21. In this case, the function `blendImagesFromBlobStorage` in F# returns an `Async` type rather than a `Task` (in bold).

Listing 10.23 Parallel chain of operations with an F# async applicative functor

```
let blendImages (imageOne:Image) (imageTwo:Image) (size:Size) : Image =
    let bitmap = new Bitmap(size.Width, size.Height)
    use graphic = Graphics.FromImage(bitmap)
    graphic.InterpolationMode <- InterpolationMode.HighQualityBicubic
    graphic.DrawImage(imageOne,
        new Rectangle(0, 0, size.Width, size.Height),
        new Rectangle(0, 0, imageOne.Width, imageOne.Height),
        GraphicsUnit.Pixel)
    graphic.DrawImage(imageTwo,
        new Rectangle(0, 0, size.Width, size.Height),
        new Rectangle(0, 0, imageTwo.Width, imageTwo.Height),
        GraphicsUnit.Pixel)
    graphic.Save() |> ignore
    bitmap :> Image

let blendImagesFromBlobStorage (blobReferenceOne:string)
        (blobReferenceTwo:string) (size:Size) =
    **Async.apply**(
        **Async.apply**(
            **Async.apply**(
                **Async.``pure``** blendImages)
                (downloadOptionImage(blobReferenceOne)))
            (downloadOptionImage(blobReferenceTwo)))
        (**Async.``pure``** size)
```

The function `blendImages` is lifted into the `Async` world (elevated type) using the `Async.pure` function. The resulting function, which has the signature `Async<Image -> Image -> Size -> Image>`, is applied over the outputs of the functions `downloadOptionImage(blobReferenceOne)` and `downloadOptionImage(blobReferenceTwo)` and over the lifted value `size`; the downloads run in parallel. As mentioned earlier, functions in F# are curried by default, so the extra boilerplate required in C# isn’t necessary. Even if F# doesn’t support applicative functors as a built-in feature, it’s easy to implement the `apply` operator and exercise its compositional benefits. But this code isn’t particularly elegant, because the `apply` function calls are nested rather than chained. A better way is to create a custom infix operator.

### 10.6.2 Applicative functor semantics in F# with infix operators

A more declarative and convenient approach to writing functional composition in F# is to use custom infix operators. Unfortunately, this feature isn’t supported in C#. The support for custom infix operators means that you can define operators to achieve the desired level of precedence when operating over the arguments passed. An infix operator in F# is an operator that’s expressed using a mathematical notation called *infix notation*. For example, the multiplication operator takes two numbers that are then multiplied by each other; using infix notation, the multiplication operator is written between the two numbers it operates on. Operators are basically two-argument functions, but instead of writing a function call `multiply x y`, an infix operator is positioned between the two arguments: `x multiply y`. You’re already familiar with a few infix operators in F#: the `|>` pipe and `>>` composition operators. But according to section 3.7 of the F# language specification, you can define your own operators.
Here, an infix operator (in bold) is defined for each of the asynchronous functions `apply` and `map`:

```
let **(<*>)** = Async.apply
let **(<!>)** = Async.map
```

Using these operators, you can rewrite the previous code in a more concise manner:

```
let blendImagesFromBlobStorage (blobReferenceOne:string)
        (blobReferenceTwo:string) (size:Size) =
    blendImages **<!>** downloadOptionImage(blobReferenceOne)
                **<*>** downloadOptionImage(blobReferenceTwo)
                **<*>** Async.``pure`` size
```

In general, I recommend that you not overuse or abuse infix operators, but instead find the right balance. You can see how, in the case of functors and applicative functors, the infix operator is a welcome feature.

### 10.6.3 Exploiting heterogeneous parallel computation with applicative functors

Applicative functors lead to a powerful technique that allows you to write heterogeneous parallel computations. *Heterogeneous* means an object is composed of a series of parts of different kinds (versus *homogeneous*, of a similar kind). In the context of parallel programming, it means executing multiple operations together, even if the result type of each operation is different. For example, with the current implementations, both the F# `Async.Parallel` and the TPL `Task.WhenAll` take as an argument a sequence of asynchronous computations having the same result type.

This technique is based on the combination of applicative functors and the concept of lifting, which aims to elevate any type into a different context. This idea is applicable to values and functions; in this specific case, the target is functions with an arbitrary number of arguments of different types. To run heterogeneous parallel computations, the applicative functor `apply` operator is combined with the technique of lifting a function. This combination is then used to construct a series of helpful functions generally called `Lift2`, `Lift3`, and so forth. The `Lift` and `Lift1` operators aren’t defined, because they would be the functor `map` function. The following listing shows the implementation of the `Lift2` and `Lift3` functions in C#, which represents a transparent solution for running in parallel asynchronous operations that return heterogeneous types. These functions will be used next.

Listing 10.24 C# asynchronous lift functions

```
static Task<R> Lift2<T1, T2, R>(Func<T1, T2, R> selector, Task<T1> item1,
    Task<T2> item2)    ①
{
    Func<T1, Func<T2, R>> curry = x => y => selector(x, y);    ②
    var lifted1 = Pure(curry);    ③
    var lifted2 = Apply(lifted1, item1);    ④
    return Apply(lifted2, item2);    ④
}

static Task<R> Lift3<T1, T2, T3, R>(Func<T1, T2, T3, R> selector,
    Task<T1> item1, Task<T2> item2, Task<T3> item3)    ①
{
    Func<T1, Func<T2, Func<T3, R>>> curry = x => y => z => selector(x, y, z);    ②
    var lifted1 = Pure(curry);    ③
    var lifted2 = Apply(lifted1, item1);    ④
    var lifted3 = Apply(lifted2, item2);    ④
    return Apply(lifted3, item3);    ④
}
```

The implementation of the `Lift2` and `Lift3` functions is based on applicative functors that curry and elevate the function `selector`, enabling its application to the elevated argument types. The same concepts used to implement `Lift2` and `Lift3` apply to the F# design.
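Before moving to F#, here’s a hedged usage sketch of `Lift2` with genuinely heterogeneous result types. The two fetch methods are hypothetical placeholders, and the inline `Lift2` is a behaviorally equivalent shorthand rather than the applicative version from listing 10.24:

```
using System;
using System.Threading.Tasks;

static class Lift2Demo
{
    // Minimal Lift2, equivalent in behavior to listing 10.24: both tasks
    // are already running when awaited, so their work overlaps.
    static async Task<R> Lift2<T1, T2, R>(Func<T1, T2, R> selector,
        Task<T1> item1, Task<T2> item2) =>
        selector(await item1.ConfigureAwait(false),
                 await item2.ConfigureAwait(false));

    // Hypothetical async sources with heterogeneous result types.
    static async Task<decimal> GetBalanceAsync()
    { await Task.Delay(100); return 1000m; }
    static async Task<double> GetPriceAsync(string symbol)
    { await Task.Delay(100); return 42.5; }

    // Combine a Task<decimal> and a Task<double> with a plain
    // two-argument function; the two fetches run concurrently because
    // they're started before Lift2 awaits either of them.
    public static Task<int> SharesAffordableAsync(string symbol) =>
        Lift2((decimal balance, double price) =>
                  (int)(((double)balance * 0.75) / price),
              GetBalanceAsync(), GetPriceAsync(symbol));
}
```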
But due to the intrinsic functional features of the language, and the conciseness provided by infix operators, the implementation of the lift functions (in bold) in F# is compact:

```
let **lift2** (func:'a -> 'b -> 'c) (asyncA:Async<'a>) (asyncB:Async<'b>) =
    func **<!>** asyncA **<*>** asyncB

let **lift3** (func:'a -> 'b -> 'c -> 'd) (asyncA:Async<'a>)
        (asyncB:Async<'b>) (asyncC:Async<'c>) =
    func **<!>** asyncA **<*>** asyncB **<*>** asyncC
```

Due to the F# type inference system, the input values are wrapped into an `Async` type, and the compiler can infer that the infix operators `<!>` and `<*>` are the functor and applicative functor operators in the context of the `Async` elevated type. Also, note that it’s a convention in F# to start module-level functions with a lowercase letter.

### 10.6.4 Composing and executing heterogeneous parallel computations

What can you do with these functions in place? Let’s analyze an example that exploits these operators. Imagine you’re tasked to write a simple program to validate the decision to buy stock options, based on conditions set by analyzing market trends and the history of the stocks. The program should be divided into three operations:

1. Check the total amount available for purchase, based on the bank account’s available balance and the current price of the stock:
   1. Fetch the bank account balance.
   2. Fetch the stock price from the stock market.
2. Validate whether a given stock symbol is recommended to buy:
   1. Analyze the market indexes.
   2. Analyze the historical trend of the given stock.
3. Given a stock ticker symbol, decide whether to buy a certain number of stock options, based on the money available calculated in step 1.

The next listing shows the asynchronous functions that implement the program, which ideally should be combined (in bold). Certain implementation details are omitted because they’re irrelevant for this example.

Listing 10.25 Asynchronous operations to compose and run in parallel

```
let calcTransactionAmount amount (price:float) =    ①
    let readyToInvest = amount * 0.75
    let cnt = Math.Floor(readyToInvest / price)
    if (cnt < 1e-5) && (price < amount) then 1
    else int(cnt)

let rnd = Random()
let mutable bankAccount = 500.0 + float(rnd.Next(1000))

let getAmountOfMoney() = async {    ②
    return bankAccount }

let getCurrentPrice symbol = async {    ③
    let! (_, data) = processStockHistory symbol    ⑧
    return data.[0].open' }

let getStockIndex index =    ④
    async {
        let url =
            sprintf "http://download.finance.yahoo.com/d/quotes.csv?s=%s&f=snl1" index
        let req = WebRequest.Create(url)
        let! resp = req.AsyncGetResponse()
        use reader = new StreamReader(resp.GetResponseStream())
        return! reader.ReadToEndAsync() }
    |> Async.map (fun (row:string) ->
        let items = row.Split(',')
        Double.Parse(items.[items.Length - 1]))
    |> AsyncResult.handler    ⑤

let analyzeHistoricalTrend symbol = **asyncResult** {    ⑥
    let! data = getStockHistory symbol (365/2)
    let trend = data.[data.Length - 1] - data.[0]
    return trend }

let withdraw amount = async {    ⑦
    return
        if amount > bankAccount then
            Error(InvalidOperationException("Not enough money"))
        else
            bankAccount <- bankAccount - amount
            Ok(true) }
```

Each operation runs asynchronously and evaluates to a result of a different type.
Respectively, the function `calcTransactionAmount` returns a hypothetical cost for the trade (buy) transaction, the function `analyzeHistoricalTrend` returns the value of the stock’s historical analysis that’s used to evaluate whether the stock option is a recommended buy, the function `getStockIndex` returns the current value of the given stock index, and the function `getCurrentPrice` returns the last stock price. How would you compose and run these computations in parallel using, for example, a Fork/Join pattern, when the result types aren’t the same? A simple solution would be to spawn an independent task for each function, wait for all the tasks to complete, and then pass the results into a final function that aggregates them and continues the work. But it would be much nicer to glue all these functions together using a more generic combinator, which promotes reusability and, of course, better compositionality with a set of polymorphic tools.

The following listing applies the technique of running heterogeneous computations in parallel, using the `lift2` function in F# to evaluate how many stock options are recommended to buy after running a few simple diagnostics asynchronously (in bold).

Listing 10.26 Running heterogeneous asynchronous operations

```
let howMuchToBuy stockId : **AsyncResult**<int> =
    Async.**lift2** (calcTransactionAmount)    ①
        (getAmountOfMoney())
        (getCurrentPrice stockId)
    |> **AsyncResult.handler**    ②

let analyze stockId =    ③
    **howMuchToBuy** stockId
    |> Async.**StartCancelable**(function    ④
        | Ok(total) -> printfn "I recommend to buy %d units" total
        | Error(e) -> printfn "I do not recommend buying now")
```

`howMuchToBuy` takes a stock ticker symbol and produces an `AsyncResult<int>`. The result type comes from the output of the underlying function `calcTransactionAmount`: the `AsyncResult<int>` represents either the success of the operation, carrying the number of stock options to buy, or a failure (don’t buy). The argument `stockId` is an arbitrary stock ticker symbol to analyze. The `howMuchToBuy` function uses the `lift2` operator and waits, without blocking, for the two underlying async expressions (`getAmountOfMoney` and `getCurrentPrice`) to complete. The `analyze` function executes `howMuchToBuy` to collect and output the recommended result. In this case, the execution is performed asynchronously using the `Async.StartCancelable` function defined in section 9.3.5.

One of the many benefits of using applicatives, functors, monads, and combinators is that they’re reproducible, common patterns, regardless of the technology used. This creates a shared vocabulary that makes it easy for developers to communicate and express the intention of the code.

### 10.6.5 Controlling flow with conditional asynchronous combinators

In general, it’s common to implement combinators by gluing other combinators together. Once you have a set of operators that can represent any arbitrary asynchronous operation, you can easily design new combinators over the type that allow you to combine and compose asynchronous operations in myriad different and sophisticated ways. There are limitless possibilities and opportunities to customize asynchronous combinators to respond to your needs. You could implement an asynchronous combinator that emulates an `if-else` statement, equivalent to imperative conditional logic, but how? The solution is found in the functional patterns:

* Monoids can be used to create the `Or` combinator.
* Applicative functors can be used to create the `And` combinator.
* Monads chain asynchronous operations and glue combinators together.

In this section, you’re going to define a few conditional asynchronous combinators and consume them to understand what capabilities they offer with limited effort. In fact, using the combinators introduced so far, it’s simply a matter of composing them to achieve different behaviors. Furthermore, in the case of F# infix operators, it’s easy to use the feature to elevate and operate on functions inline, avoiding the need for intermediate functions. For example, you’ve defined functions such as `lift2` and `lift3`, through which it’s possible to apply heterogeneous parallel computation. You can abstract the combination notion away into conditional operators such as `IF`, `AND`, and `OR`. The following listing shows a few combinators that apply to the F# asynchronous workflow. Semantically, they’re concise and easy to compose due to the functional properties of the language. But the same concepts can be ported into C# effortlessly, or perhaps by using the interoperability option (the code to note is in bold).

Listing 10.27 Async-workflow conditional combinators

```
module AsyncCombinators =
    let inline **ifAsync** (predicate:**Async**<bool>) (funcA:Async<'a>)
            (funcB:Async<'a>) =
        **async.Bind**(predicate, fun p -> if p then funcA else funcB)

    let inline **iffAsync** (predicate:Async<'a -> bool>) (context:Async<'a>) =
        async {
            let! p = predicate <*> context
            return if p then Some context else None }

    let inline **notAsync** (predicate:**Async**<bool>) =
        **async.Bind**(predicate, not >> async.Return)

    let inline **AND** (funcA:Async<bool>) (funcB:Async<bool>) =
        ifAsync funcA funcB (async.Return false)

    let inline **OR** (funcA:**Async**<bool>) (funcB:**Async**<bool>) =
        ifAsync funcA (async.Return true) funcB

    let **(<&&>)** (funcA:Async<bool>) (funcB:Async<bool>) = AND funcA funcB
    let **(<||>)** (funcA:Async<bool>) (funcB:Async<bool>) = OR funcA funcB
```

The `ifAsync` combinator takes an asynchronous predicate and two arbitrary asynchronous operations as arguments; only one of those computations runs, according to the outcome of the predicate. This is a useful pattern for branching the logic of your asynchronous program without leaving the asynchronous context. The `iffAsync` combinator takes a HOF condition that verifies the given context. If the condition holds true, it asynchronously returns the context; otherwise it returns `None` asynchronously. The combinators from the previous code may be applied in any combination before execution starts, and they act as syntactic sugar, by which the code looks the same as in the sequential case. Let’s analyze these logical asynchronous combinators in more detail for a better understanding of how they work. This knowledge is key to building your own custom combinators.

#### The AND logical asynchronous combinator

The asynchronous `AND` combinator returns the result after both functions `funcA` and `funcB` complete. This behavior is similar to `Task.WhenAll`, but it runs the first expression and waits for the result, then calls the second one and combines the results. If the first evaluation is canceled, fails, or returns `false`, then the other function doesn’t run, applying short-circuit logic. Conceptually, the `Task.WhenAll` operator previously described is a good fit to perform a logical `AND` over multiple asynchronous operations.
That operator takes either a sequence of tasks or a variable number of tasks, and returns a single `Task` that fires when all the arguments are ready. The `AND` operator can be combined into chains as long as all the operations return the same type. Of course, it can be generalized and extended using applicative functors. Unless the functions have side effects, the result is deterministic and independent of order, so the operations can run in parallel.

#### The OR logical asynchronous combinator

The asynchronous `OR` combinator works like an addition operator with a monoid structure, which means that the operations must be associative. The `OR` combinator starts two asynchronous operations in parallel, waiting for the first one to complete. The same properties of the `AND` combinator apply here. The `OR` combinator can be combined into chains, but the result isn’t deterministic unless both function evaluations return the same value or both are canceled. A combinator that acts like a logical `OR` of two asynchronous operations can be implemented using the `Task.WhenAny` operator, which starts the computations in parallel and picks the one that finishes first. This is also the basis of speculative computation, where you pit several algorithms against each other.

The same approach for building `Async` combinators can be applied to the `AsyncResult` type, which provides a more powerful way to define generic operations where the output depends on the success of the underlying operations. In other words, `AsyncResult` acts as a two-state flag, which can represent either a failed or a successful operation, where the latter provides the final value. Here are a few examples of `AsyncResult` combinators (in bold).

Listing 10.28 `AsyncResult` conditional combinators

```
module AsyncResultCombinators =
    let inline **AND** (funcA:**AsyncResult**<'a>) (funcB:**AsyncResult**<'a>)
            : **AsyncResult**<_> = **asyncResult** {
        let! a = funcA
        let! b = funcB
        return (a, b) }

    let inline **OR** (funcA:**AsyncResult**<'a>) (funcB:**AsyncResult**<'a>)
            : **AsyncResult**<'a> = **asyncResult** {
        return! funcA
        return! funcB }

    let **(<&&>)** (funcA:AsyncResult<'a>) (funcB:AsyncResult<'a>) = AND funcA funcB
    let **(<||>)** (funcA:AsyncResult<'a>) (funcB:AsyncResult<'a>) = OR funcA funcB

    let **(<|||>)** (funcA:**AsyncResult**<bool>) (funcB:**AsyncResult**<bool>) =
        **asyncResult** {
            let! rA = funcA
            match rA with
            | true -> return! funcB
            | false -> return false }

    let **(<&&&>)** (funcA:**AsyncResult**<bool>) (funcB:**AsyncResult**<bool>) =
        **asyncResult** {
            let! (rA, rB) = funcA <&&> funcB
            return rA && rB }
```

The `AsyncResult` combinators, compared to the `Async` combinators, expose logical asynchronous `AND` and `OR` operators that perform conditional dispatch over generic types instead of `bool` types. Here’s a comparison between the `AND` operators implemented for `Async` and `AsyncResult`:

```
let inline **AND** (funcA:Async<bool>) (funcB:Async<bool>) =
    ifAsync funcA funcB (async.Return false)

let inline **AND** (funcA:**AsyncResult**<'a>) (funcB:**AsyncResult**<'a>)
        : **AsyncResult**<_> = **asyncResult** {
    let! a = funcA
    let! b = funcB
    return (a, b) }
```

The `AsyncResult` `AND` uses the `Result` discriminated union to treat the `Ok` (success) case as the true value, which is carried over to the output of the underlying function.

### 10.6.6 Asynchronous combinators in action

In listing 10.26, a stock ticker symbol was analyzed and a recommendation on whether to buy the given stock was computed asynchronously.
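As noted in section 10.6.5, these concepts port to C# effortlessly. Here’s a hedged sketch of what the conditional combinators might look like over `Task<bool>`. Because C# tasks are hot (already running), the second operand is taken as a `Func<Task<bool>>` so the short-circuit behavior of the F# version is preserved:

```
using System;
using System.Threading.Tasks;

static class AsyncConditionals
{
    // IfAsync branches on an asynchronous predicate without leaving
    // the asynchronous context.
    public static async Task<T> IfAsync<T>(Task<bool> predicate,
        Func<Task<T>> whenTrue, Func<Task<T>> whenFalse) =>
        await predicate.ConfigureAwait(false)
            ? await whenTrue().ConfigureAwait(false)
            : await whenFalse().ConfigureAwait(false);

    // AND short-circuits: the second operation starts only if the
    // first one evaluates to true.
    public static Task<bool> AndAsync(this Task<bool> funcA,
        Func<Task<bool>> funcB) =>
        IfAsync(funcA, funcB, () => Task.FromResult(false));

    // OR short-circuits: the second operation starts only if the
    // first one evaluates to false.
    public static Task<bool> OrAsync(this Task<bool> funcA,
        Func<Task<bool>> funcB) =>
        IfAsync(funcA, () => Task.FromResult(true), funcB);
}
```

In F#, `Async` values are cold (they don’t run until started), so listing 10.27 can pass `Async<'a>` values directly and still get deferred execution; the `Func<Task<bool>>` indirection is the C# equivalent of that deferral.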
Now you need to add the conditional `if-else` check, which behaves asynchronously using the `ifAsync` combinator: if the stock option is recommended to buy, then proceed with the transaction; otherwise, return an error message. The code to note is in bold.

Listing 10.29 Conditional asynchronous combinators in action

```
let gt (value:'a) (ar:AsyncResult<'a>) = **asyncResult** {    ①
    let! result = ar
    return result > value }

let doInvest stockId =
    let shouldIBuy =    ②
        ((getStockIndex "^IXIC" |> gt 6200.0)
            **<|||>**    ③
         (getStockIndex "^NYA" |> gt 11700.0))
        **<&&&>**
        ((analyzeHistoricalTrend stockId) |> gt 10.0)    ④
        |> AsyncResult.defaultValue false    ⑤

    let buy amount = async {    ⑥
        let! price = getCurrentPrice stockId
        let! result = withdraw (price * float(amount))
        return result |> Result.bimap
                            (fun x -> if x then amount else 0)
                            (fun _ -> 0) }    ⑦

    AsyncComb.ifAsync shouldIBuy    ⑧
        (buy **<!>** (howMuchToBuy stockId))    ⑨
        (Async.retn <| Error(Exception("Do not do it now")))    ⑩
    |> AsyncResult.handler    ⑪
```

In this code example, the `doInvest` function analyzes a given stock symbol, its historical trend, and the current stock market to recommend a trading transaction. The `doInvest` function combines asynchronous functions that operate as a whole to determine the recommendation. The function `shouldIBuy` applies the asynchronous logical `OR` operator to check whether either the ^IXIC or the ^NYA index is greater than a given threshold. The result is used as a base value to evaluate whether the current stock market is favorable for buying. If the result of the `shouldIBuy` function is successful (true), the asynchronous logical `AND` operator proceeds, executing the `analyzeHistoricalTrend` function, which returns the historical trend analysis of the given stock. Next, the `buy` function verifies that the bank account balance is sufficient to buy the desired stock options; it returns zero if the balance is too low. Ultimately, these functions are combined. The `ifAsync` combinator runs `shouldIBuy` asynchronously; according to its output, the code branches to either proceed with a buy transaction or return an error message. The purpose of the `map` infix operator `(<!>)` is to lift the function `buy` into the `AsyncResult` elevated type, which is then executed against the number of stocks recommended for purchase, calculated by the function `howMuchToBuy`.

## Summary

* Exposing your intent is crucial if you want to increase the readability of your code. Introducing the `Result` type helps show whether a method failed or succeeded, removes unnecessary boilerplate code, and results in a clean design.
* The `Result` type gives you an explicit, functional way to handle errors without introducing side effects (unlike throwing/catching exceptions), which leads to expressive and readable code implementations.
* When you consider the execution semantics of your code, `Result` and `Option` fill a similar role, accounting for anything other than the happy path when code executes. `Result` is the best type to use when you want to represent and preserve an error that can occur during execution. `Option` is better when you wish to represent the existence or absence of a value, or when you want consumers to account for an error but you don’t care about preserving that error.
* FP unmasks patterns that ease the composition of asynchronous operations through the support of mathematical patterns.
For example, applicative functors, which are amplified functors, can combine functions with multiple arguments directly over elevated types.
* Asynchronous combinators can be used to control the asynchronous execution flow of a program, including conditional logic. It’s effortless to compose a few asynchronous combinators to construct more sophisticated ones, such as the asynchronous versions of the `AND` and `OR` operators.
* F# has support for infix operators, which can be customized to produce a convenient set of operators. These operators simplify the programming style, making it easy to construct sophisticated chains of operations concisely.
* Applicatives and functors can be combined to lift conventional functions, whose execution against elevated types can be performed without leaving the context. This technique allows you to run a set of heterogeneous functions in parallel and evaluate their outputs as a whole.
* Using core functional functions, such as `Bind`, `Return`, `Map`, and `Apply`, makes it straightforward to define rich behavior that composes, runs in parallel, and performs applications in an elevated world, mimicking conditional logic such as `if-else`.

# 11 Applying reactive programming everywhere with agents

**This chapter covers**

* Using the message-passing concurrent model
* Handling millions of messages per second
* Using the agent programming model
* Parallelizing a workflow and coordinating agents

Web applications play an important role in our lives, from large social networks and media streaming to online banking systems and collaborative online gaming. Certain websites now handle as much traffic as the entire internet did less than a decade ago. Facebook and Twitter, two of the most popular websites, have billions of users between them. To ensure that these applications thrive, concurrent connections, scalability, and distributed systems are essential. Traditional architectures from years past cannot operate under this high volume of requests; high-performance computing is becoming a necessity. The message-passing concurrent programming model is the answer to this demand, as evidenced by the increasing support for the message-passing model in mainstream languages such as Java, C#, and C++.

The number of concurrent online connections will certainly continue to grow. The trend is shifting toward physical devices that are interconnected, generating sophisticated and massive networks constantly operating and exchanging messages. It’s predicted that the *Internet of Things* (IoT) will expand to an installed base of 75 billion units by 2025 ([`mng.bz/wiwP`](http://mng.bz/wiwP)). The continual evolution of devices connected online is inspiring a revolution in how developers design the next generation of applications. These new applications will have to be non-blocking, fast, and capable of reacting to high volumes of system notifications; events will control the execution of reactive applications. You’ll need highly available and resource-efficient applications able to adapt to this rapid evolution and respond to an ever-increasing volume of internet requests. The event-driven and asynchronous paradigms are the primary architectural requirements for developing such applications. In this context, you’ll need asynchronous programming processed in parallel.
This chapter is about developing responsive and reactive systems, starting with the exceptional message-passing programming model, a general-purpose concurrent model with particularly wide applicability. The message-passing programming model has several commonalities with the microservices architecture ([`microservices.io/`](http://microservices.io/)). You'll use the agent-based concurrent programming style, which relies on message passing as a vehicle to communicate between small units of computation called *agents*. Each agent may own an internal state, with single-threaded access to guarantee thread safety without the need of any lock (or any other synchronization primitive). Because agents are easy to understand, programming with them is an effective tool for building scalable and responsive applications that ease the implementation of advanced asynchronous logic. By the end of this chapter, you'll know how to use asynchronous message-passing semantics to simplify and improve the responsiveness and performance of your applications. (If you are shaky on asynchronicity, review chapters 8 and 9.)

Before we plunge into the technical aspects of the message-passing architecture and the agent model, let's look at the reactive system, with an emphasis on the properties that make an application valuable in the reactive paradigm.

## 11.1 What's reactive programming, and how is it useful?

*Reactive programming* is a set of design principles used in asynchronous programming to create cohesive systems that respond to commands and requests in a timely manner. It is a way of thinking about systems' architecture and design in a distributed environment where implementation techniques, tooling, and design patterns are components of a larger whole: a system. Here, an application is divided into multiple distinct steps, each of which can be executed in an asynchronous and non-blocking fashion. Execution threads that compete for shared resources are free to perform other useful work while the resource is occupied, instead of idling and wasting computational power. In 2013, reactive programming became an established paradigm with a formalized set of rules under the umbrella of the Reactive Manifesto ([www.reactivemanifesto.org/](http://www.reactivemanifesto.org/)), which describes the constituent parts that make up a reactive system. The Reactive Manifesto outlines patterns for implementing robust, resilient, and responsive systems. The reason behind the Reactive Manifesto is the recent change in application requirements (table 11.1).

Table 11.1 Comparison between the requirements for past and present applications

| **Past requirements for applications** | **Present requirements for applications** |
| --- | --- |
| Single processors | Multicore processors |
| Expensive RAM | Cheap RAM |
| Expensive disk memory | Cheap disk memory |
| Slow networks | Fast networks |
| Low volume of concurrent requests | High volume of concurrent requests |
| Small data | Big data |
| Latency measured in seconds | Latency measured in milliseconds |

In the past, you might have had only a few services running on your applications, with ample response time and time available for systems to be offline for maintenance. Today, applications are deployed over thousands of services, and each can run on multiple cores. Additionally, users expect response times in milliseconds, as opposed to seconds, and anything less than 100% uptime is unacceptable.
The Reactive Manifesto seeks to solve these problems by asking developers to create systems that have four properties. They must be responsive (react to users), resilient (react to failure), message-driven (react to events), and scalable (react to load). Figure 11.1 illustrates these properties and how they relate to each other.

Figure 11.1 According to the Reactive Manifesto, for a system to be called reactive, it must have four properties: it must be responsive (react to users), resilient (react to failure), message-driven (react to events), and scalable (react to load).

A system built using the manifesto's requirements will:

* Have a consistent response time regardless of the workload undertaken.
* Respond in a timely fashion, regardless of the volume of requests coming in. This ensures that the user isn't spending significant amounts of time idly waiting for operations to complete, thereby providing a positive user experience.

This responsiveness is possible because reactive programming optimizes the use of the computing resources on multicore hardware, leading to better performance. Asynchronicity is one of the key elements of reactive programming. Chapters 8 and 9 cover the APM and how it plays an important role in building scalable systems. In chapter 14, you'll build a complete server-side application that fully embraces this paradigm.

A message-driven architecture is the foundation of reactive applications. *Message-driven* means that reactive systems are built on the premise of asynchronous message passing; furthermore, with a message-driven architecture, components can be loosely coupled. The primary benefit of reactive programming is that it removes the need for explicit coordination between active components in a system, simplifying the approach to asynchronous computation.

## 11.2 The asynchronous message-passing programming model

In a typical synchronous application, you sequentially perform an operation with a request/response model of communication, using a procedure call to retrieve data or modify a state. This pattern is limited due to a blocking programming style and a design that cannot be scaled or performed out of sequence. A message-passing-based architecture is a form of asynchronous communication where data is queued, to be processed at a later stage if necessary. In the context of reactive programming, the message-passing architecture uses asynchronous semantics to communicate between the individual parts of the system. As a result, it can handle millions of messages per second, producing an incredible boost to performance (figure 11.2).

Figure 11.2 Synchronous (blocking) communication is resource inefficient and easily bottlenecked. The asynchronous message-passing (reactive) approach reduces blocking risks, conserves valuable resources, and requires less hardware/infrastructure.

The idea of message-passing concurrency is based on lightweight units of computation (or processes) that have exclusive ownership of state. The state, by design, is protected and unshared, which means it can be either mutable or immutable without running into any pitfalls due to a multithreaded environment (see chapter 1). In a message-passing architecture, two entities run in separate threads: the sender of a message and the receiver of the message. The benefit of this programming model is that all issues of memory sharing and concurrent access are hidden inside the communication channel.
Neither entity involved in the communication needs to apply any low-level synchronization strategies, such as locking. The message-passing architecture (message-passing concurrent model) doesn't communicate by sharing memory, but instead communicates by sending messages.

Asynchronous message passing decouples communication between entities and allows senders to send messages without waiting for their receivers. No synchronization is necessary between senders and receivers for message exchange, and both entities can run independently. Keep in mind that the sender cannot know when a message is received and handled by the recipient. The message-passing concurrent model can at first appear more complicated than sequential or even parallel systems, as you'll see in the comparison in figure 11.3 (the squares represent objects, and arrows represent a method call or a message).

Figure 11.3 Comparison between task-based, sequential, and agent-based programming. Each block represents a unit of computation.

In figure 11.3, each block represents a unit of work:

* Sequential programming is the simplest, with a single input and a single output using a single control flow, where the blocks are connected directly in a linear fashion, each task dependent on the completion of the previous task.
* Task-based programming is similar to the sequential programming model, but it may split and merge the control flow, as in Fork/Join or MapReduce.
* Message-passing programming may control the execution flow because the blocks are interconnected with other blocks in a continuous and direct manner. Ultimately, each block sends messages directly to other blocks, non-linearly.

This design can seem complex and difficult to understand at first. But because blocks are encapsulated into active objects, each message is passed independent of other messages, with no blocking or lag time. With the message-passing concurrent model, you can have multiple building blocks, each with an independent input and output, which can be connected. Each block runs in isolation, and once isolation is achieved, it's possible to deploy the computation into different tasks. We'll spend the rest of the chapter on agents as the main tool for building message-passing concurrent models.

### 11.2.1 Relation between message passing and immutability

By this point, it should be clear that immutability ensures increased degrees of concurrency. (Remember, an immutable object is an object whose state cannot be modified after it's created.) Immutability is a foundational tool for building concurrent, reliable, and predictable programs. But it isn't the only tool that matters. Natural isolation is also critically important, perhaps more so, because it's easier to achieve in programming languages that don't support immutability intrinsically. It turns out that agents enforce coarse-grained isolation through message passing.

### 11.2.2 Natural isolation

Natural isolation is a critically important concept for writing lockless concurrent code. In a multithreaded program, isolation solves the problem of shared state by giving each thread a copied portion of data to perform local computation. With isolation, there's no race condition, because each task processes an independent copy of its own data. The natural isolation, or share-nothing, approach is less complex to achieve than immutability. The two options are orthogonal approaches and should be used in conjunction to reduce runtime overhead and avoid race conditions and deadlocks.
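To make the idea concrete, here's a minimal sketch (illustrative only, not from the book's listings) in which each asynchronous task clones the source array and mutates only its private copy, so no locks or other synchronization primitives are needed:

```
// Each task copies the shared source and works on its own private copy:
// mutation is safe because nothing is shared between the tasks.
let source = [| 1 .. 1000 |]

let tasks =
    [ for factor in 1 .. 4 ->
        async {
            let local = Array.copy source            // isolated copy
            for i in 0 .. local.Length - 1 do
                local.[i] <- local.[i] * factor      // local mutation only
            return Array.sum local } ]

let sums = tasks |> Async.Parallel |> Async.RunSynchronously
```

Because every task owns its data, the four computations run in parallel with no possibility of a race condition.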
## 11.3 What is an agent?

An *agent* is a single-threaded unit of computation used to design concurrent applications based on message passing in isolation (a share-nothing approach). These agents are lightweight constructs that contain a queue and can receive and process messages. In this case, lightweight means that agents have a small memory footprint as compared to spawning new threads, so you can easily spin up 100,000 agents on one computer without a hitch.

Think of an agent as a process that has exclusive ownership of some mutable state, which can never be accessed from outside of the agent. Although agents run concurrently with each other, *within* a single agent everything is sequential. The isolation of the agent's internal state is a key concept of this model: because the state is completely inaccessible from the outside world, it is thread safe. Indeed, if state is isolated, mutation can happen freely.

An agent's basic functionality is to do the following:

* Maintain a private state that can be accessed safely in a multithreaded environment
* React to messages differently in different states
* Notify other agents
* Expose events to subscribers
* Send a reply to the sender of a message

One of the most important features of agent programming is that messages are sent asynchronously, so the sender doesn't block. When a message is sent to an agent, it is placed in a mailbox. The agent processes one message at a time, sequentially, in the order in which the messages were added to the mailbox, moving on to the next message only when it has finished processing the current one. While an agent processes a message, the other incoming messages aren't lost, but are buffered into the internal isolated mailbox. Consequently, multiple agents can run in parallel effortlessly, which means that the performance of a well-written agent-based application scales with the number of cores or processors.

### 11.3.1 The components of an agent

Figure 11.4 shows the fundamental component parts of an agent:

* *Mailbox*—An internal queue to buffer incoming messages, implemented as asynchronous, race-free, and non-blocking.
* *Behavior*—The internal function applied sequentially to each incoming message. The behavior is single-threaded.
* *State*—Agents can have an internal state that's isolated and never shared, so they never need to compete for locks to be accessed.
* *Message*—Agents can communicate only through messages, which are sent asynchronously and are buffered in a mailbox.

Figure 11.4 An agent consists of a mailbox that queues the incoming messages, a state, and a behavior that runs in a loop, processing one message at a time. The behavior is the functionality applied to the messages.

### 11.3.2 What an agent can do

The agent programming model provides great support for concurrency and has an extensive range of applicability. Agents are used in data collection and mining, reducing application bottlenecks by buffering requests, real-time analysis with bounded and unbounded reactive streaming, general-purpose number crunching, machine learning, simulation, the Master/Worker pattern, Compute Grid, MapReduce, gaming, and audio and video processing, to mention a few.

### 11.3.3 The share-nothing approach for lock-free concurrent programming

The share-nothing architecture refers to message-passing programming, where each agent is independent and there's no single point of contention across the system. This architecture model is great for building concurrent and safe systems.
If you don't share anything, then there's no opportunity for race conditions. Isolated message-passing blocks (agents) are a powerful and efficient technique for implementing scalable programming algorithms, including scalable request servers and scalable distributed-programming algorithms. The simplicity and intuitive behavior of the agent as a building block allows for designing and implementing elegant, highly efficient asynchronous and parallel applications that don't share state. In general, agents perform calculations in reaction to the messages they receive, and they can send messages to other agents in a fire-and-forget manner or collect the responses, called *replies* (figure 11.5).

Figure 11.5 Agents communicate with each other through message-passing semantics, creating an interconnected system of units of computation that run concurrently. Each agent has an isolated state and independent behavior.

### 11.3.4 How is agent-based programming functional?

Certain aspects of agent-based programming aren't functional. Although agents (and actors) were developed in the context of functional languages, their purpose is to generate side effects, which is against the tenets of FP. An agent often performs a side effect, or sends a message to another agent, which will, in turn, perform a new side effect. Less important, but worth mentioning, is that FP in general separates logic from data, whereas agents contain both data and the logic of the processing function. Additionally, sending a message to an agent doesn't force any constraint on the return type. An agent behavior, which is the operation applied to each message, can either return a result or not return any result. In the latter scenario, the design of a message sent in a fire-and-forget fashion encourages programming agents in a unidirectional flow pattern, which means that the messages flow forward from one agent to the next. This unidirectional message flow between agents can preserve their compositional semantics, achieved by linking a given set of agents. The result is a pipeline of agents that represents the steps of operations to process the messages, each executed independently and potentially in parallel.

The primary reason that the agent model is functional is that agents can *send behavior to the state instead of sending state to the behavior*. In the agent model, the sender, besides sending messages, can provide the function that implements an action to process the incoming messages. Agents are an in-memory slot where you can put a data structure, such as a bucket (container). In addition to providing data storage, agents allow you to send messages in the shape of a function, which is then applied atomically to the internal bucket. The function can be composed from other functions and then sent to the agent as a message. The advantage is the ability to update and change behavior at runtime using functions and function composition, fitting with the functional paradigm.

### 11.3.5 Agent is object-oriented

It's interesting to note that Alan Kay's ([`en.wikipedia.org/wiki/Alan_Kay`](https://en.wikipedia.org/wiki/Alan_Kay)) original vision for objects in Smalltalk is much closer to the agent model than it is to the objects found in most programming languages (the basic concept of "messaging," for example). Kay believed that state changes should be encapsulated and not done in an unconstrained way. His idea of passing messages between objects is intuitive and helps to clarify the boundaries between objects.
Clearly, message passing resembles OOP, where sending a message is just calling a method. Here, an agent is like an object in an object-oriented program, because it encapsulates state and communicates with other agents by exchanging messages.

## 11.4 The F# agent: MailboxProcessor

The support for the APM in F# doesn't stop with asynchronous workflows (introduced in chapter 9). Additional support is provided inherently by the F# programming language, including `MailboxProcessor`, a primitive type that behaves as a lightweight in-memory message-passing agent (see figure 11.6). `MailboxProcessor` works completely asynchronously and provides a simple concurrent programming model that can deliver fast and reliable concurrent programs. I could write an entire book about `MailboxProcessor`, its multipurpose uses, and the flexibility that it provides for building a wide range of diverse applications. The benefits of using it include having a dedicated and isolated message queue combined with an asynchronous handler, which is used to throttle the message processing to automatically and transparently optimize the usage of the computer's resources.

Figure 11.6 `MailboxProcessor` (agent) waits asynchronously for incoming messages in the `while` loop. The messages are strings representing URLs, which are applied to the internal behavior to download the related website.

The following listing shows a simple code example using a `MailboxProcessor`, which receives an arbitrary URL and prints the size of the website.

Listing 11.1 Simple `MailboxProcessor` with a `while` loop

```
type Agent<'T> = MailboxProcessor<'T>

let webClientAgent =
    Agent<string>.Start(fun inbox -> async {   ①
        while true do
            let! message = inbox.Receive()   ②
            use client = new WebClient()
            let uri = Uri message
            let! site = client.AsyncDownloadString(uri)   ③
            printfn "Size of %s is %d" uri.Host site.Length })

webClientAgent.Post "http://www.google.com"   ④
webClientAgent.Post "http://www.microsoft.com"   ④
```

Let's look at how to construct an agent in F#. First, there must be a name for the instance. In this case, `webClientAgent` is the address of the mailbox processor; this is how you'll post a message to be processed. The `MailboxProcessor` is generally initialized with the `MailboxProcessor.Start` shortcut method, though you can create an instance by invoking the constructor directly and then run the agent using the instance method `Start`. To simplify the name and use of the `MailboxProcessor`, you establish the alias `Agent` and then start the agent with `Agent.Start`. Next, there's a lambda function with an inbox containing an asynchronous workflow. Each message sent to the mailbox processor is sent asynchronously. The body of the agent functions as a message handler that accepts a mailbox (`inbox:MailboxProcessor`) as an argument. This mailbox has a running logical thread that controls a dedicated and encapsulated message queue, which is thread safe, to use and coordinate the communication with other threads or agents. The mailbox runs asynchronously, using the F# asynchronous workflow, and it can contain long-running operations that don't block a thread. In general, messages need to be processed in order, so there must be a loop. This example uses a non-functional `while-true`-style loop. It's perfectly fine to use this or to use a functional, recursive loop.
The agent in Listing 11.1 starts receiving and processing messages by calling the asynchronous function `inbox.Receive()` using the `let!` construct inside an imperative `while` loop. Inside the loop is the heart of the mailbox processor. The call to the mailbox's `Receive` function waits for an incoming message without blocking the actual thread and resumes once a message is received. The use of the `let!` operator ensures that the computation is started immediately. Then the first message available is removed from the mailbox queue and is bound to the message identifier. At this point, the agent reacts by processing the message, which in this example downloads and prints the size of a given website address. If the mailbox queue is empty and there are no messages to process, then the agent frees the thread back to the thread-pool scheduler. That means no threads are idle while `Receive` waits for incoming messages, which are sent to the `MailboxProcessor` in a fire-and-forget fashion using the `Post` method.

### 11.4.1 The mailbox asynchronous recursive loop

In the previous example, the agent mailbox waits for messages asynchronously using an imperative `while` loop. Let's modify the imperative loop so it uses functional recursion, avoiding mutation and, as a bonus, holding local state. The following listing is a version of the agent from listing 11.1 that also counts its messages, this time using a recursive asynchronous function that maintains state.

Listing 11.2 Simple `MailboxProcessor` with a recursive function

```
let agent =
    Agent<string>.Start(fun inbox ->
        let rec loop count = async {   ①
            let! message = inbox.Receive()
            use client = new WebClient()
            let uri = Uri message
            let! site = client.AsyncDownloadString(uri)
            printfn "Size of %s is %d - total messages %d" uri.Host
                site.Length (count + 1)
            return! loop (count + 1) }   ②
        loop 0)

agent.Post "http://www.google.com"
agent.Post "http://www.microsoft.com"
```

This functional approach is a little more advanced, but it greatly reduces the amount of explicit mutation in your code and is often more general. In fact, as you'll see shortly, you can use the same strategy to maintain and safely reuse state for caching. Pay close attention to the line of code `return! loop (count + 1)`, where the function uses asynchronous workflows recursively to execute the loop, passing the increased value of the count. The call using `return!` is tail-recursive, which means that the compiler translates the recursion more efficiently, avoiding stack-overflow exceptions. See chapter 3 for more details about recursive function support (also in C#).
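Here's a minimal sketch of that caching idea (illustrative only; `cachingAgent` isn't one of the book's listings, and it assumes the `Agent` alias from listing 11.1 plus `open System` and `open System.Net`): the recursive loop threads a `Map` through each iteration, so the agent remembers the size of every site it has already downloaded.

```
let cachingAgent =
    Agent<string>.Start(fun inbox ->
        let rec loop (cache : Map<string, int>) = async {
            let! url = inbox.Receive()
            match cache.TryFind url with
            | Some size ->
                // served from the local, isolated state: no download needed
                printfn "Size of %s is %d (cached)" url size
                return! loop cache
            | None ->
                use client = new WebClient()
                let! site = client.AsyncDownloadString(Uri url)
                printfn "Size of %s is %d" url site.Length
                return! loop (cache.Add(url, site.Length)) }
        loop Map.empty)

cachingAgent.Post "http://www.google.com"
cachingAgent.Post "http://www.google.com"   // second request hits the cache
```

Because the cache is just an argument of the recursive loop, it's confined to the agent and needs no locks; section 11.5.8 builds a full reusable cache component on the same principle.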
## 11.5 Avoiding database bottlenecks with F# MailboxProcessor

The core feature of most applications is database access, which is frequently the real source of bottlenecks in code. Simple database performance tuning can speed up applications significantly and keep the server responsive. How do you guarantee consistently high-throughput database access? To better facilitate database access, the operation should be asynchronous, because of the I/O nature of database access. Asynchronicity ensures that the server can handle multiple requests in parallel. You may wonder about the number of parallel requests that a database server can handle before performance degrades (figure 11.7 shows performance degradation at a high level). No exact answer exists. It depends on many different factors: for example, the size of the database connection pool.

A critical element of the bottleneck problem is controlling and throttling the incoming requests to maximize the application's performance. `MailboxProcessor` provides a solution by buffering the incoming messages and taming a possible overflow of requests (see figure 11.8). Using `MailboxProcessor` as a mechanism to throttle database operations provides granular control for optimizing use of the database connection pool. For example, the program could add or remove agents to execute the database operations with a precise degree of parallelism.

Figure 11.7 A large number of concurrent requests to access the database are reduced due to the limited size of the connection pool.

Figure 11.8 The agent (`MailboxProcessor`) controls the incoming requests to optimize use of the database connection pool.

Listing 11.3 shows a fully asynchronous function in F#. This function queries a given database and encapsulates the query in a `MailboxProcessor` body. Encapsulating an operation as the behavior of an agent ensures that only one database request at a time is processed. To access the database, use the traditional ADO.NET data access. Alternatively, you could use Microsoft Entity Framework or any other data-access technology you choose. I don't cover how to access the Entity Framework data-access component in this book. For more detail, refer to the MSDN online documentation at [`mng.bz/4sdU`](http://mng.bz/4sdU).

Listing 11.3 Using `MailboxProcessor` to manage database calls

```
type Person = { id:int; firstName:string; lastName:string; age:int }   ①

type SqlMessage =
    | Command of id:int * AsyncReplyChannel<Person option>   ②

let agentSql connectionString =
    fun (inbox: MailboxProcessor<SqlMessage>) ->
        let rec loop() = async {
            let! Command(id, reply) = inbox.Receive()   ③
            use conn = new SqlConnection(connectionString)
            use cmd = new SqlCommand("Select FirstName, LastName, Age
                                      from db.People where id = @id")
            cmd.Connection <- conn
            cmd.CommandType <- CommandType.Text
            cmd.Parameters.Add("@id", SqlDbType.Int).Value <- id
            if conn.State <> ConnectionState.Open then
                do! conn.OpenAsync()   ④
            use! reader = cmd.ExecuteReaderAsync(   ⑤
                              CommandBehavior.SingleResult |||
                              CommandBehavior.CloseConnection)
            let! canRead = (reader:SqlDataReader).ReadAsync()
            if canRead then
                let person =
                    { id = reader.GetInt32(0)
                      firstName = reader.GetString(1)
                      lastName = reader.GetString(2)
                      age = reader.GetInt32(3) }
                reply.Reply(Some person)   ⑥
            else reply.Reply(None)   ⑦
            return! loop() }
        loop()

type AgentSql(connectionString:string) =
    let agentSql = MailboxProcessor<SqlMessage>.Start(agentSql connectionString)
    member this.ExecuteAsync (id:int) =
        agentSql.PostAndAsyncReply(fun ch -> Command(id, ch))   ⑧
    member this.ExecuteTask (id:int) =
        agentSql.PostAndAsyncReply(fun ch -> Command(id, ch))
        |> Async.StartAsTask   ⑧
```

Initially, the `Person` data structure is defined as a record type, which can be consumed easily as an immutable class by any .NET programming language. The function `agentSql` defines the body of a `MailboxProcessor`, whose behavior receives messages and performs database queries asynchronously. You make your application more robust by using an `Option` type for the `Person` value, which would otherwise be `null`; doing so helps prevent null-reference exceptions. The type `AgentSql` encapsulates the `MailboxProcessor`, which originates from running the function `agentSql`. Access to the underlying agent is exposed through the methods `ExecuteAsync` and `ExecuteTask`.
The purpose of the `ExecuteTask` method is to encourage interoperability with C#. You can compile the `AgentSql` type into an F# library and distribute it as a reusable component. If you want to use the component from C#, then you should also provide methods that return a `Task` or `Task<T>` for the F# functions that run an asynchronous workflow object (`Async<'T>`). How to interop between F# `Async` and .NET `Task` types is covered in appendix C.

### 11.5.1 The MailboxProcessor message type: discriminated unions

The type `SqlMessage`, with its single `Command` case, is a single-case DU used to send a message to the `MailboxProcessor` with a well-defined type, which can be pattern matched:

```
type SqlMessage =
    | Command of id:int * AsyncReplyChannel<Person option>
```

A common F# practice is to use a DU to define the different types of messages that a `MailboxProcessor` can receive and pattern match them to deconstruct and obtain the underlying data structure (for more on F#, see appendix B). Pattern matching over DUs gives a succinct way to process messages. A common pattern is to call `inbox.Receive()` or `inbox.TryReceive()` and follow that call with a match on the message contents. Using strongly typed messages makes it possible for the `MailboxProcessor` behavior to distinguish between different types of messages and to supply different handling code associated with each type of message.

### 11.5.2 MailboxProcessor two-way communication

In Listing 11.3, the underlying `MailboxProcessor` returns (replies) to the caller the result of the database query in the shape of a `Person` option type. This communication uses the `AsyncReplyChannel<'T>` type, which defines the mechanism used to reply through the channel parameter established during message initialization (figure 11.9).

Figure 11.9 The agent two-way communication generates an `AsyncReplyChannel`, which is used by the agent as a callback to notify the caller when the computation is completed, generally supplying a result.

The code that waits asynchronously for a response uses the `AsyncReplyChannel`. Once the computation is complete, use the `Reply` function to return the results from the mailbox:

```
type SqlMessage =
    | Command of id:int * AsyncReplyChannel<Person option>

member this.ExecuteAsync (id:int) =
    agentSql.PostAndAsyncReply(fun ch -> Command(id, ch))
```

The `PostAndAsyncReply` method initializes the channel for the `Reply` logic, handing the reply channel to the agent as part of the message using an anonymous lambda (function). At this point, the workflow is suspended (without blocking) until the operation completes and a `Reply`, carrying the result, is sent back to the caller by the agent through the channel:

```
reply.Reply(Some person)
```

As good practice, you should embed the `AsyncReplyChannel` handler inside the message itself, as shown in the DU case `Command of id:int * AsyncReplyChannel<Person option>`, because a reply to the sent message can then be enforced by the compiler. You might be thinking: Why would you use a `MailboxProcessor` to handle multiple requests if only one message at a time can be processed? Are the incoming messages lost if the `MailboxProcessor` is busy? Sending messages to a `MailboxProcessor` is always non-blocking; but from the agent's perspective, receiving them is a blocking operation. Even if you're posting multiple messages to the agent, none of the messages will get lost, because they're buffered and inserted into the mailbox queue.
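Here's a minimal, self-contained sketch of this pattern (the `CounterMessage` type and `counter` agent are illustrative, not from the book's listings): a multi-case DU defines the message protocol, `Receive` is followed by a match, and a reply channel embedded in one of the cases enables two-way communication.

```
type CounterMessage =
    | Increment of int
    | Reset
    | Current of AsyncReplyChannel<int>

let counter =
    MailboxProcessor<CounterMessage>.Start(fun inbox ->
        let rec loop total = async {
            let! msg = inbox.Receive()
            match msg with
            | Increment n -> return! loop (total + n)   // update isolated state
            | Reset -> return! loop 0
            | Current reply ->
                reply.Reply total                       // two-way communication
                return! loop total }
        loop 0)

counter.Post (Increment 5)
counter.Post (Increment 2)   // buffered if the agent is busy; never lost
let total = counter.PostAndReply Current   // 7
```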
It's also possible to implement selective-receive semantics to target and *scan* ([`mng.bz/1lJr`](http://mng.bz/1lJr)) for exact message types; depending on the agent behavior, the handler can wait for a specific message in the mailbox and temporarily defer others. This is a technique used to implement a finite-state machine with pause-and-resume capabilities.

### 11.5.3 Consuming the AgentSql from C#

At this point, you want to employ the `AgentSql` so it can be consumed by other languages. The exposed APIs are both C# `Task` and F# asynchronous-workflow friendly. Using C#, it's simple to employ `AgentSql`. After referencing the F# library containing the `AgentSql`, you can create an instance of the object and then call the `ExecuteTask` method:

```
AgentSql agentSql = new AgentSql("<<ConnectionString Here>>");
Person person = await agentSql.ExecuteTask(42);
Console.WriteLine($"Fullname {person.firstName} {person.lastName}");
```

`ExecuteTask` returns a `Task<Person>`, so you can use the C# async/await model to extract the underlying value when the operation completes as a continuation. You can use a similar approach in F#, which also supports the task-based programming model, although due to F#'s intrinsic and superior support for the async workflow, I recommend that you use the `ExecuteAsync` method. In this case, you can either call the method inside an async computation expression, or call it by using the `Async.StartWithContinuations` function. With this function, a continuation handler can continue the work when the `AgentSql` replies with the result (see chapter 9). The following listing is an example using both F# approaches.

Listing 11.4 Interacting asynchronously with `AgentSql`

```
let token = CancellationToken()   ①
let agentSql = AgentSql("< Connection String Here >")

let printPersonName id = async {
    let! (Some person) = agentSql.ExecuteAsync id   ②
    printfn "Fullname %s %s" person.firstName person.lastName }

Async.Start(printPersonName 42, token)   ③

Async.StartWithContinuations(agentSql.ExecuteAsync 42,   ④
    (fun (Some person) ->
        printfn "Fullname %s %s" person.firstName person.lastName),   ⑤
    (fun exn -> printfn "Error: %s" exn.Message),   ⑤
    (fun cnl -> printfn "Operation cancelled"),   ⑤
    token)
```

The `Async.StartWithContinuations` function specifies the code to run when the job completes as a continuation. `Async.StartWithContinuations` accepts three different continuation functions that are triggered with the output of the operation:

* The code to run when the operation completes successfully and a result is available
* The code to run when an exception occurs
* The code to run when an operation is canceled

The cancellation token is passed as an optional argument when you start the job. See chapter 9 or the MSDN documentation online for more information ([`mng.bz/teA8`](http://mng.bz/teA8)). `Async.StartWithContinuations` isn't complicated, and it provides convenient control over dispatching behaviors in the case of success, error, or cancellation. The functions passed are referred to as *continuation functions*, and they can be specified as lambda expressions in the arguments to `Async.StartWithContinuations`. Specifying code to run as a simple lambda expression is extremely powerful.
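Returning to the selective-receive idea mentioned in section 11.5.2, here's a minimal sketch (the `Message` type and agent below are illustrative, not from the book's listings) using the built-in `MailboxProcessor.Scan` method, which examines queued messages in arrival order and defers, rather than consumes, any message for which the scanner function returns `None`:

```
type Message =
    | Start
    | Work of int

let scanAgent =
    MailboxProcessor<Message>.Start(fun inbox ->
        let rec waitingToStart () =
            // Messages that produce None stay buffered in the mailbox
            // until the agent transitions into the running state.
            inbox.Scan(function
                | Start -> Some(async { return! running 0 })
                | _ -> None)
        and running total = async {
            let! msg = inbox.Receive()
            match msg with
            | Work n -> return! running (total + n)
            | Start -> return! running total }
        waitingToStart ())

scanAgent.Post (Work 1)   // deferred: the agent is waiting for Start
scanAgent.Post Start      // Start is consumed, then Work 1 is processed
```

This is exactly the pause-and-resume, finite-state-machine style mentioned above: the agent pauses on messages it isn't ready for and resumes them later.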
### 11.5.4 Parallelizing the workflow with group coordination of agents

The main reason to have an agent process the messages to access a database is to control the throughput and to properly optimize use of the connection pool. How can you achieve this fine control of parallelism? How can a system perform multiple requests in parallel without encountering a decrease in performance? `MailboxProcessor` is a primitive type that's flexible for building reusable components by encapsulating behavior and then exposing general or tailored interfaces that fit your program's needs. Listing 11.5 shows a reusable component, `parallelWorker`, that spawns a set of agents from a given count (`workers`). Here, each agent implements the same behavior and processes the incoming requests in a round-robin fashion. *Round-robin* is a scheduling algorithm that, in this case, the parent agent uses to dispatch incoming messages to its children first-come first-served, in circular order, without giving any child particular priority.

Listing 11.5 Parallel `MailboxProcessor` workers

```
type MailboxProcessor<'a> with
    static member public parallelWorker
            (workers:int,   ①
             behavior:MailboxProcessor<'a> -> Async<unit>,   ②
             ?errorHandler:(exn -> unit),
             ?cts:CancellationToken) =
        let cts = defaultArg cts (CancellationToken())   ③
        let errorHandler = defaultArg errorHandler ignore   ③
        let agent = new MailboxProcessor<'a>((fun inbox ->
            let agents = Array.init workers (fun _ ->   ④
                let child = MailboxProcessor.Start(behavior, cts)
                child.Error.Subscribe(errorHandler) |> ignore   ⑤
                child)
            cts.Register(fun () ->
                agents |> Array.iter (fun a ->   ⑥
                    (a :> IDisposable).Dispose())) |> ignore
            let rec loop i = async {
                let! msg = inbox.Receive()
                agents.[i].Post(msg)   ⑦
                return! loop ((i + 1) % workers) }
            loop 0), cts)
        agent.Start()
        agent
```

The main (coordinator) agent initializes a collection of sub-agents to coordinate the work and provides access to its children through itself. When the parent agent receives a message sent to the `parallelWorker` `MailboxProcessor`, it dispatches the message to the next available child agent (figure 11.10).

Figure 11.10 The parallel worker agent receives the messages, which are sent to the children agents in a round-robin fashion to compute the work in parallel.

The `parallelWorker` function uses a feature called *type extensions* ([`mng.bz/Z5q9`](http://mng.bz/Z5q9)) to attach a behavior to the `MailboxProcessor` type. A type extension is similar to an extension method. With this type extension, you can call the `parallelWorker` function using dot notation; as a result, the `parallelWorker` function can be used and called by any other .NET programming language, keeping its implementation hidden. The arguments of this function are as follows (a usage sketch follows this list):

* `workers`—The number of parallel agents to initialize.
* `behavior`—The function that the underlying agents identically implement.
* `errorHandler`—The function that each child agent subscribes to, to handle eventual errors. This is an optional argument and can be omitted; in that case, an ignore function is used.
* `cts`—A cancellation token used to stop and dispose of all the children agents. If a cancellation token isn't passed as an argument, a default is initialized and passed into the agent constructor.
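Here's a minimal usage sketch (illustrative only, assuming the `parallelWorker` extension from listing 11.5 is in scope): four identical worker agents share the incoming messages in round-robin order.

```
let echoBehavior (inbox: MailboxProcessor<string>) = async {
    while true do
        let! msg = inbox.Receive()
        printfn "processing %s" msg }

let workers =
    MailboxProcessor<string>.parallelWorker(4, echoBehavior,
        (fun ex -> printfn "worker error: %s" ex.Message))

// Messages are posted to the parent, which forwards them to
// children 0, 1, 2, 3, 0, ... in circular order.
["a"; "b"; "c"; "d"; "e"] |> List.iter workers.Post
```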
### 11.5.5 How to handle errors with F# MailboxProcessor

Internally, the `parallelWorker` function creates an instance of the `MailboxProcessor` agent, which is the parent coordinator of the array of child agents, equal in number to the value of the `workers` argument:

```
let agents = Array.init workers (fun _ ->
    let child = MailboxProcessor.Start(behavior, cts)
    child.Error.Subscribe(errorHandler) |> ignore
    child)
```

During the initialization phase, each child agent subscribes to its error event using the function `errorHandler`. In the case of an exception thrown from the body of a `MailboxProcessor`, the error event triggers and applies the subscribed function. Detecting and notifying the system in case of errors is essential in agent-based programming, because it lets you apply logic to react accordingly. The `MailboxProcessor` has built-in functionality for detecting and forwarding errors. When an uncaught error occurs in a `MailboxProcessor` agent, the agent raises the error event:

```
let child = MailboxProcessor.Start(behavior, cts)
child.Error.Subscribe(errorHandler)
```

To manage the error, you can register a callback function with the event handler. It's common practice to forward the errors to a supervising agent. For example, here a simple supervisor agent displays the error received:

```
let supervisor =
    Agent<System.Exception>.Start(fun inbox -> async {
        while true do
            let! err = inbox.Receive()
            printfn "an error occurred in an agent: %A" err })
```

You can define the error-handler function that's passed as an argument to initialize all the child agents:

```
let handler = fun error -> supervisor.Post error

let agents = Array.init workers (fun _ ->
    let child = MailboxProcessor.Start(behavior, cts)
    child.Error.Subscribe(handler) |> ignore
    child)
```

In critical application components, such as server-side requests represented as agents, you should plan to use the `MailboxProcessor` to handle errors gracefully and restart the application appropriately. To facilitate error handling by notifying a supervisor agent, it's convenient to define a helper function:

```
module Agent =
    let withSupervisor (supervisor: Agent<exn>) (agent: Agent<_>) =
        agent.Error.Subscribe(fun error -> supervisor.Post error) |> ignore
        agent
```

`withSupervisor` abstracts the registration for error handling into a reusable component. Using this helper function, you can rewrite the previous portion of code that registers error handling for the `parallelWorker`, as shown here:

```
let supervisor =
    Agent<System.Exception>.Start(fun inbox -> async {
        while true do
            let! error = inbox.Receive()
            errorHandler error })

let agent = new MailboxProcessor<'a>((fun inbox ->
    let agents = Array.init workers (fun _ ->
        MailboxProcessor.Start(behavior)
        |> Agent.withSupervisor supervisor)
    // the rest of parallelWorker is unchanged
```

The `parallelWorker` encapsulates the supervisor agent, which uses the `errorHandler` function as its behavior to handle the error messages from the children agents.
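To see the supervision in action, here's a minimal sketch (illustrative names, not from the book's listings; it assumes the `Agent` alias and the `Agent.withSupervisor` helper defined above): a child agent whose body throws on a particular message, with the failure forwarded to the supervisor through the `Error` event.

```
let supervisor =
    Agent<exn>.Start(fun inbox -> async {
        while true do
            let! err = inbox.Receive()
            printfn "an error occurred in an agent: %s" err.Message })

let fragileAgent =
    Agent<string>.Start(fun inbox -> async {
        while true do
            let! msg = inbox.Receive()
            if msg = "boom" then failwith "something went wrong"
            printfn "processed %s" msg })
    |> Agent.withSupervisor supervisor

fragileAgent.Post "hello"
fragileAgent.Post "boom"   // the exception kills the agent's loop and
                           // triggers Error, which notifies the supervisor
```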
### 11.5.6 Stopping MailboxProcessor agents—CancellationToken

To instantiate the child agents, use the `MailboxProcessor` constructor that takes a function parameter as the behavior of the agent and takes as a second argument a `CancellationToken` object. The `CancellationToken` registers a function to dispose of and stop all the running agents. This function is executed when the `CancellationToken` is canceled:

```
cts.Register(fun () ->
    agents |> Array.iter (fun a -> (a :> IDisposable).Dispose()))
|> ignore
```

Each child `MailboxProcessor` that's part of the `parallelWorker` agent, when running, is represented by an asynchronous operation associated with the given `CancellationToken`. Cancellation tokens are convenient when multiple agents depend on each other and you want to cancel all of them at once, as in our example. A further refinement is to encapsulate the `MailboxProcessor` agent in a disposable type:

```
type AgentDisposable<'T>(f:MailboxProcessor<'T> -> Async<unit>,
                         ?cancelToken:CancellationTokenSource) =
    let cancelToken = defaultArg cancelToken (new CancellationTokenSource())
    let agent = MailboxProcessor.Start(f, cancelToken.Token)

    member x.Agent = agent

    interface IDisposable with
        member x.Dispose() =
            (agent :> IDisposable).Dispose()
            cancelToken.Cancel()
```

In this way, the `AgentDisposable` facilitates the cancellation and the memory deallocation (`Dispose`) of the underlying `MailboxProcessor` by calling the `Dispose` method of the `IDisposable` interface. Using the `AgentDisposable`, you can rewrite the previous portion of code that registers the cancellation of the child agents for `parallelWorker`:

```
let agents = Array.init workers (fun _ ->
    let agent = new AgentDisposable<'a>(behavior, cancelToken)
    agent.Agent |> Agent.withSupervisor supervisor |> ignore
    agent)

cancelToken.Token.Register(fun () ->
    agents |> Array.iter (fun agent -> (agent :> IDisposable).Dispose()))
|> ignore
```

When the cancellation token is triggered, the `Dispose` method of all the child agents is called, causing them to stop. You can find the full implementation of the refactored `parallelWorker` in this book's source code.
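Here's a minimal usage sketch of `AgentDisposable` (illustrative only; it assumes `open System` and `open System.Threading`): disposing the wrapper both disposes the underlying `MailboxProcessor` and cancels its token.

```
let tokenSource = new CancellationTokenSource()

let echo =
    new AgentDisposable<string>((fun inbox -> async {
        while true do
            let! msg = inbox.Receive()
            printfn "got %s" msg }), tokenSource)

echo.Agent.Post "hello"

// Stop the agent deterministically: Dispose releases the mailbox
// and cancels the associated CancellationTokenSource.
(echo :> IDisposable).Dispose()
```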
### 11.5.7 Distributing the work with MailboxProcessor

The rest of the code is self-explanatory. When a message is posted to the `parallelWorker`, the parent agent picks it up and forwards it to the first agent in line. The parent agent uses a recursive loop to maintain, as state, the index of the last agent served. During each iteration, the index is increased so that the next available message is delivered to the next agent:

```
let rec loop i = async {
    let! msg = inbox.Receive()
    agents.[i].Post(msg)
    return! loop ((i + 1) % workers) }
```

You can use the `parallelWorker` component in a wide range of cases. For the previous `AgentSql` code example, you can apply the `parallelWorker` extension to reach the original goal of having control (management) over the number of parallel requests that can access the database server, optimizing connection-pool consumption.

Listing 11.6 Using `parallelWorker` to parallelize database reads

```
let connectionString =   ①
    ConfigurationManager.ConnectionStrings.["DbConnection"].ConnectionString

let maxOpenConnection = 10   ②

let agentParallelRequests =
    MailboxProcessor<SqlMessage>.parallelWorker(maxOpenConnection,   ③
                                                agentSql connectionString)

let fetchPeopleAsync (ids:int list) =
    let asyncOperation =   ④
        ids
        |> Seq.map (fun id ->
            agentParallelRequests.PostAndAsyncReply(fun ch -> Command(id, ch)))
        |> Async.Parallel

    Async.StartWithContinuations(asyncOperation,
        (fun people ->
            people
            |> Array.choose id
            |> Array.iter (fun person ->
                printfn "Fullname %s %s" person.firstName person.lastName)),
        (fun exn -> printfn "Error: %s" exn.Message),
        (fun cnl -> printfn "Operation cancelled"))
```

In this example, the maximum number of open connections is arbitrary; in a real case, this value varies. In this code, you first create the `MailboxProcessor` `agentParallelRequests`, which runs `maxOpenConnection` agents in parallel. The function `fetchPeopleAsync` is the final piece that glues together all the parts. The argument passed into this function is a list of people IDs to fetch from the database. Internally, the function applies the `agentParallelRequests` agent to each of the IDs to generate a collection of asynchronous operations that run in parallel using the `Async.Parallel` function. In the example, the people IDs are retrieved in parallel; a more efficient approach would be to create a `SqlCommand` that fetches the data in one database round trip, but the point of the example still stands. The level of parallelism is controlled by the number of agents, which is an effective technique. In this book's source code, you can find a complete and enhanced production-ready `parallelWorker` component that you can reuse in your daily work.

### 11.5.8 Caching operations with an agent

In the previous section, you used the F# `MailboxProcessor` to implement a performant and asynchronous database-access agent that could control the throughput of parallel operations. To take this a step further and improve the response time (speed) for incoming requests, you can reduce the actual number of database queries. This is possible with the introduction of a database cache in your program. There's no reason why a single query should be executed more than once per request if the result won't change. By applying smart caching strategies in database access, you can unlock a significant increase in performance. Let's implement an agent-based reusable cache component, which can then be linked to the `agentParallelRequests` agent. The cache agent's objective is to isolate and store the state of the application while handling the messages to read or update this state. This listing shows the implementation of the `MailboxProcessor` cache agent.

Listing 11.7 Cache agent using the `MailboxProcessor`

```
type CacheMessage<'Key> =
    | GetOrSet of 'Key * AsyncReplyChannel<obj>
    | UpdateFactory of Func<'Key, obj>
    | Clear   ①

type Cache<'Key when 'Key : comparison>
        (factory : Func<'Key, obj>, ?timeToLive : int) =   ②
    let timeToLive = defaultArg timeToLive 1000
    let expiry = TimeSpan.FromMilliseconds (float timeToLive)   ③

    let cacheAgent = Agent.Start(fun inbox ->
        let cache = Dictionary<'Key, (obj * DateTime)>(HashIdentity.Structural)   ④
        let rec loop (factory:Func<'Key, obj>) = async {
            let! msg = inbox.TryReceive timeToLive   ⑤
            match msg with
            | Some (GetOrSet (key, channel)) ->
                match cache.TryGetValue(key) with   ⑥
                | true, (v, dt) when DateTime.Now - dt < expiry ->   ⑥
                    channel.Reply v
                    return! loop factory
                | _ ->
                    let value = factory.Invoke(key)   ⑥
                    channel.Reply value
                    cache.Add(key, (value, DateTime.Now))
                    return! loop factory
            | Some (UpdateFactory newFactory) ->   ⑦
                return! loop newFactory
            | Some Clear ->
                cache.Clear()
                return! loop factory
            | None ->
                cache
                |> Seq.filter (function KeyValue(_, (_, dt)) ->
                    DateTime.Now - dt > expiry)
                |> Seq.toArray
                |> Array.iter (function KeyValue(k, _) ->
                    cache.Remove(k) |> ignore)
                return! loop factory }
        loop factory)

    member this.TryGet<'a>(key : 'Key) = async {
        let! item = cacheAgent.PostAndAsyncReply(   ⑧
                        fun channel -> GetOrSet(key, channel))
        match item with
        | :? 'a as v -> return Some v
        | _ -> return None }

    member this.GetOrSetTask (key : 'Key) =
        cacheAgent.PostAndAsyncReply(fun channel -> GetOrSet(key, channel))
        |> Async.StartAsTask   ⑨

    member this.UpdateFactory (factory:Func<'Key, obj>) =
        cacheAgent.Post(UpdateFactory(factory))   ⑦
```

In this example, the first type, `CacheMessage`, is the definition of the message that is sent to the `MailboxProcessor`, in the form of a DU. This DU determines the valid messages to send to the cache agent.

The core of the cache-agent implementation is to initialize and immediately start a `MailboxProcessor` that constantly watches for incoming messages. The constructs of F# make it easy to use lexical scoping to achieve isolation within asynchronous agents. This agent code uses the standard, mutable .NET dictionary collection to maintain the state originated by the different messages sent to the agent:

```
let cache = Dictionary<'Key, (obj * DateTime)>()
```

The internal dictionary is lexically private to the asynchronous agent, and no ability to read/write the dictionary is made available other than through the agent. The mutable state in the dictionary is isolated. The agent function is defined as a recursive function `loop` that takes a single parameter, `factory`, as shown here:

```
Agent.Start(fun inbox ->
    let rec loop (factory:Func<'Key, obj>) = async { ... }
```

The factory function represents the initialization policy used to create and add an item when it isn't found by the cache agent in the local state cache. This factory function is continuously passed into the recursive loop for state management, which allows you to swap the initialization procedure at runtime. In the case of caching the `AgentSql` requests, if the database or the system goes offline, then the response strategy can change. This is easily achieved by sending a message to the agent. The agent receives messages with the `TryReceive` semantics of the `MailboxProcessor`, which takes a timeout, used here to specify the expiration time. This is particularly useful for caching components to provoke a data invalidation, and then a data refresh:

```
let! msg = inbox.TryReceive timeToLive
```

The `TryReceive` function of the inbox returns a message option type, which is either `Some`, when a message is received before the `timeToLive` time elapses, or `None`, when no message is received during the `timeToLive` time:

```
| None ->
    cache
    |> Seq.filter (function KeyValue(_, (_, dt)) -> DateTime.Now - dt > expiry)
    |> Seq.toArray
    |> Array.iter (function KeyValue(k, _) -> cache.Remove(k) |> ignore)
```

In this case, when the timeout expires, the agent auto-refreshes the cached data by automatically invalidating (removing) all the cache items that have expired. But if a message is received, the agent uses pattern matching to determine the message type so that the appropriate processing can be done. Here's the range of capabilities for incoming messages:

* `GetOrSet`—In this case, the agent searches the cache dictionary for an entry that contains the specified key. If the agent finds the key and the invalidation time hasn't expired, then it returns the associated value. Otherwise, if the agent doesn't find the key or the invalidation time has expired, then it applies the factory function to generate a new value, which is stored in the local cache together with the timestamp of its creation. The timestamp is used by the agent to verify the expiration time.
Finally, the agent returns the result to the sender of the message:

```
| Some (GetOrSet (key, channel)) ->
    match cache.TryGetValue(key) with
    | true, (v, dt) when DateTime.Now - dt < expiry ->
        channel.Reply v
        return! loop factory
    | _ ->
        let value = factory.Invoke(key)
        channel.Reply value
        cache.Add(key, (value, DateTime.Now))
        return! loop factory
```

* `UpdateFactory`—This message type, as already explained, allows the handler to swap the runtime initialization policy for the cache items:

```
| Some (UpdateFactory newFactory) ->
    return! loop newFactory
```

* `Clear`—This message type clears the cache to reload all items.

Ultimately, here's the code that links the previous parallel `AgentSql` `agentParallelRequests` to the cache agent:

```
let connectionString =
    ConfigurationManager.ConnectionStrings.["DbConnection"].ConnectionString

let agentParallelRequests =
    MailboxProcessor<SqlMessage>.parallelWorker(8, agentSql connectionString)

let cacheAgentSql =
    let ttl = 60000
    Cache<int>((fun id ->
        agentParallelRequests.PostAndAsyncReply(fun ch -> Command(id, ch))), ttl)

let person = cacheAgentSql.TryGet<Person> 42
```

When the `cacheAgentSql` agent receives the request, it checks whether a value for the key 42 exists in the cache and whether it's expired. Otherwise, it interrogates the underlying `parallelWorker` to return the expected item and saves it into the cache to speed up future requests (see figure 11.11).

Figure 11.11 The cache agent maintains a local cache composed of key/value pairs, which associate an input (from a request) with a value. When a request arrives, the cache agent verifies the existence of the input/key and then either returns the value (if the input/key already exists in the local cache) without running any computation, or it calculates the value to send to the caller. In the latter case, the value is also persisted in the local cache to avoid repeated computation for the same inputs.

### 11.5.9 Reporting results from a MailboxProcessor

Sometimes the `MailboxProcessor` needs to report a state change to the system, where a subscribed component handles the state change. For example, to make the cache-agent example more complete, you could extend it to include such features as notification when data changes or when there's a cache removal. But how does a `MailboxProcessor` report notifications to the outside system? This is accomplished by using events (Listing 11.8). You've already seen how the `MailboxProcessor` reports when an internal error occurs, by triggering a notification to all of its subscribers. You can apply the same design to report any other arbitrary events from the agent. Using the previous cache agent, let's implement event reporting that can be used to notify subscribers when data invalidation occurs.
For the example, you'll modify the agent to auto-refresh expired items and notify subscribers when data has changed.

Listing 11.8 Cache with event notification for refreshed items

```
type Cache<'Key when 'Key : comparison>
        (factory : Func<'Key, obj>, ?timeToLive : int,
         ?synchContext : SynchronizationContext) =
    let timeToLive = defaultArg timeToLive 1000
    let expiry = TimeSpan.FromMilliseconds (float timeToLive)

    let cacheItemRefreshed = Event<('Key * obj)[]>()   ①

    let reportBatch items =   ②
        match synchContext with
        | None -> cacheItemRefreshed.Trigger(items)   ③
        | Some ctx ->
            ctx.Post((fun _ -> cacheItemRefreshed.Trigger(items)), null)   ④

    let cacheAgent = Agent.Start(fun inbox ->
        let cache = Dictionary<'Key, (obj * DateTime)>(HashIdentity.Structural)
        let rec loop (factory:Func<'Key, obj>) = async {
            let! msg = inbox.TryReceive timeToLive
            match msg with
            | Some (GetOrSet (key, channel)) ->
                match cache.TryGetValue(key) with
                | true, (v, dt) when DateTime.Now - dt < expiry ->
                    channel.Reply v
                    return! loop factory
                | _ ->
                    let value = factory.Invoke(key)
                    channel.Reply value
                    reportBatch [| (key, value) |]   ⑤
                    cache.Add(key, (value, DateTime.Now))
                    return! loop factory
            | Some (UpdateFactory newFactory) ->
                return! loop newFactory
            | Some Clear ->
                cache.Clear()
                return! loop factory
            | None ->
                cache
                |> Seq.toArray
                |> Array.choose (fun (KeyValue(k, (_, dt))) ->
                    if DateTime.Now - dt > expiry then
                        let value, dt = factory.Invoke(k), DateTime.Now
                        cache.[k] <- (value, dt)
                        Some (k, value)
                    else None)
                |> reportBatch   ⑤
                return! loop factory }
        loop factory)

    member this.TryGet<'a>(key : 'Key) = async {
        let! item = cacheAgent.PostAndAsyncReply(
                        fun channel -> GetOrSet(key, channel))
        match item with
        | :? 'a as v -> return Some v
        | _ -> return None }

    member this.DataRefreshed = cacheItemRefreshed.Publish   ①
    member this.Clear() = cacheAgent.Post(Clear)
```

In this code, the event `cacheItemRefreshed` dispatches the changes of state. By default, F# events execute their handlers on the same thread on which they're triggered: in this case, the agent's current thread. But depending on which thread originated the `MailboxProcessor`, the current thread can come either from the thread pool or from the UI thread, specifically from a `SynchronizationContext`, a class from `System.Threading` that captures the current synchronization context. The latter is useful when the notification is triggered in response to an event that aims to update the UI. This is why the agent constructor in the example has the new parameter `synchContext`, an option type that provides a convenient mechanism for controlling where the event is triggered. The `Some ctx` case means that the `SynchronizationContext` isn't `null`, with `ctx` the name given to access its value. When the synchronization context is `Some ctx`, the reporting mechanism uses the `Post` method to notify the state changes on the thread selected by the synchronization context. The method signature of the synchronization context's `ctx.Post` takes a delegate and an argument used by the delegate; because the second argument isn't used here, `null` is passed in its place. The function `reportBatch` triggers the event `cacheItemRefreshed`, which is exposed to subscribers through the `DataRefreshed` property:

```
cache.DataRefreshed.Add(printAgent.Post)
```

In the example, the change-of-state notification handler posts a message to a `MailboxProcessor` (`printAgent`) to print a report in a thread-safe manner.
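Here's a minimal sketch of that wiring (illustrative only; `printAgent` and the sample factory aren't from the book's listings, and it assumes the `Cache` type from listing 11.8 and the `Agent` alias are in scope): the agent serializes console output, so handlers fired from any thread report refreshed items safely.

```
// An agent that prints refreshed (key, value) batches one message at a time
let printAgent =
    Agent<(int * obj)[]>.Start(fun inbox -> async {
        while true do
            let! items = inbox.Receive()
            items |> Array.iter (fun (key, value) ->
                printfn "refreshed key %d with value %A" key value) })

// A cache whose (hypothetical) factory just stringifies the key
let cache = Cache<int>(Func<_,_>(fun id -> box (string id)), timeToLive = 2000)

// Forward every DataRefreshed notification to the print agent
cache.DataRefreshed.Add printAgent.Post
```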
But you could use the same idea in more complex scenarios, such as automatically updating a web page with the most recent data using `SignalR`.

### 11.5.10 Using the thread pool to report events from MailboxProcessor

In most cases, to avoid unnecessary overhead, it's preferable to trigger an event using the current thread. Still, there may be circumstances where a different threading model is better: for example, if triggering an event could block for a while or throw an exception that could kill the current process. A valid option is to trigger the event through the thread pool so the notification runs on a separate thread. The `reportBatch` function can be refactored using the F# asynchronous workflow and the `Async.Start` operator:

```
let reportBatch batch =
    async { cacheItemRefreshed.Trigger(batch) }
    |> Async.Start
```

Be aware that with this implementation the notification runs on a thread-pool thread, which cannot access UI elements.

## 11.6 F# MailboxProcessor: 10,000 agents for a game of life

The `MailboxProcessor`, combined with asynchronous workflows, is a lightweight unit of computation (a primitive) compared to threads. Agents can be spawned and destroyed with minimal overhead. You can distribute work to various `MailboxProcessor`s, similar to how you might use threads, without incurring the overhead associated with spinning up a new thread. For this reason, it's completely feasible to create applications that consist of hundreds of thousands of agents running in parallel with minimal impact on the computer's resources.

In this section, we use multiple `MailboxProcessor` instances by implementing the Game of Life ([`en.wikipedia.org/wiki/Game_of_Life`](https://en.wikipedia.org/wiki/Game_of_Life)). As described on Wikipedia, Life, as it is simply known, is a cellular automaton. It's a zero-player game, which means that once the game starts with a random initial configuration, it runs without any other input. The game consists of a collection of cells on a grid, each cell following a few mathematical rules. Cells can live, die, or multiply. Every cell interacts with its eight neighbors (the adjacent cells). A new state of the grid needs to be continually calculated to move the cells around according to these rules.

These are the Game of Life rules:

* Each cell with one or no neighbors dies, as if by solitude.
* Each cell with four or more neighbors dies, as if by overpopulation.
* Each cell with two or three neighbors survives.
* Each cell with three neighbors becomes populated.

Depending on the initial conditions, the cells form patterns throughout the course of the game. The rules are applied repeatedly to create further generations until the cells reach a stable state (figure 11.12).

Figure 11.12 When the Game of Life is set up, each cell (in the code example there are 10,000 cells) is constructed using an `AgentCell MailboxProcessor`. Each agent can be dead (a black circle) or alive, depending on the state of its neighbors.

Listing 11.9 is the implementation of the Game of Life cell, `AgentCell`, which is based on the F# `MailboxProcessor`. Each agent cell communicates with the adjacent cells through asynchronous message passing, producing a fully parallelized Game of Life. For conciseness, and because they're irrelevant to the main point of the example, I omitted a few parts of the code. You can find the full implementation in this book's source code.
Listing 11.9 Game of Life with `MailboxProcessor` as cells

```
type CellMessage =
    | NeighborState of cell:AgentCell * isalive:bool
    | State of cellstate:AgentCell
    | Neighbors of cells:AgentCell list
    | ResetCell   ①

and State =
    { neighbors : AgentCell list
      wasAlive : bool
      isAlive : bool }   ②
    static member createDefault isAlive =
        { neighbors=[]; isAlive=isAlive; wasAlive=false }

and AgentCell(location, alive, updateAgent:Agent<_>) as this =
    let neighborStates = Dictionary<AgentCell, bool>()   ③
    let agentCell = Agent<CellMessage>.Start(fun inbox ->
        let rec loop state = async {
            let! msg = inbox.Receive()
            match msg with
            | ResetCell ->
                state.neighbors
                |> Seq.iter(fun cell -> cell.Send(State(this)))   ④
                neighborStates.Clear()
                return! loop { state with wasAlive = state.isAlive }   ⑤
            | Neighbors(neighbors) ->
                return! loop { state with neighbors = neighbors }   ⑤
            | State(c) ->
                c.Send(NeighborState(this, state.wasAlive))
                return! loop state
            | NeighborState(cell, alive) ->
                neighborStates.[cell] <- alive
                if neighborStates.Count = 8 then   ⑥
                    let aliveState =
                        let numberOfNeighborsAlive =
                            neighborStates
                            |> Seq.filter(fun (KeyValue(_,v)) -> v)   ⑦
                            |> Seq.length
                        match numberOfNeighborsAlive with   ⑦
                        | a when a > 3 || a < 2 -> false
                        | 3 -> true
                        | _ -> state.isAlive
                    updateAgent.Post(Update(aliveState, location, this))   ⑧
                    return! loop { state with isAlive = aliveState }
                else return! loop state
        }
        loop (State.createDefault alive))

    member this.Send(msg) = agentCell.Post msg
```

`AgentCell` represents a cell in the grid of the Game of Life. The main concept is that each agent communicates its current state to the neighbor cells using asynchronous message passing. This pattern creates a chain of interconnected parallel communications that involves all the cells, which send their updated state to the `updateAgent` `MailboxProcessor`. At this point, the `updateAgent` refreshes the graphic in the UI.

Listing 11.10 `updateAgent` that refreshes the WPF UI in real time

```
let updateAgent grid (ctx: SynchronizationContext) =   ①
    let gridProduct = grid.Width * grid.Height
    let pixels = Array.zeroCreate<byte> (gridProduct)   ②
    Agent<UpdateView>.Start(fun inbox ->
        let agentStates =
            Dictionary<Location, bool>(HashIdentity.Structural)   ③
        let rec loop () = async {
            let! msg = inbox.Receive()
            match msg with
            | Update(alive, location, agent) ->   ④
                agentStates.[location] <- alive   ④
                agent.Send(ResetCell)   ④
                if agentStates.Count = gridProduct then   ⑤
                    agentStates.AsParallel().ForAll(fun s ->
                        pixels.[s.Key.x + s.Key.y * grid.Width] <-
                            if s.Value then 128uy else 0uy)   ⑥
                    do! Async.SwitchToContext ctx   ⑦
                    image.Source <- createImage pixels   ⑦
                    do! Async.SwitchToThreadPool()   ⑦
                    agentStates.Clear()
                return! loop() }
        loop())
```

`updateAgent`, as its name suggests, updates the state of each pixel with the correlated cell value received in the `Update` message. The agent maintains the status of the pixels and uses that status to create a new image when all the cells have sent their new state. Next, `updateAgent` refreshes the graphical WPF UI with this new image, which represents the current grid of the Game of Life:

```
do! Async.SwitchToContext ctx
image.Source <- createImage pixels
do! Async.SwitchToThreadPool()
```

It's important to note that `updateAgent` uses the current synchronization context to update the WPF control correctly. The current thread is switched to the UI thread using the `Async.SwitchToContext` function (discussed in chapter 9).
The final piece of code to run the Game of Life generates a grid that acts as the playground for the cells, and then a timer notifies the cells to update themselves (listing 11.11). In this example, the grid is a square of 100 cells per side, for a total of 10,000 cells (`MailboxProcessor`s) that run in parallel with a refresh timer of 50 ms, as shown in figure 11.13. That's 10,000 `MailboxProcessor`s communicating with each other and updating the UI 20 times every second (the code to note is in bold).

Listing 11.11 Creating the Game of Life grid and starting the timer to refresh

```
let run(ctx:SynchronizationContext) =
    let size = 100   ①
    let grid = { Width = size; Height = size }   ②
    let updateAgent = updateAgent grid ctx

    let **cells** = seq {
        for x = 0 to grid.Width - 1 do
            for y = 0 to grid.Height - 1 do   ③
                let agent = AgentCell({x=x;y=y}, alive=getRandomBool(),
                                      updateAgent=updateAgent)
                yield (x,y), agent } |> dict

    let neighbours (x', y') =
        seq { for x = x' - 1 to x' + 1 do
                for y = y' - 1 to y' + 1 do
                    if x <> x' || y <> y' then
                        yield cells.[(x + grid.Width) % grid.Width,
                                     (y + grid.Height) % grid.Height] }
        |> Seq.toList

    **cells.AsParallel().ForAll**(fun pair ->   ④
        let cell = pair.Value
        let neighbours = neighbours pair.Key
        **cell**.Send(Neighbors(neighbours))   ④
        **cell**.Send(ResetCell))   ④
```

The notifications to all the `cells` (agents) are sent in parallel using PLINQ. The `cells` are an F# sequence that's treated as a .NET `IEnumerable`, which allows effortless integration with LINQ/PLINQ.

Figure 11.13 Game of Life. The GUI is a WPF application. When the code runs, the program generates 10,000 F# `MailboxProcessor`s in less than 1 ms with a memory consumption, specific to the agents, of less than 25 MB. Impressive!

## Summary

* The agent programming model intrinsically promotes immutability and isolation for writing concurrent systems, so even complex systems are easier to reason about because the agents are encapsulated into active objects.
* The Reactive Manifesto defines the properties for implementing a reactive system, which is flexible, loosely coupled, and scalable.
* Natural isolation is important for writing lockless concurrent code. In a multithreaded program, isolation solves the problem of shared state by giving each thread a copied portion of data to perform local computation. When using isolation, there are no race conditions.
* By being asynchronous, agents are lightweight, because they don't block threads while waiting for a message. As a result, you can use hundreds of thousands of agents in a single application with minimal impact on the memory footprint.
* The F# `MailboxProcessor` allows two-way communication: the agent can use an asynchronous channel to return (reply) to the caller with the result of a computation.
* The agent programming model F# `MailboxProcessor` is a great tool for solving application bottleneck issues, such as multiple concurrent database accesses. In fact, you can use agents to speed up applications significantly and keep the server responsive.
* Other .NET programming languages can consume the F# `MailboxProcessor` by exposing its methods through the friendly TPL task-based programming model.

# 12 Parallel workflow and agent programming with TPL Dataflow

**This chapter covers**

* Using TPL Dataflow blocks
* Constructing a highly concurrent workflow
* Implementing a sophisticated Producer/Consumer pattern
* Integrating Reactive Extensions with TPL Dataflow

Today's global market requires that businesses and industries be agile enough to respond to a constant flow of changing data. These data flows are frequently large, and sometimes infinite or unknown in size. Often, the data requires complex processing, leading to high throughput demands and potentially immense computational loads. To cope with these requirements, the key is to use parallelism to exploit system resources and multiple cores. But the .NET Framework's older concurrent programming models weren't designed with dataflow in mind.

When designing a reactive application, it's fundamental to build and treat the system components as units of work. These units react to messages, which are propagated by other components in the chain of processing. These reactive models emphasize a push-based model for applications, rather than a pull-based one (see chapter 6). This push-based strategy ensures that the individual components are easy to test and link and, most importantly, easy to understand.

This new focus on push-based construction is changing how programmers design applications. A single task can quickly grow complex, and even simple-looking requirements can lead to complicated code. In this chapter, you'll learn how the .NET Task Parallel Library Dataflow (TPL Dataflow, or TDF) helps you tackle the complexity of developing modern systems with an API that builds on TAP (the task-based asynchronous pattern). TDF fully supports asynchronous processing, in combination with a powerful compositionality semantic and a better configuration mechanism than the TPL. TDF eases concurrent processing and implements tailored asynchronous parallel workflows and batch queuing. Furthermore, it facilitates the implementation of sophisticated patterns based on combining multiple components that talk to each other by passing messages.

## 12.1 The power of TPL Dataflow

Let's say you're building a sophisticated Producer/Consumer pattern that must support multiple producers and/or multiple consumers in parallel, or perhaps it has to support workflows that can scale the different steps of the process independently. One solution is to exploit Microsoft TPL Dataflow.

With the release of .NET 4.5, Microsoft introduced TPL Dataflow as part of the tool set for writing concurrent applications. TDF is designed with the higher-level constructs necessary to tackle easy parallel problems while providing a simple-to-use, powerful framework for building asynchronous data-processing pipelines. TDF isn't distributed as part of the .NET 4.5 Framework, so to access its API and classes, you need to import the official Microsoft NuGet package (`Install-Package Microsoft.Tpl.Dataflow`).

TDF offers a rich array of components (also called *blocks*) for composing dataflow and pipeline infrastructures based on in-process message-passing semantics (see figure 12.1). This dataflow model promotes actor-based programming by providing in-process message passing for coarse-grained dataflow and pipelining tasks.
TDF uses the TPL's task scheduler (`TaskScheduler`, [`mng.bz/4N8F`](http://mng.bz/4N8F)) to efficiently manage the underlying threads and to support the TAP model (async/await) for optimized resource utilization. TDF increases the robustness of highly concurrent applications and obtains better performance for parallelizing CPU- and I/O-intensive operations, which have high throughput and low latency.

Figure 12.1 Workflow composed of multiple steps. Each operation can be treated as an independent computation.

The concept behind the TPL Dataflow library is to ease the creation of multiple patterns, such as batch-processing pipelines, parallel stream processing, data buffering, or joining and processing batch data from one or more sources. Each of these patterns can be used standalone or composed with other patterns, enabling developers to easily express complex dataflows.

## 12.2 Designed to compose: TPL Dataflow blocks

Imagine you're implementing a complex workflow process composed of many different steps, such as a stock analysis pipeline. It's ideal to split the computation into blocks, developing each block independently and then gluing them together. Making these blocks reusable and interchangeable enhances their convenience. This composable design simplifies the implementation of complex and convoluted systems.

Compositionality is the main strength of TPL Dataflow, because its set of independent containers, known as blocks, is designed to be combined. These blocks can form a chain of different tasks that constructs a parallel workflow, and they're easily swapped, reordered, reused, or even removed. TDF emphasizes a component-based architectural approach that eases restructuring the design. These dataflow components are useful when you have multiple operations that must communicate with one another asynchronously or when you want to process data as it becomes available, as shown in figure 12.2.

Figure 12.2 TDF embraces the concept of reusable components. In this figure, each step of the workflow acts as a reusable component.

TDF brings a few core primitives that allow you to express computations based on dataflow graphs. Here's a high-level view of how TDF blocks operate:

1. Each block receives and buffers data from one or more sources, including other blocks, in the form of messages. When a message is received, the block reacts by applying its behavior to the input, which then can be transformed and/or used to perform side effects.
2. The output from the component (*block*) is then passed to the next linked block, and to the next one, if any, and so on, creating a pipeline structure.

TDF provides a set of configurable properties by which it's possible, with small changes, to control the level of parallelism, manage the buffer size of the internal mailbox, and control how data is processed and outputs are dispatched.

There are three main types of dataflow blocks:

* *Source*—Operates as a producer of data; it can be read from.
* *Target*—Acts as a consumer; it receives data and can be written to.
* *Propagator*—Acts as both a source and a target block.

For each of these dataflow block types, TDF provides a set of concrete blocks, each with a different purpose. It's impossible to cover all the blocks in one chapter, so in the following sections we focus on the most common and versatile ones for general pipeline-composition applications. For more information about the Dataflow library, see the online MSDN documentation ([`mng.bz/GDbF`](http://mng.bz/GDbF)).
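These three roles map onto the library's core interfaces—`ISourceBlock<TOutput>`, `ITargetBlock<TInput>`, and `IPropagatorBlock<TInput, TOutput>`—and it can help to see them in code before meeting the concrete blocks. Here's a minimal sketch (not from the book's examples) showing how a single `BufferBlock<int>` plays all three roles:

```
using System;
using System.Threading.Tasks.Dataflow;

class BlockRoles
{
    static void Main()
    {
        var buffer = new BufferBlock<int>();

        // A propagator is simultaneously a source and a target.
        IPropagatorBlock<int, int> propagator = buffer;
        ITargetBlock<int> target = buffer;   // can be written to
        ISourceBlock<int> source = buffer;   // can be read from

        target.Post(42);                     // write through the target view
        Console.WriteLine(source.Receive()); // read back through the source view
    }
}
```

The practical consequence is that any propagator, such as the `TransformBlock` covered shortly, can be plugged in wherever a source or a target is expected.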
### 12.2.1 Using BufferBlock<TInput> as a FIFO buffer

The TDF `BufferBlock<T>` acts as a buffer (unbounded, by default) for data that's stored in first in, first out (FIFO) order (figure 12.3). In general, `BufferBlock` is a great tool for enabling and implementing asynchronous Producer/Consumer patterns, where the internal message queue can be written to by multiple sources and read from by multiple targets.

Figure 12.3 The TDF `BufferBlock` has an internal buffer where messages are queued, waiting to be processed by the task. The input and output are the same type, and this block doesn't apply any transformation to the data.

Here is a simple Producer/Consumer using the TDF `BufferBlock`.

Listing 12.1 Producer/Consumer based on the TDF `BufferBlock`

```
BufferBlock<int> buffer = new BufferBlock<int>();   ①

async Task Producer(IEnumerable<int> values)
{
    foreach (var value in values)
        buffer.Post(value);   ②
    buffer.Complete();   ③
}

async Task Consumer(Action<int> process)
{
    while (await buffer.OutputAvailableAsync())   ④
        process(await buffer.ReceiveAsync());   ⑤
}

async Task Run()
{
    IEnumerable<int> range = Enumerable.Range(0, 100);
    await Task.WhenAll(Producer(range),
                       Consumer(n => Console.WriteLine($"value {n}")));
}
```

The items of the `IEnumerable` values are sent to the `BufferBlock` buffer through the `buffer.Post` method and retrieved asynchronously using the `buffer.ReceiveAsync` method. The `OutputAvailableAsync` method asynchronously signals when the next item is ready to be retrieved. This is important to protect the code from an exception: if the code calls the `Receive` method after the block has completed processing, an error is thrown. This `BufferBlock` block essentially receives and stores data so that it can be dispatched to one or more other target blocks for processing.

### 12.2.2 Transforming data with TransformBlock<TInput, TOutput>

The TDF `TransformBlock<TInput, TOutput>` acts like a mapping function, which applies a projection function to an input value and provides a correlated output (figure 12.4). The transformation function is passed as an argument in the form of a delegate `Func<TInput, TOutput>`, which is generally expressed as a lambda expression. This block's default behavior is to process one message at a time, maintaining strict FIFO ordering.

Figure 12.4 The TDF `TransformBlock` has an internal buffer for both the input and output values; this type of block has the same buffer capabilities as `BufferBlock`. The purpose of this block is to apply a transformation function to the data; the input and output are likely different types.

Note that `TransformBlock<TInput, TOutput>` behaves like a `BufferBlock`, buffering both the input and output values. The underlying delegate can run synchronously or asynchronously. The asynchronous version has the type signature `Func<TInput, Task<TOutput>>`, whose purpose is to run the underlying function asynchronously; the block treats the processing of an element as completed when the returned `Task` finishes. This listing shows how to use the `TransformBlock` type (the code to note is in bold).
Listing 12.2 Downloading images using the TDF `TransformBlock`

```
var fetchImageFlag = new **TransformBlock**<string, (string, byte[])>(
    **async** urlImage => {   ①
        using (var webClient = new WebClient()) {
            byte[] data = **await** webClient.DownloadDataTaskAsync(urlImage);   ②
            return (urlImage, data);   ③
        }
    });

List<string> urlFlags = new List<string> {
    "Italy#/media/File:Flag_of_Italy.svg",
    "Spain#/media/File:Flag_of_Spain.svg",
    "United_States#/media/File:Flag_of_the_United_States.svg"
};

foreach (var urlFlag in urlFlags)
    fetchImageFlag.Post($"https://en.wikipedia.org/wiki/{urlFlag}");
```

In this example, the `TransformBlock<string, (string, byte[])>` `fetchImageFlag` block fetches each flag image and produces a tuple of the URL string and the downloaded byte array. In this case, the output isn't consumed anywhere, so the code isn't too useful. You need another block to process the outcome in a meaningful way.

### 12.2.3 Completing the work with ActionBlock<TInput>

The TDF `ActionBlock` executes a given callback for any item sent to it. You can think of this block logically as a buffer for data combined with a task for processing that data. `ActionBlock<TInput>` is a target block that calls a delegate when it receives data, similar to a `for-each` loop (figure 12.5).

Figure 12.5 The TDF `ActionBlock` has an internal buffer for input messages that are queued if the task is busy processing another message. This type of block has the same buffer capabilities as `BufferBlock`. The purpose of this block is to apply an action that completes the workflow; it produces no output and typically performs side effects. Because `ActionBlock` doesn't have an output, it can't be composed with a following block, so it's used to terminate the workflow.

`ActionBlock<TInput>` is usually the last step in a TDF pipeline; in fact, it doesn't produce any output. This design prevents `ActionBlock` from being combined with further blocks, unless it posts or sends the data to another block, making it the perfect candidate to terminate the workflow process. For this reason, `ActionBlock` is likely to produce side effects as a final step to complete the pipeline processing. The following code shows the `TransformBlock` from the previous listing pushing its outputs to the `ActionBlock` to persist the flag images in the local filesystem (in bold).

Listing 12.3 Persisting data using the TDF `ActionBlock`

```
var saveData = new **ActionBlock**<(string, byte[])>(**async** data => {   ①
    (string urlImage, byte[] image) = data;   ②
    string filePath = urlImage.Substring(urlImage.IndexOf("File:") + 5);
    **await** File.WriteAllBytesAsync(filePath, image);   ③
});

fetchImageFlag.**LinkTo**(saveData);   ④
```

The argument passed into the constructor during the instantiation of the `ActionBlock` block can be either a delegate `Action<TInput>` or a `Func<TInput, Task>`. The latter performs the internal action (behavior) asynchronously for each message received. Note that the `ActionBlock` has an internal buffer for the incoming data to be processed, which works exactly like the `BufferBlock`. It's important to remember that the `ActionBlock` `saveData` is linked to the previous `TransformBlock` `fetchImageFlag` using the `LinkTo` extension method. In this way, the output produced by the `TransformBlock` is pushed to the `ActionBlock` as soon as it's available.

### 12.2.4 Linking dataflow blocks

TDF blocks can be linked with the help of the `LinkTo` extension method.
Linking dataflow blocks is a powerful technique for automatically transmitting the result of each computation between the connected blocks in a message-passing manner. Connecting blocks is the key to building sophisticated pipelines in a declarative manner. Viewed conceptually, the signature of the `LinkTo` extension method looks like function composition:

```
**LinkTo : (a -> b) -> (b -> c)**
```

## 12.3 Implementing a sophisticated Producer/Consumer with TDF

The TDF programming model can be seen as a sophisticated Producer/Consumer pattern, because the blocks encourage a pipeline model of programming, with producers sending messages to decoupled consumers. These messages are passed asynchronously, maximizing throughput. This design has the benefit of not blocking the producers, because the TDF blocks (queues) act as a buffer, eliminating waiting time. The synchronization of access between producer and consumers may sound like an abstract problem, but it's a common task in concurrent programming. You can view it as a design pattern for synchronizing two components.

### 12.3.1 A multiple Producer/single Consumer pattern: TPL Dataflow

The Producer/Consumer pattern is one of the most widely used patterns in parallel programming. Developers use it to isolate work to be executed from the processing of that work. In a typical Producer/Consumer pattern, at least two separate threads run concurrently: one produces and pushes the data to process into a queue, and the other verifies the presence of new incoming pieces of data and processes them. The queue that holds the tasks is shared among these threads, which requires care to access it safely. TDF is a great tool for implementing this pattern, because it has intrinsic support for multiple readers and multiple writers concurrently, and it encourages a pipeline pattern of programming with producers sending messages to decoupled consumers (figure 12.6).

Figure 12.6 Multiple-producers/one-consumer pattern using the TDF `BufferBlock`, which can manage and throttle the pressure of multiple producers

In the case of a multiple-Producer/single-Consumer pattern, it's important to enforce a restriction between the number of items generated and the number of items consumed. This constraint aims to balance the work between the producers when the consumer cannot handle the load. This technique is called *throttling*. Throttling protects the program from running out of memory if the producers are faster than the consumer. Fortunately, TDF has built-in support for throttling, achieved by setting the maximum size of the buffer through the `BoundedCapacity` property, part of `DataflowBlockOptions`. In listing 12.4, this property ensures that there will never be more than 10 items in the `BufferBlock` queue. Also, in combination with enforcing the limit on the buffer size, it's important to use the `SendAsync` method, which waits without blocking for the buffer to have available space to place a new item.
Listing 12.4 Asynchronous Producer/Consumer using TDF

```
BufferBlock<int> buffer = new BufferBlock<int>(
    new DataflowBlockOptions { BoundedCapacity = 10 });   ①

async Task Produce(IEnumerable<int> values)
{
    foreach (var value in values)
        await buffer.SendAsync(value);   ②
}

async Task MultipleProducers(params IEnumerable<int>[] producers)
{
    await Task.WhenAll(
            from values in producers select Produce(values))   ③
        .ContinueWith(_ => buffer.Complete());   ④
}

async Task Consumer(Action<int> process)
{
    while (await buffer.OutputAvailableAsync())   ⑤
        process(await buffer.ReceiveAsync());
}

async Task Run()
{
    IEnumerable<int> range = Enumerable.Range(0, 100);
    await Task.WhenAll(MultipleProducers(range, range, range),
        Consumer(n => Console.WriteLine(
            $"value {n} - ThreadId {Thread.CurrentThread.ManagedThreadId}")));
}
```

By default, TDF blocks use the value `DataflowBlockOptions.Unbounded` (-1), which means the queue is unbounded: there's no limit on the number of messages. But you can set this value to a specific capacity that limits the number of messages the block may queue. When the queue reaches maximum capacity, any additional incoming messages are postponed for later processing, making the producer wait before doing further work. Making the producer slow down (or wait) likely isn't a problem, because the messages are sent asynchronously.

### 12.3.2 A single Producer/multiple Consumer pattern

The TDF `BufferBlock` intrinsically supports the single Producer/multiple Consumer pattern. This is handy when the producer performs faster than the consumers, such as when the consumers run intensive operations. Fortunately, on a multicore machine, this pattern can use multiple cores to spin up multiple processing blocks (consumers), each of which handles incoming items concurrently. Achieving the multiple-consumer behavior is a matter of configuration. The `MaxDegreeOfParallelism` property belongs to `ExecutionDataflowBlockOptions` and applies to the blocks that execute a delegate, so you set it on the consuming block to the number of parallel consumers to run. Here's the consumer from listing 12.4 reworked as an `ActionBlock` whose max degree of parallelism is set to the number of available logical processors:

```
var consumer = new ActionBlock<int>(
    n => Console.WriteLine($"value {n}"),
    new ExecutionDataflowBlockOptions {
        BoundedCapacity = 10,
        MaxDegreeOfParallelism = Environment.ProcessorCount
    });
```

By default, a TDF block processes only one message at a time, buffering the other incoming messages until the previous one completes. Each block is independent of the others, so one block can process one item while another block processes a different item. But when constructing an execution block, you can change this behavior by setting the `MaxDegreeOfParallelism` property in the `ExecutionDataflowBlockOptions` to a value greater than 1. You can use TDF to speed up the computation by specifying the number of messages that can be processed in parallel. The internals of the class handle the rest, including the ordering of the data sequence.

## 12.4 Enabling an agent model in C# using TPL Dataflow

TDF blocks are designed to be stateless by default, which is perfectly fine for most scenarios. But there are situations in an application when it's important to maintain a state: for example, a global counter, a centralized in-memory cache, or a shared database context for transactional operations. In such situations, there's a high probability that the shared state is also subject to mutation, because certain values are continually tracked. The problem has always been the difficulty of handling asynchronous computations combined with mutable state.
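To make the danger concrete before introducing the agent-based solution, here's a minimal sketch (not from the book's source) of a parallel `ActionBlock` mutating shared state without synchronization. Run it, and the final count will almost certainly be lower than the number of posted messages, because concurrent increments are lost:

```
using System;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

class SharedStateHazard
{
    static async Task Main()
    {
        int counter = 0;   // shared mutable state

        var block = new ActionBlock<int>(
            _ => counter++,   // unsynchronized increment: a race condition
            new ExecutionDataflowBlockOptions {
                MaxDegreeOfParallelism = Environment.ProcessorCount
            });

        foreach (var i in Enumerable.Range(0, 1_000_000))
            block.Post(i);

        block.Complete();
        await block.Completion;

        Console.WriteLine(counter);   // very likely less than 1000000
    }
}
```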
As previously mentioned, the mutation of shared state becomes dangerous in a multithreaded environment, leading you into a tar pit of concurrency issues ([`curtclifton.net/papers/MoseleyMarks06a.pdf`](http://curtclifton.net/papers/MoseleyMarks06a.pdf)). Luckily, TDF encapsulates the state inside the blocks, while the channels between blocks are the only dependencies. By design, this permits isolated mutation in a safe manner.

As demonstrated in chapter 11, the F# `MailboxProcessor` can solve these problems because it embraces the agent model philosophy, which can maintain an internal state by safeguarding access to it so that it's concurrency safe (only one thread at a time can access the agent). Ultimately, the F# `MailboxProcessor` can expose a set of APIs to C# code that can consume it effortlessly. Alternatively, you can achieve the same result by using TDF to implement an agent object in C#; that agent object then acts like the F# `MailboxProcessor`.

The implementation of `StatefulDataFlowAgent` relies on an `ActionBlock` instance to receive, buffer, and process incoming messages with an unbounded buffer (figure 12.7). Note that the max degree of parallelism is left at the default value of 1 by design, embracing the single-threaded nature of the agent model. The state of the agent is initialized in the constructor and is maintained through a polymorphic and mutable value `TState`, which is reassigned as each message is processed. (Remember that the agent model allows access by only one thread at a time, ensuring that the messages are processed sequentially to eliminate any concurrency problems.) It's good practice to use an immutable state, regardless of the safety provided by the agent implementation.

Figure 12.7 The stateful and stateless agents implemented using the TDF `ActionBlock`. The stateful agent maintains an isolated, arbitrary internal value as in-memory state that can change.

The next listing shows the implementation of the `StatefulDataFlowAgent` class, which defines a stateful, generic agent that encapsulates the TDF `ActionBlock` to process messages and store typed values (in bold).

Listing 12.5 Stateful agent in C# using TDF

```
class StatefulDataFlowAgent<TState, TMessage> : **IAgent**<TMessage>
{
    private TState state;
    private readonly ActionBlock<TMessage> actionBlock;

    public StatefulDataFlowAgent(
        TState initialState,
        Func<TState, TMessage, Task<TState>> action,   ①
        CancellationTokenSource cts = null)
    {
        state = initialState;
        var options = new ExecutionDataflowBlockOptions {
            CancellationToken = cts != null ? cts.Token : CancellationToken.None   ②
        };
        actionBlock = new ActionBlock<TMessage>(   ③
            **async** msg => state = **await** action(state, msg), options);
    }

    public Task Send(TMessage message) => actionBlock.SendAsync(message);
    public void Post(TMessage message) => actionBlock.Post(message);
}
```

The `CancellationTokenSource`, which can stop the agent at any time, is the only optional parameter passed into the constructor. The function `Func<TState, TMessage, Task<TState>>` is applied to each message in combination with the current state. When the operation completes, the current state is updated, and the agent moves on to process the next available message.
This function is expected to be an asynchronous operation, which is recognizable from the return type `Task<TState>`. The agent implements the interface `IAgent<TMessage>`, which defines the two members `Post` and `Send`, used to pass messages to the agent synchronously and asynchronously, respectively:

```
public interface IAgent<TMessage>
{
    Task Send(TMessage message);
    void Post(TMessage message);
}
```

Use the helper factory function `Start`, as with the F# `MailboxProcessor`, to initialize a new agent, represented by the implemented interface `IAgent<TMessage>`:

```
IAgent<TMessage> Start<TState, TMessage>(TState initialState,
    Func<TState, TMessage, Task<TState>> action,
    CancellationTokenSource cts = null) =>
    new StatefulDataFlowAgent<TState, TMessage>(initialState, action, cts);
```

Because the interaction with the agent happens only by sending (`Post` or `Send`) a message, the primary purpose of the `IAgent<TMessage>` interface is to avoid exposing the type parameter for the state, which is an implementation detail of the agent.

In listing 12.6, `agentStateful` is an instance of the `StatefulDataFlowAgent` agent, which receives messages containing web addresses whose content it downloads asynchronously. The result of each operation is cached in the local state, an `ImmutableDictionary<string, string>`, to avoid repeating identical operations. For example, the Google website is mentioned twice in the `urls` collection, but it's downloaded only once. Ultimately, the content of each website is persisted in the local filesystem for the sake of the example. Notice that, apart from the side effects that occur when downloading and persisting the data, the implementation is side-effect free. The changes in state are captured by always passing the state as an argument to the action function (or loop function).

Listing 12.6 Agent based on TDF in action

```
List<string> urls = new List<string> {
    @"http://www.google.com",
    @"http://www.microsoft.com",
    @"http://www.bing.com",
    @"http://www.google.com"
};

var agentStateful = **Agent.Start**(ImmutableDictionary<string, string>.Empty,
    **async** (**ImmutableDictionary**<string, string> state, string url) => {   ①
        if (!state.TryGetValue(url, out string content))
            using (var webClient = new WebClient()) {
                content = **await** webClient.DownloadStringTaskAsync(url);
                **await** File.WriteAllTextAsync(createFileNameFromUrl(url), content);
                return state.Add(url, content);   ②
            }
        return state;   ②
    });

urls.ForEach(url => agentStateful.**Post**(url));
```

### 12.4.1 Agent fold-over state and messages: Aggregate

The current state of an agent is the result of reducing all the messages it has received so far, using the initial state as the accumulator value and the processing function as a reducer. You can imagine this agent as a fold (aggregator) over time of the stream of messages received. Interestingly, the `StatefulDataFlowAgent` constructor shares a signature and behavior similar to the LINQ extension method `Enumerable.Aggregate`.
For demonstration purposes, the following code swaps the agent construct from the previous implementation with its counterpart, the LINQ `Aggregate` operator (conceptually, at least—because the lambda is asynchronous, the accumulator here would really be a `Task`, so treat this as illustrative pseudocode):

```
urls.**Aggregate**(ImmutableDictionary<string, string>.Empty,
    **async** (state, url) => {
        if (!state.TryGetValue(url, out string content))
            using (var webClient = new WebClient()) {
                content = **await** webClient.DownloadStringTaskAsync(url);
                **await** File.WriteAllTextAsync(createFileNameFromUrl(url), content);
                return state.Add(url, content);
            }
        return state;
    });
```

As you can see, the core logic hasn't changed. Using the `StatefulDataFlowAgent` constructor, which operates over message passing instead of over a collection, you implemented an asynchronous reducer similar to the LINQ `Aggregate` operator.

### 12.4.2 Agent interaction: a parallel word counter

According to the *actor* definition from Carl Hewitt,^(1) one of the minds behind the actor model: "One actor is no actor. They come in systems." This means that actors come in systems and communicate with each other. The same rule applies to agents. Let's look at an example of agents that interact with each other to count the number of times each word appears in a set of text files (figure 12.8).

Let's start with a simple stateless agent that takes a string message and prints it. You can use this agent to log the state of an application while maintaining the order of the messages:

```
IAgent<string> printer = Agent.Start((string msg) =>
    WriteLine($"{msg} on thread {Thread.CurrentThread.ManagedThreadId}"));
```

The output also includes the current thread ID to verify that multiple threads are used. The following listing shows the implementation of the agent system for the word group-count.

Listing 12.7 Word counter pipeline using agents

```
IAgent<string> reader = Agent.Start(async (string filePath) => {
    await printer.Send("reader received message");   ①
    var lines = await File.ReadAllLinesAsync(filePath);   ②
    lines.ForEach(async line => await parser.Send(line));   ③
});

char[] punctuation = Enumerable.Range(0, 256).Select(c => (char)c)
    .Where(c => Char.IsWhiteSpace(c) || Char.IsPunctuation(c)).ToArray();

IAgent<string> parser = Agent.Start(async (string line) => {
    await printer.Send("parser received message");   ①
    foreach (var word in line.Split(punctuation))   ④
        await counter.Send(word.ToUpper());
});

IReplyAgent<string, (string, int)> counter =
    Agent.Start(ImmutableDictionary<string, int>.Empty,
        (state, word) => {
            printer.Post("counter received message");   ①
            if (state.TryGetValue(word, out int count))
                return state.SetItem(word, count + 1);   ⑤
            else
                return state.Add(word, 1);
        },
        (state, word) => (state, (word, state[word])));

foreach (var filePath in Directory.EnumerateFiles(@"myFolder", "*.txt"))
    reader.Post(filePath);

var wordCount_This = await counter.Ask("this");   ⑥
var wordCount_Wind = await counter.Ask("wind");   ⑥
```

Figure 12.8 Simple interaction between agents by exchanging messages. The agent programming model promotes the single-responsibility principle for writing code. Note that the counter agent provides two-way communication: the user can ask (interrogate) the agent by sending a message at any given time and receive a reply in the form of a channel, which acts as an asynchronous callback. When the operation completes, the callback provides the result.
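Listing 12.7 creates the `counter` agent through a three-argument `Agent.Start` overload that returns an `IReplyAgent`; the agent type behind it is implemented in listing 12.8. The overload itself isn't shown in the chapter, so here's a plausible sketch of it (the exact signature is an assumption, mirroring the one-way `Start` factory shown earlier and the constructor of listing 12.8):

```
// Hypothetical factory overload: wraps the two-way
// StatefulReplyDataFlowAgent of listing 12.8 behind IReplyAgent.
static IReplyAgent<TMessage, TReply> Start<TState, TMessage, TReply>(
    TState initialState,
    Func<TState, TMessage, Task<TState>> projection,       // handles Send/Post
    Func<TState, TMessage, Task<(TState, TReply)>> ask,    // handles Ask
    CancellationTokenSource cts = null) =>
    new StatefulReplyDataFlowAgent<TState, TMessage, TReply>(
        initialState, projection, ask, cts);
```

Note that listing 12.7 passes synchronous lambdas, so the real factory presumably also provides overloads that accept plain (non-`Task`) delegates and wrap them in completed tasks.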
The system is composed of three agents that communicate with each other to form a chain of operations:

* The `reader` agent
* The `parser` agent
* The `counter` agent

The word-counting process starts with a `for-each` loop that sends the file paths of a given folder to the first agent, `reader`. This agent reads the text from a file and then sends each line of the text to the `parser` agent:

```
var lines = **await** File.ReadAllLinesAsync(filePath);
lines.ForEach(**async** line => **await** parser.Send(line));
```

The `parser` agent splits the text message into single words and then passes each of those words to the last agent, `counter`:

```
foreach (var word in line.Split(punctuation))
    **await** counter.Send(word.ToUpper());
```

The `counter` agent is a stateful agent that does the work of maintaining the word counts as they're updated. An `ImmutableDictionary` collection defines the state of the `counter` agent, storing each word along with the number of times it has been found. For each message received, the `counter` agent checks whether the word exists in the internal `ImmutableDictionary<string, int>` state, to either increment an existing count or start a new one.

An interesting aspect of the `counter` agent is its ability to respond to the caller asynchronously using the `Ask` method. You can interrogate the agent for the count of a particular word at any time. The interface `IReplyAgent` is the result of expanding the functionality of the previous interface `IAgent` with the `Ask` method:

```
interface IReplyAgent<TMessage, TReply> : IAgent<TMessage>
{
    Task<TReply> Ask(TMessage message);
}
```

Listing 12.8 shows the implementation of the two-way-communication agent `StatefulReplyDataFlowAgent`, in which the internal state is represented by a single polymorphic mutable variable. This agent has two different behaviors:

* One to handle messages passed with `Send` (or `Post`).
* One to handle the `Ask` method, which sends a message and then waits asynchronously for a response.

These behaviors are passed into the agent's constructor in the form of generic `Func` delegates. The first function (`Func<TState, TMessage, Task<TState>>`) processes each message in combination with the current state and updates it accordingly. This logic is identical to the `StatefulDataFlowAgent` agent. Conversely, the second function (`Func<TState, TMessage, Task<(TState, TReply)>>`) handles the incoming messages, computes the agent's new state, and ultimately replies to the sender. The output type of this function is a tuple containing the new state of the agent and the value to hand back as the reply. The tuple is wrapped in a `Task` type so it can be awaited without blocking, as with any asynchronous function.

When creating the `Ask` message to interrogate the agent, the sender passes an instance of `TaskCompletionSource<TReply>` in the payload of the message, and the `Ask` function returns a reference (the underlying `Task`) to the caller. This `TaskCompletionSource` object is fundamental: it provides a channel for communicating asynchronously back to the sender through a callback, and the callback is notified by the agent when the result of the computation is ready. This model effectively provides two-way communication.
Listing 12.8 Stateful reply agent in C# using TDF

```
class StatefulReplyDataFlowAgent<TState, TMessage, TReply> :   ①
      IReplyAgent<TMessage, TReply>
{
    private TState state;
    private readonly ActionBlock<(TMessage,   ②
        Option<TaskCompletionSource<TReply>>)> actionBlock;

    public StatefulReplyDataFlowAgent(TState initialState,
        Func<TState, TMessage, Task<TState>> projection,
        Func<TState, TMessage, Task<(TState, TReply)>> ask,
        CancellationTokenSource cts = null)   ③
    {
        state = initialState;
        var options = new ExecutionDataflowBlockOptions {
            CancellationToken = cts?.Token ?? CancellationToken.None
        };
        actionBlock = new ActionBlock<(TMessage,
            Option<TaskCompletionSource<TReply>>)>(
            async message => {
                (TMessage msg, Option<TaskCompletionSource<TReply>> replyOpt) =
                    message;
                await replyOpt.Match(   ④
                    None: async () => state = await projection(state, msg),   ⑤
                    Some: async reply => {   ⑥
                        (TState newState, TReply replyresult) =
                            await ask(state, msg);
                        state = newState;
                        reply.SetResult(replyresult);
                    });
            }, options);
    }

    public Task<TReply> Ask(TMessage message)
    {
        var tcs = new TaskCompletionSource<TReply>();   ⑦
        actionBlock.Post((message, Option.Some(tcs)));
        return tcs.Task;   ⑦
    }

    public Task Send(TMessage message) =>
        actionBlock.SendAsync((message, Option.None));
}
```

To enable `StatefulReplyDataFlowAgent` to handle both kinds of communication—the one-way `Send` and the two-way `Ask`—the message is constructed to include an option-typed `TaskCompletionSource`. In this way, the agent can infer whether a message came from the `Post` (or `Send`) method, carrying `None`, or from the `Ask` method, carrying `Some` `TaskCompletionSource`. The `Match` extension method of the `Option` type, which takes one function for the `None` case and one for the `Some` case, is used to branch out to the corresponding behavior of the agent.

## 12.5 A parallel workflow to compress and encrypt a large stream

In this section, you'll build a complete asynchronous and parallelized workflow combined with the agent programming model to demonstrate the power of the TDF library. The example uses a combination of TDF blocks and the `StatefulDataFlowAgent` agent, linked to work as a parallel pipeline. The purpose of this example is to analyze and architect a real-world application, evaluate the challenges encountered during the development of the program, and examine how TDF can be introduced into the design to solve these challenges.

TDF processes the blocks that compose a workflow at different rates and in parallel. More importantly, it efficiently spreads the work across multiple CPU cores to maximize the speed of computation and overall scalability. This is particularly useful when you need to process a large stream of bytes that could generate hundreds, or even thousands, of chunks of data.

### 12.5.1 Context: the problem of processing a large stream of data

Let's say that you need to compress a large file to make it easier to persist or transmit over the network, or that a file's content must be encrypted to protect that information. Often, both compression and encryption must be applied. These operations can take a long time to complete if the full file is processed all at once. Furthermore, it's challenging to move a file, or stream data, across the network, and the complexity increases with the size of the file, due to external factors such as latency and unpredictable bandwidth.
In addition, if the file is transferred in one transaction and something goes wrong, the operation tries to resend the entire file, which can be time- and resource-consuming. In the following sections, you'll tackle this problem step by step.

In .NET, it isn't easy to compress a file larger than 4 GB, due to the framework limitation on the size of data to compress. Because of the maximum addressable size for a 32-bit pointer, if you create an array over 4 GB, an `OutOfMemoryException` is thrown. Starting with .NET 4.5 and for 64-bit platforms, the option `gcAllowVeryLargeObjects` ([`mng.bz/x0c4`](http://mng.bz/x0c4)) is available to enable arrays greater than 4 GB. This option allows 64-bit applications to have multidimensional arrays with up to `UInt32.MaxValue` (4,294,967,295) elements. Technically, you can apply the standard GZip compression, which is used to compress streams of bytes, to data larger than 4 GB; but common GZip implementations don't support this by default, and the related .NET `GZipStream` class inherently has a 4 GB limitation.

How can you compress and encrypt a large file without being constrained by the 4 GB limit imposed by the framework classes? A practical solution involves using a chunking routine to chop the stream of data. Chopping the stream of data makes it easier to compress and/or encrypt each block individually and ultimately write the block content to an output stream. The chunking technique splits the data, generally into chunks of the same size; applies the appropriate transformations to each chunk (compression before encryption); and then glues the chunks back together in the correct order.

It's vital to guarantee the correct order of the chunks upon reassembly at the end of the workflow. Because of the intensive asynchronous I/O operations, the chunks might not arrive in the correct sequence, especially if the data is transferred across the network. You must verify the order during reassembly (figure 12.9).

Figure 12.9 The transform blocks process the messages in parallel. The result is sent to the next block when the operation completes. The aggregate agent's purpose is to maintain the integrity of the order of the messages, similar to the `AsOrdered` PLINQ extension method.

The opportunity for parallelism fits naturally into this design, because the chunks of data can be processed independently. Listing 12.9 shows the full implementation of the parallel compression–encryption workflow. Note that in the source code, you can find the reverse workflow to decrypt and decompress the data, as well as the asynchronous helper functions for compressing and encrypting byte arrays.

The function `CompressAndEncrypt` takes as arguments the source and destination streams to process; the `chunkSize` argument defines the size into which the data is split (the default is 1 MB if no value is provided), and `CancellationTokenSource` stops the dataflow execution at any point. If no `CancellationTokenSource` is provided, a new token is defined and propagated through the dataflow operations. The core of the function consists of three TDF building blocks, in combination with a stateful agent that completes the workflow. The `inputBuffer` is a `BufferBlock` type that, as the name implies, buffers the incoming chunks of bytes read from the source stream and holds these items to pass them to the next block in the flow, the linked `TransformBlock` `compressor` (the code to note is in bold).
Listing 12.9 Parallel stream compression and encryption using TDF

```
async Task CompressAndEncrypt(
    Stream streamSource, Stream streamDestination,
    long chunkSize = 1048576, CancellationTokenSource cts = null)
{
    cts = cts ?? new CancellationTokenSource();   ①

    var compressorOptions = new ExecutionDataflowBlockOptions {
        **MaxDegreeOfParallelism** = Environment.ProcessorCount,
        **BoundedCapacity** = 20,   ②
        CancellationToken = cts.Token
    };

    var inputBuffer = new **BufferBlock**<CompressingDetails>(
        new DataflowBlockOptions {
            CancellationToken = cts.Token,
            **BoundedCapacity** = 20   ②
        });

    var compressor = new **TransformBlock**<CompressingDetails, CompressedDetails>(
        **async** details => {
            var compressedData = **await** IOUtils.Compress(details.Bytes);   ③
            return details.ToCompressedDetails(compressedData);   ④
        }, compressorOptions);

    var encryptor = new **TransformBlock**<CompressedDetails, EncryptDetails>(
        **async** details => {
            byte[] data = IOUtils.CombineByteArrays(details.CompressedDataSize,
                details.ChunkSize, details.Bytes);   ⑤
            var encryptedData = **await** IOUtils.Encrypt(data);   ⑥
            return details.ToEncryptDetails(encryptedData);   ④
        }, compressorOptions);

    var asOrderedAgent = **Agent**.Start((new Dictionary<int, EncryptDetails>(), 0),
        **async** ((Dictionary<int, EncryptDetails>, int) state,
                EncryptDetails msg) => {   ⑦
            (Dictionary<int, EncryptDetails> details, int lastIndexProc) = state;
            details.Add(msg.Sequence, msg);
            while (details.ContainsKey(lastIndexProc + 1)) {
                msg = details[lastIndexProc + 1];
                await streamDestination.WriteAsync(msg.EncryptedDataSize, 0,
                    msg.EncryptedDataSize.Length);
                await streamDestination.WriteAsync(msg.Bytes, 0,
                    msg.Bytes.Length);   ⑧
                lastIndexProc = msg.Sequence;
                details.Remove(lastIndexProc);   ⑨
            }
            return (details, lastIndexProc);
        }, cts);

    var writer = new ActionBlock<EncryptDetails>(**async** details =>
        **await** asOrderedAgent.Send(details), compressorOptions);   ⑫

    var linkOptions = new DataflowLinkOptions { PropagateCompletion = true };
    inputBuffer.**LinkTo**(compressor, linkOptions);   ⑬
    compressor.**LinkTo**(encryptor, linkOptions);   ⑬
    encryptor.**LinkTo**(writer, linkOptions);   ⑬

    long sourceLength = streamSource.Length;
    byte[] size = BitConverter.GetBytes(sourceLength);
    **await** streamDestination.WriteAsync(size, 0, size.Length);   ⑭

    chunkSize = Math.Min(chunkSize, sourceLength);   ⑮
    int indexSequence = 0;
    while (sourceLength > 0) {
        byte[] data = new byte[chunkSize];
        int readCount = **await** streamSource.ReadAsync(data, 0, data.Length);   ⑯
        byte[] bytes = new byte[readCount];
        Buffer.BlockCopy(data, 0, bytes, 0, readCount);
        var compressingDetails = new CompressingDetails {
            Bytes = bytes,
            ChunkSize = BitConverter.GetBytes(readCount),
            Sequence = ++indexSequence
        };
        **await** inputBuffer.SendAsync(compressingDetails);   ⑰
        sourceLength -= readCount;   ⑱
        if (sourceLength < chunkSize) chunkSize = sourceLength;   ⑱
        if (sourceLength == 0) inputBuffer.Complete();   ⑲
    }

    **await** inputBuffer.Completion.ContinueWith(task => compressor.Complete());
    **await** compressor.Completion.ContinueWith(task => encryptor.Complete());
    **await** encryptor.Completion.ContinueWith(task => writer.Complete());
    **await** writer.Completion;
    **await** streamDestination.FlushAsync();
}
```

The bytes read from the stream are sent to the buffer block by using the `SendAsync` method:

```
var compressingDetails = new CompressingDetails {
    Bytes = bytes,
    ChunkSize = BitConverter.GetBytes(readCount),
    Sequence = ++indexSequence
};
**await** inputBuffer.**SendAsync**(compressingDetails);
```

Each chunk of bytes read from the stream source is wrapped
into the data structure’s `CompressingDetails`, which contains the additional information of byte-array size. The monotonic value is later used in the sequence of chunks generated to preserve the order. A *monotonic value* is a function between ordered sets that preserves or reverses the given value, and the value always either decreases or increases. The order of the block is important both for a correct compression–encryption operation and for correct decryption and decompression into the original shape. In general, if the purpose of the block is purely to forward item operations from one block to several others, then you don’t need the `BufferBlock`. But in the case of reading a large or continuous stream of data, this block is useful for taming the backpressure generated from the massive amount of data partitioned to the process by setting an appropriate `BoundedCapacity`. In this example, the `BoundedCapacity` is restricted to a capacity of 20 items. When there are 20 items in this block, it will stop accepting new items until one of the existing items passes to the next block. Because the dataflow source of data originated from asynchronous I/O operations, there’s a risk of potentially large amounts of data to process. It’s recommended that you limit internal buffering to throttle the data by setting the `BoundedCapacity` property in the options defined when constructing the `BufferBlock`. The next two block types are compression transformation and encryption transformation. During the first phase (compression), the `TransformBlock` applies the compression to the chunk of bytes and enriches the message received `CompressingDetails` with the relative data information, which includes the compressed byte array and its size. This information persists as part of the output stream accessible during the decompression. The second phase (encryption) enciphers the chunk of compressed byte array and creates a sequence of bytes resulting from the composition of three arrays: `CompressedDataSize`, `ChunkSize`, and data array. This structure instructs the decompression and decryption algorithms to target the right portion of bytes to revert from the stream. ### 12.5.2 Ensuring the order integrity of a stream of messages The TDF documentation guarantees that `TransformBlock` will propagate the messages in the same order in which they arrived. Internally, `TransformBlock` uses a reordering buffer to fix any out-of-order issues that might arise from processing multiple messages concurrently. Unfortunately, due to the high number of asynchronous and intensive I/O operations running in parallel, keeping the integrity of the message order doesn’t apply to this case. This is why you implemented the additional sequential ordering preservation using monotonically values. If you decide to *send* or *stream* the data over the network, then the guarantee of delivering the packages in the correct sequence is lost, due to variables such as the unpredictable bandwidth and unreliable network connection. To safeguard the order integrity when processing chunks of data, your final step in the workflow is the stateful `asOrderedAgent` agent. This agent behaves as a *multiplexer* by reassembling the items and persists them in the local filesystem, maintaining the correct sequence. The order value of the sequence is kept in a property of the `EncryptDetails` data structure, which is received by the agent as a message. 
The accuracy of the whole computation requires preserving the order of the source sequence and its partitions, to ensure that the order is consistent at merge time. The state of this agent is preserved using a tuple. The first item of the tuple is a collection, `Dictionary<int, EncryptDetails>`, where the key represents the sequence value of the original order in which the data was sent. The second item, `lastIndexProc`, is the index of the last item processed, which prevents reprocessing the same chunks of data more than once. The body of `asOrderedAgent` runs a `while` loop that uses the `lastIndexProc` value to make sure the processing of the chunks starts from the first unprocessed item. The loop continues to iterate as long as the items form a contiguous sequence; otherwise, it breaks out of the loop and waits for the next message, which might fill the gap in the sequence. The `asOrderedAgent` agent is plugged into the workflow through the TDF `ActionBlock` `writer`, which sends it the `EncryptDetails` data structure for the final step.

### 12.5.3 Linking, propagating, and completing

The TDF blocks in the compress–encrypt workflow are linked using the `LinkTo` extension method, which by default propagates only data (messages). But if the workflow is linear, as in this example, it's good practice to share information among the blocks through automatic notification, such as when the work has finished or an error occurs. This behavior is achieved by constructing the `LinkTo` method with the optional `DataflowLinkOptions` argument and the `PropagateCompletion` property set to true. Here's the code from the previous example with this option built in:

```
var **linkOptions** = new DataflowLinkOptions { PropagateCompletion = true };
inputBuffer.LinkTo(compressor, **linkOptions**);
compressor.LinkTo(encryptor, **linkOptions**);
encryptor.LinkTo(writer, **linkOptions**);
```

The `PropagateCompletion` optional property tells the dataflow block to automatically propagate its results and exceptions to the next stage when it completes. This works together with calling the `Complete` method when the input buffer reaches the end of the stream:

```
if (sourceLength < chunkSize) chunkSize = sourceLength;
if (sourceLength == 0) inputBuffer.**Complete**();
```

Then the completion notification flows through the chain of dataflow blocks in a cascade:

```
**await** inputBuffer.Completion.ContinueWith(task => compressor.**Complete**());
**await** compressor.Completion.ContinueWith(task => encryptor.**Complete**());
**await** encryptor.Completion.ContinueWith(task => writer.**Complete**());
**await** writer.**Completion**;
```

Ultimately, you can run the code as follows:

```
using (var streamSource = new FileStream(sourceFile, FileMode.OpenOrCreate,
           FileAccess.Read, FileShare.None, useAsync: true))
using (var streamDestination = new FileStream(destinationFile, FileMode.Create,
           FileAccess.Write, FileShare.None, useAsync: true))
    **await** **CompressAndEncrypt**(streamSource, streamDestination);
```

Table 12.1 shows the benchmarks for compressing and encrypting different file sizes, including the inverse operation of decrypting and decompressing. Each benchmark result is the average of three runs of the operation.
Table 12.1 Benchmarks for compressing and encrypting different file sizes

| File size in GB | Degree of parallelism | Compress-encrypt time in seconds | Decrypt-decompress time in seconds |
| --- | --- | --- | --- |
| 3 | 1 | 524.56 | 398.52 |
| 3 | 4 | 123.64 | 88.25 |
| 3 | 8 | 69.20 | 45.93 |
| 12 | 1 | 2249.12 | 1417.07 |
| 12 | 4 | 524.60 | 341.94 |
| 12 | 8 | 287.81 | 163.72 |

### 12.5.4 Rules for building a TDF workflow

Here are a few good rules and practices for successfully implementing TDF in your workflow:

* *Do one thing, and do it well.* This is a principle of modern OOP, the *single responsibility principle* ([en.wikipedia.org/wiki/Single_responsibility_principle](https://en.wikipedia.org/wiki/Single_responsibility_principle)). The idea is that your block should perform only one action and should have only one reason to change.
* *Design for composition.* In the OOP world, this is known as the *open/closed principle* ([en.wikipedia.org/wiki/Open/closed_principle](https://en.wikipedia.org/wiki/Open/closed_principle)): the dataflow building blocks are designed to be open for extension but closed to modification.
* *DRY.* This principle (don't repeat yourself) encourages you to write reusable code and reusable dataflow building-block components.

### 12.5.5 Meshing Reactive Extensions (Rx) and TDF

TDF and Rx (discussed in chapter 6) have important similarities, despite having independent characteristics and strengths, and these libraries complement each other, making them easy to integrate. TDF is closer to an agent-based programming model, focused on providing building blocks for message passing, which simplifies the implementation of parallel CPU- and I/O-intensive applications with high throughput and low latency, while also giving developers explicit control over how data is buffered. Rx leans more toward the functional paradigm, providing a vast set of operators predominantly focused on the coordination and composition of event streams with a LINQ-based API.

TDF has built-in support for integrating with Rx, which allows it to expose the source dataflow blocks as both observables and observers. The `AsObservable` extension method transforms TDF blocks into an observable sequence, which allows the output of the dataflow chain to flow efficiently into an arbitrary set of Reactive fluent extension methods for further processing. Specifically, the `AsObservable` extension method constructs an `IObservable<T>` for an `ISourceBlock<T>`.

Let's see the integration of Rx and TDF in action. In listing 12.9, the last block of the parallel compress-encrypt stream dataflow is the stateful `asOrderedAgent`. The particularity of this component is the presence of an internal state that keeps track of the messages received and their order. As mentioned, the constructor signature of a stateful agent is similar to the LINQ `Aggregate` operator, which in Rx terms can be replaced with the Rx `Observable.Scan` operator. This operator is covered in chapter 6. The following listing shows the integration between Rx and TDF by replacing the `asOrderedAgent` agent in the last block of the parallel compress-encrypt stream workflow.
Listing 12.10 Integrating Reactive Extensions with TDF

```
inputBuffer.LinkTo(compressor, linkOptions);
compressor.LinkTo(encryptor, linkOptions);
encryptor.AsObservable()                                          ①
    .Scan((new Dictionary<int, EncryptDetails>(), 0),
        (state, msg) => Observable.FromAsync(async () =>          ②
        {
            (Dictionary<int, EncryptDetails> details, int lastIndexProc) = state;
            details.Add(msg.Sequence, msg);
            while (details.ContainsKey(lastIndexProc + 1))
            {
                msg = details[lastIndexProc + 1];
                await streamDestination.WriteAsync(msg.EncryptedDataSize, 0,
                    msg.EncryptedDataSize.Length);
                await streamDestination.WriteAsync(msg.Bytes, 0, msg.Bytes.Length);
                lastIndexProc = msg.Sequence;
                details.Remove(lastIndexProc);
            }
            return (details, lastIndexProc);
        }).SingleAsync().Wait())
    .SubscribeOn(TaskPoolScheduler.Default).Subscribe();          ③
```

As you can see, you swapped the `asOrderedAgent` for the Rx `Observable.Scan` operator without changing the internal functionality. TDF blocks and Rx observable streams can complete successfully or with errors, and the `AsObservable` method will translate the block completion (or fault) into the completion of the observable stream. But if the block faults with an exception, that exception will be wrapped in an `AggregateException` when it's passed to the observable stream. This is similar to how linked blocks propagate their faults.

## Summary

* A system written using TPL Dataflow benefits from a multicore system because all the blocks that compose a workflow can run in parallel.
* TDF enables effective techniques for running embarrassingly parallel problems, where many independent computations can be executed in parallel in an evident way.
* TDF has built-in support for throttling and asynchrony, improving both I/O-bound and CPU-bound operations. In particular, it provides the ability to build responsive client applications while still getting the benefits of massively parallel processing.
* TDF can be used to parallelize the workflow to compress and encrypt a large stream of data by processing blocks at different rates.
* The combination and integration of Rx and TDF simplifies the implementation of parallel CPU- and I/O-intensive applications, while also providing developers explicit control over how data is buffered.
Part 3
Applying modern patterns of concurrent programming
The third and final part of this book lets you put into practice all the functional concurrent programming techniques you've learned so far. These chapters will become an essential reference for questions and answers about your concurrency problems.
Chapter 13 covers recipes that use the functional paradigm to solve common and complex problems you may run into in concurrent applications. Chapter 14 walks you in detail through the full implementation of a scalable, highly performant stock market server application, which includes iOS and WPF versions of the client.
The functional paradigm principles learned in this book are applied in the design and architecture decisions, as well as in the code development, to achieve a high-performance and scalable solution. In this part you'll see the positive impact of applying functional principles to reduce bugs and increase maintainability.
13
Recipes and design patterns for successful concurrent programming
This chapter covers
- Twelve code recipes for solving common parallel programming problems
The 12 recipes presented in this chapter have broad application. When you face a similar problem and need a quick answer, you can use their core ideas as a reference. The material demonstrates how the functional concurrency abstractions covered in this book can solve complex problems by developing sophisticated and rich functions with a relatively small number of lines of code. I've kept the recipe implementations as simple as possible, so at times you'll need to handle aspects such as cancellation and exception handling yourself.
This chapter shows you how to combine everything you've learned so far, using functional programming abstractions as the glue to write efficient and performant programs. By the end of this chapter, you'll have a set of useful and reusable tools for solving common concurrent programming problems.
Each recipe is written in C# or F#; for most of the code implementations, you can find both versions in the downloadable online code. Also keep in mind that F# and C# are .NET programming languages with interoperability support for interacting with each other: you can easily consume a C# program from F#, and vice versa.
## 13.1 Recycling objects to reduce memory consumption

In this section, you'll implement a reusable asynchronous object pool. You should use this object pool in cases where recycling objects helps reduce memory consumption. Minimizing the number of GC generations lets your program enjoy better performance. Figure 13.1, repeated from chapter 12, shows how the concurrent Producer/Consumer pattern from chapter 12 (listing 12.9) is applied to compress and encrypt a large file in parallel.



Figure 13.1 The `Transform` blocks process the messages in parallel. When an operation completes, the result is sent to the next block. The purpose of the aggregate agent is to maintain the order integrity of the messages, similar to the `AsOrdered` PLINQ extension method.

The `CompressAndEncrypt` function from listing 12.9 splits a large file into a set of byte-array chunks, which produces a large number of GC generations because of the high memory consumption. Each memory block is created, processed, and collected when memory pressure hits the trigger point that demands more resources.

This high volume of creating and destroying byte arrays causes many GC generations, which negatively affect the overall performance of the application. In fact, the program allocates a considerable number of memory buffers (byte arrays) over its entire execution in a multithreaded fashion, which means that several threads can allocate the same amount of memory at the same time. Consider that each buffer is 4,096 bytes of memory and that 25 threads are running simultaneously: in this case, about 102,400 bytes are allocated on the heap at the same time. Furthermore, as each thread completes its execution, many buffers go out of scope, forcing the GC to start a generation. This is bad for performance, because the application is under heavy memory-management pressure.
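As an illustration, the allocation-heavy pattern described here looks roughly like the following sketch (the `Process` step is a hypothetical stand-in for the compress/encrypt work):

```
// Sketch of the allocation-heavy pattern: every chunk read allocates a
// fresh 4,096-byte array that soon becomes garbage, driving up GC work.
static async Task CopyInChunksAsync(Stream stream, Action<byte[]> Process)
{
    var chunk = new byte[4096];
    int read;
    while ((read = await stream.ReadAsync(chunk, 0, chunk.Length)) > 0)
    {
        Process(chunk);           // hypothetical compress/encrypt step
        chunk = new byte[4096];   // a new allocation per chunk -> GC pressure
    }
}
```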
### 13.1.1 Solution: asynchronously recycling a pool of objects

To optimize the performance of a concurrent application with intense memory consumption, recycle the objects that would otherwise be garbage collected by the system. In the parallel compress-and-encrypt stream example, you want to reuse the same byte buffers (byte arrays) generated, instead of creating new ones. This can be achieved using an `ObjectPool` class, designed to provide a cached pool of objects for recycling unused items. This reuse of objects avoids expensive resource acquisition and release, minimizing potential memory allocations. Specifically, for this highly concurrent example, you need a thread-safe and non-blocking (task-based) concurrent object pool (figure 13.2).



Figure 13.2 The object pool can handle multiple concurrent requests from multiple consumers asynchronously to reuse objects. When consumers complete their work, they send the objects back to the object pool. Internally, the object pool generates a queue of objects using the given factory delegate. These objects are then recycled to reduce memory consumption and the cost of new instantiations.

In listing 13.1, the implementation of `ObjectPoolAsync` is based on TDF, using a `BufferBlock` as a building block. `ObjectPoolAsync` pre-initializes a set of objects for the application to use and reuse as needed. Additionally, TDF is intrinsically thread safe while providing asynchronous, non-blocking semantics.

Listing 13.1 Asynchronous object pool implementation using TDF
```
public class ObjectPoolAsync<T> : IDisposable
{
    private readonly BufferBlock<T> buffer;    ①
    private readonly Func<T> factory;          ②
    private readonly int msecTimeout;

    public ObjectPoolAsync(int initialCount, Func<T> factory,
        CancellationToken cts, int msecTimeout = 0)
    {
        this.msecTimeout = msecTimeout;
        buffer = new BufferBlock<T>(           ①
            new DataflowBlockOptions { CancellationToken = cts });
        this.factory = () => factory();        ②
        for (int i = 0; i < initialCount; i++)
            buffer.Post(this.factory());       ③
    }

    public Task<bool> PutAsync(T item) => buffer.SendAsync(item);   ④

    public Task<T> GetAsync(int timeout = 0)   ⑤
    {
        var tcs = new TaskCompletionSource<T>();
        // Uses the per-call timeout when provided; otherwise the pool default
        buffer.ReceiveAsync(TimeSpan.FromMilliseconds(
                timeout > 0 ? timeout : msecTimeout))
            .ContinueWith(task =>
            {
                if (task.IsFaulted)
                    if (task.Exception.InnerException is TimeoutException)
                        tcs.SetResult(factory());
                    else
                        tcs.SetException(task.Exception);
                else if (task.IsCanceled)
                    tcs.SetCanceled();
                else
                    tcs.SetResult(task.Result);
            });
        return tcs.Task;
    }

    public void Dispose() => buffer.Complete();
}
```
`ObjectPoolAsync` takes as parameters the initial number of objects to create and a factory delegate constructor. `ObjectPoolAsync` exposes two functions to coordinate the recycling of objects (a usage sketch follows the list):

* `PutAsync` puts an item into the pool asynchronously.
* `GetAsync` gets an item from the pool asynchronously.
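Here's a minimal usage sketch (hypothetical, not from the book's downloadable code), renting and returning the 4,096-byte buffers discussed earlier:

```
// Rent and return 4 KB buffers through the ObjectPoolAsync<T> of listing 13.1.
var cts = new CancellationTokenSource();
var pool = new ObjectPoolAsync<byte[]>(25, () => new byte[4096], cts.Token);

byte[] buffer = await pool.GetAsync();
try
{
    // ... fill and process the recycled buffer ...
}
finally
{
    await pool.PutAsync(buffer);   // return it for reuse instead of discarding it
}
```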
In the downloadable source code, you can find the complete solution of the `CompressAndEncrypt` program updated to use `ObjectPoolAsync`. Figure 13.3 is a graphical comparison of the GC generations between the original and the new versions of the program for different file sizes.



Figure 13.3 Comparing the `CompressAndEncrypt` program from chapter 12, implemented with and without `AsyncObjectPool`, processing different large files (1 GB, 2 GB, and 3 GB). The implementation using the object pool has fewer GC generations compared with the original version. Minimizing GC generations results in better performance.

The results shown in the chart demonstrate how the `CompressAndEncrypt` program implemented with `ObjectPoolAsync` substantially reduces the number of GC generations, which speeds up the overall performance of the application. On an eight-core machine, the new version of `CompressAndEncrypt` is approximately 8% faster.
## 13.2 Custom parallel Fork/Join operator

In this section, you'll implement a reusable extension method to parallelize a Fork/Join operation. Let's say that you detect a piece of code in your program whose performance would benefit from being executed in parallel using a divide-and-conquer pattern. You decide to refactor the code to use a concurrent Fork/Join pattern (figure 13.4). And the more you inspect the program, the more similar patterns emerge.



Figure 13.4 The Fork/Join pattern splits a task into subtasks that can be executed independently in parallel. When the operations complete, the subtasks are joined again. It's no coincidence that this pattern is often used to achieve data parallelism; in fact, there are clear similarities.

Unfortunately, .NET has no built-in support for a parallel Fork/Join extension method that can be reused on demand. But you can create one, and more, to have a reusable and flexible operator that does the following:

* Partitions the data
* Applies the Fork/Join pattern in parallel
* Optionally lets you configure the degree of parallelism
* Merges the results using a reduce function

The .NET `Task.WhenAll` operator and the F# `Async.Parallel` can compose a given set of tasks in parallel, but neither operator provides an aggregate (or reduce) functionality to join the results. Furthermore, they lack configurability for controlling the degree of parallelism. To get the operator you want, you need a tailored solution.
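To see the gap, here's a minimal sketch of what the built-in composition gives you: the reduce step is a separate pass, and there's no knob for the degree of parallelism:

```
// Task.WhenAll runs the tasks in parallel and returns all results as an
// array, but any aggregation happens afterward, not during the run.
int[] squares = await Task.WhenAll(
    Task.Run(() => 1 * 1),
    Task.Run(() => 2 * 2),
    Task.Run(() => 3 * 3));
int sum = squares.Sum();   // the reduce is a second, sequential pass
```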
### 13.2.1 Solution: composing a pipeline of steps that form the Fork/Join pattern

With TDF, you can compose different building blocks together as a pipeline. You can use a pipeline to define the steps of the Fork/Join pattern (figure 13.5), where the Fork step runs a set of tasks in parallel, the next step joins the results, and the last step applies a reduce block to produce the final output. For the later step of the workflow, which aggregates the results, you need an object that maintains the state from the previous steps. In this case, you use the agent-based block built on top of TDF from chapter 12.

The Fork/Join pattern is implemented as an extension method over a generic `IEnumerable`, for convenient access from your code in a fluent style, as shown in listing 13.2.



Figure 13.5 The Fork/Join pattern implemented with TDF, where each step of the computation is defined with a different dataflow block

Listing 13.2 Parallel ForkJoin using TDF
```
public static async Task<R> ForkJoin<T1, T2, R>(
    this IEnumerable<T1> source,
    Func<T1, Task<IEnumerable<T2>>> map,                       ①
    Func<R, T2, Task<R>> aggregate,                            ①
    R initialState, CancellationTokenSource cts = null,
    int partitionLevel = 8, int boundCapacity = 20)            ②
{
    cts = cts ?? new CancellationTokenSource();
    var blockOptions = new ExecutionDataflowBlockOptions {     ③
        MaxDegreeOfParallelism = partitionLevel,
        BoundedCapacity = boundCapacity,
        CancellationToken = cts.Token
    };
    var inputBuffer = new BufferBlock<T1>(                     ④
        new DataflowBlockOptions {
            CancellationToken = cts.Token,
            BoundedCapacity = boundCapacity
        });
    var mapperBlock = new TransformManyBlock<T1, T2>(map, blockOptions);  ④
    var reducerAgent = Agent.Start(initialState, aggregate, cts);         ④
    var linkOptions = new DataflowLinkOptions { PropagateCompletion = true };
    inputBuffer.LinkTo(mapperBlock, linkOptions);              ⑤
    IDisposable disposable = mapperBlock.AsObservable()
        .Subscribe(async item => await reducerAgent.Send(item));          ⑥
    foreach (var item in source)
        await inputBuffer.SendAsync(item);                     ⑦
    inputBuffer.Complete();
    var tcs = new TaskCompletionSource<R>();
    await inputBuffer.Completion.ContinueWith(task =>
        mapperBlock.Complete());
    await mapperBlock.Completion.ContinueWith(task => {        ⑧
        var agent = reducerAgent as StatefulDataflowAgent<R, T2>;
        disposable.Dispose();
        tcs.SetResult(agent.State);
    });
    return await tcs.Task;
}
```
The `ForkJoin` extension method takes as parameters an `IEnumerable` source to process, a map function to transform its items, and an aggregate (reduce) function to merge all the results of the map computations. The `initialState` parameter is the initial state value required by the aggregate function. But if the result type `T2` can be combined (because it satisfies the monoid laws), you can modify the method to use a reduce function with a zero initial state, as explained in listing 5.10.

The underlying dataflow blocks are linked to form a pipeline. Interestingly, the `mapperBlock` is converted into an observable using the `AsObservable` extension method, which is then subscribed to send messages to the `reducerAgent` as the outputs materialize. The `partitionLevel` and `boundCapacity` values are used to set the degree of parallelism and the bounded capacity, respectively.

Here's a simple example of how to exploit the `ForkJoin` operator:

```
Task<long> sum = Enumerable.Range(1, 100000)
    .ForkJoin<int, long, long>(
        async x => new[] { (long)x * x },
        async (state, x) => state + x, 0L);
```

The previous code computes the sum of the squares of all the numbers from 1 to 100,000 using the Fork/Join pattern.
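For comparison, a roughly equivalent PLINQ version is sketched below; the custom `ForkJoin` adds what PLINQ doesn't expose here: a bounded input buffer and an explicit partition level:

```
// Same computation with PLINQ (sketch): correct, but with no control over
// bounded capacity or the exact degree of parallelism used by ForkJoin.
long plinqSum = ParallelEnumerable.Range(1, 100000)
    .Select(x => (long)x * x)
    .Aggregate(0L, (acc, x) => acc + x);
```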
## 13.3 Parallelizing tasks with dependencies: designing code to optimize performance

Let's say you need to write a tool that executes a series of asynchronous tasks, each with different dependencies that influence the order of the operations. You could address this with sequential and imperative execution; but if you want to maximize performance, sequential operations won't do. Instead, you must build tasks that can run in parallel. Many concurrency problems can be considered a static collection of atomic operations with dependencies between their inputs and outputs. On completion, an operation's output is used as input to other dependent operations. To optimize performance, these tasks need to be scheduled according to their dependencies, and the algorithm must be optimized to run dependent tasks serially when necessary and in parallel as much as possible.

You need a reusable component that runs a series of tasks in parallel, ensuring that all the dependencies that could influence the order of the operations are respected. How do you create a programming model that exposes the underlying parallelism of a set of operations, executing them efficiently, either in parallel or serially, according to their dependencies on other operations?

### 13.3.1 Solution: implementing a dependencies graph of tasks

The solution is called a directed acyclic graph (DAG); it forms a graph by breaking down operations into a series of atomic tasks with defined dependencies. The acyclic nature of the graph is important because it removes the possibility of deadlocks between tasks, provided the tasks are truly atomic. When specifying the graph, it's important to understand all the dependencies between tasks, especially hidden dependencies that may result in deadlocks or race conditions. Figure 13.6 is a typical example of a graph-shaped data structure that can be used to represent scheduling constraints between the operations of the graph. Graphs are extremely powerful data structures in computer science, and they give rise to powerful algorithms.



Figure 13.6 A graph is a collection of vertices connected by edges. In this representation of a directed acyclic graph, node 1 has dependencies on nodes 4 and 5, node 2 depends on node 5, node 3 has dependencies on nodes 5 and 6, and so on.
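Concretely, the dependencies of figure 13.6 can be written down as an adjacency map; this sketch matches the task registrations shown later in this section:

```
// The edges of figure 13.6 as an adjacency map: key = task id,
// value = the ids of the tasks it depends on.
var dependencies = new Dictionary<int, int[]>
{
    [1] = new[] { 4, 5 },
    [2] = new[] { 5 },
    [3] = new[] { 6, 5 },
    [4] = new[] { 6 },
    [5] = new[] { 7, 8 },
    [6] = new[] { 7 },
    [7] = Array.Empty<int>(),   // no dependencies: can start immediately
    [8] = Array.Empty<int>()
};
```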
You can apply the DAG structure as a strategy for running tasks in parallel while respecting the order of the dependencies for increased performance. You can define this graph structure using the F# `MailboxProcessor`, which keeps an internal state for the tasks registered for execution, in the shape of their edge dependencies.

The following example uses the F# `MailboxProcessor` as a perfect candidate to implement a DAG for parallelizing operations with dependencies. First, let's define the discriminated unions used to manage the tasks and run their dependencies.
Listing 13.3 Message types and data structure used to coordinate task execution

```
type TaskMessage =                                        ①
    | AddTask of int * TaskInfo
    | QueueTask of TaskInfo
    | ExecuteTasks
and TaskInfo =                                            ②
    { Context : System.Threading.ExecutionContext
      Edges : int array; Id : int; Task : Func<Task>
      EdgesLeft : int option; Start : DateTimeOffset option
      End : DateTimeOffset option }
```

The `TaskMessage` type represents the message cases sent to the underlying agent of `ParallelTasksDAG`, implemented in listing 13.4. These messages are used for task coordination and dependency synchronization. The `TaskInfo` type contains and tracks the details of the registered tasks during the execution of the DAG, including the dependency edges. The execution context (mng.bz/2F9o) is captured so that information can be accessed during the delayed execution, such as the current user, any state associated with the logical thread of execution, code-access security information, and so forth. The start and end of the execution time are published when the events are triggered.
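For clarity, the capture-and-run pattern used in the next listing looks like this in C# (a self-contained sketch; `RunTask` and the state value are placeholders, not the book's code):

```
using System;
using System.Threading;

class ContextSketch
{
    static void RunTask(object state) =>
        Console.WriteLine($"running with state {state}");

    static void Main()
    {
        // Capture the ambient execution context on the registering thread.
        ExecutionContext ctx = ExecutionContext.Capture();

        // Later (possibly on another thread), replay the callback inside
        // a copy of the captured context.
        ExecutionContext.Run(ctx.CreateCopy(), state => RunTask(state), "task-state");
    }
}
```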
Listing 13.4 The DAG F# agent to parallelize the execution of operations

```
type ParallelTasksDAG() =
    let onTaskCompleted = new Event<TaskInfo>()                        ①
    let dagAgent = MailboxProcessor<TaskMessage>.Start(fun inbox ->
        let rec loop (tasks : Dictionary<int, TaskInfo>)               ②
                     (edges : Dictionary<int, int list>) = async {     ②
            let! msg = inbox.Receive()                                 ③
            match msg with
            | ExecuteTasks ->                                          ④
                let fromTo = new Dictionary<int, int list>()
                let ops = new Dictionary<int, TaskInfo>()              ⑤
                for KeyValue(key, value) in tasks do                   ⑥
                    let operation =
                        { value with EdgesLeft = Some(value.Edges.Length) }
                    for from in operation.Edges do
                        let exists, lstDependencies = fromTo.TryGetValue(from)
                        if not <| exists then
                            fromTo.Add(from, [ operation.Id ])
                        else fromTo.[from] <- (operation.Id :: lstDependencies)
                    ops.Add(key, operation)
                ops |> Seq.iter (fun kv ->                             ⑥
                    match kv.Value.EdgesLeft with
                    | Some(n) when n = 0 -> inbox.Post(QueueTask(kv.Value))
                    | _ -> ())
                return! loop ops fromTo
            | QueueTask(op) ->                                         ⑦
                Async.Start <| async {                                 ⑦
                    let start = DateTimeOffset.Now
                    match op.Context with                              ⑧
                    | null -> do! op.Task.Invoke() |> Async.AwaitTask
                    | ctx -> ExecutionContext.Run(ctx.CreateCopy(),    ⑨
                                (fun op ->
                                    let opCtx = (op :?> TaskInfo)
                                    opCtx.Task.Invoke()
                                        .ConfigureAwait(false) |> ignore), op)
                    let end' = DateTimeOffset.Now
                    onTaskCompleted.Trigger { op with Start = Some(start)
                                                      End = Some(end') }   ⑫
                    let exists, deps = edges.TryGetValue(op.Id)
                    if exists && deps.Length > 0 then
                        let depOps = getDependentOperation deps tasks []
                        edges.Remove(op.Id) |> ignore
                        depOps |> Seq.iter (fun nestedOp ->
                            inbox.Post(QueueTask(nestedOp))) }
                return! loop tasks edges
            | AddTask(id, op) ->                                       ⑬
                tasks.Add(id, op)
                return! loop tasks edges }
        loop (new Dictionary<int, TaskInfo>(HashIdentity.Structural))
             (new Dictionary<int, int list>(HashIdentity.Structural)))

    [<CLIEventAttribute>]
    member this.OnTaskCompleted = onTaskCompleted.Publish              ⑫
    member this.ExecuteTasks() = dagAgent.Post ExecuteTasks            ⑭
    member this.AddTask(id, task, [<ParamArray>] edges : int array) =
        let data = { Context = ExecutionContext.Capture()
                     Edges = edges; Id = id; Task = task
                     EdgesLeft = None; Start = None; End = None }
        dagAgent.Post(AddTask(id, data))                               ⑮
```
The purpose of the `AddTask` function is to register a task, including arbitrary dependency edges. This function accepts a unique ID, the function task that must be executed, and a set of edges representing the IDs of other registered tasks, all of which must be completed before the current task can execute. If the array is empty, there are no dependencies. The `MailboxProcessor` named `dagAgent` keeps the registered tasks in the current state `tasks`, which is a map between the ID and the details of each task (`tasks : Dictionary<int, TaskInfo>`). The agent also keeps the state of the edge dependencies for each task ID (`edges : Dictionary<int, int list>`). The `Dictionary` collections are mutable because the state changes during the execution of `ParallelTasksDAG`, and because they inherit thread safety from living inside the agent. When the agent receives the notification to start the execution, part of the process involves verifying that all the edge dependencies are registered and that there are no cycles in the graph. This validation step is available in the full implementation of `ParallelTasksDAG` in the downloadable source code. The following code is a C# example that references and consumes the F# library to run the `ParallelTasksDAG`. The registered tasks reflect the dependencies of figure 13.6:
```
Func<int, int, Func<Task>> action = (id, delay) => async () =>
{
    Console.WriteLine($"Starting operation {id} in Thread Id {Thread.CurrentThread.ManagedThreadId}...");
    await Task.Delay(delay);
};

var dagAsync = new DAG.ParallelTasksDAG();
dagAsync.OnTaskCompleted.Subscribe(op =>
    Console.WriteLine($"Operation {op.Id} completed in Thread Id {Thread.CurrentThread.ManagedThreadId}"));

dagAsync.AddTask(1, action(1, 600), 4, 5);
dagAsync.AddTask(2, action(2, 200), 5);
dagAsync.AddTask(3, action(3, 800), 6, 5);
dagAsync.AddTask(4, action(4, 500), 6);
dagAsync.AddTask(5, action(5, 450), 7, 8);
dagAsync.AddTask(6, action(6, 100), 7);
dagAsync.AddTask(7, action(7, 900));
dagAsync.AddTask(8, action(8, 700));
dagAsync.ExecuteTasks();
```

The purpose of the helper function `action` is to print a message when a task starts, indicating the current thread ID as evidence of the multithreaded behavior. The `OnTaskCompleted` event is registered to notify when each task completes, printing to the console the task ID and the current thread ID. Here's the output when the `ExecuteTasks` method is called:
```
Starting operation 8 in Thread Id 23...
Starting operation 7 in Thread Id 24...
Operation 8 Completed in Thread Id 23
Operation 7 Completed in Thread Id 24
Starting operation 5 in Thread Id 23...
Starting operation 6 in Thread Id 25...
Operation 6 Completed in Thread Id 25
Starting operation 4 in Thread Id 24...
Operation 5 Completed in Thread Id 23
Starting operation 2 in Thread Id 27...
Starting operation 3 in Thread Id 30...
Operation 4 Completed in Thread Id 24
Starting operation 1 in Thread Id 28...
Operation 2 Completed in Thread Id 27
Operation 1 Completed in Thread Id 28
Operation 3 Completed in Thread Id 30
```

As you can see, the tasks run in parallel on different threads of execution (different thread IDs), and the dependency order is preserved.
## 13.4 Gate for coordinating concurrent I/O operations sharing a resource: one write, multiple reads

Imagine you're implementing a server application with many concurrent client requests coming in. These concurrent requests reach the server application because they need access to shared data. Occasionally, a request arrives that needs to modify the shared data, and the data must be synchronized.

When a new client request arrives, the thread pool schedules a thread to handle it and begins processing. Imagine that at this point the request wants to update the data in the server in a thread-safe manner. You must face the problem of how to coordinate the read and write operations so that they can access the resource concurrently without blocking. In this context, blocking refers to coordinating access to a shared resource in such a way that a write operation locks out the other operations, taking ownership of the resource until its work completes.

One possible solution is to use a primitive lock such as `ReaderWriterLockSlim` (mng.bz/FY0J), which manages access to a resource by allowing multiple threads to read concurrently or one thread to write exclusively.
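For reference, a minimal sketch of that lock-based approach might look like the following (illustrative names, not the book's code):

```
// Lock-based gate: many readers in parallel, writers exclusive.
// Blocked threads still occupy the thread pool, which is the drawback
// discussed next.
public class LockedStore
{
    private readonly ReaderWriterLockSlim gate = new ReaderWriterLockSlim();
    private readonly Dictionary<int, string> data = new Dictionary<int, string>();

    public string Read(int id)
    {
        gate.EnterReadLock();                 // concurrent with other readers
        try { return data.TryGetValue(id, out var v) ? v : null; }
        finally { gate.ExitReadLock(); }
    }

    public void Write(int id, string value)
    {
        gate.EnterWriteLock();                // exclusive: blocks all readers
        try { data[id] = value; }
        finally { gate.ExitWriteLock(); }
    }
}
```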
But in this book you've learned that you should avoid primitive locks whenever possible. Locks prevent code from running in parallel and, in many cases, overwhelm the thread pool by forcing the creation of a new thread for each request, while other threads are blocked from acquiring access to the same resource. Another drawback is that locks can be held for a long time, so a thread woken from the thread pool to process a read request may immediately be put back to sleep, waiting for a writer thread to complete its task. Moreover, this design isn't scalable.

Finally, read and write operations should be handled separately, so that multiple read operations can happen simultaneously, because those operations don't change the data. This should be balanced by ensuring that write operations are processed one at a time, while blocking read operations from retrieving stale data.

You need a custom coordinator that asynchronously synchronizes read and write operations without blocking. This coordinator should execute one write operation at a time, in sequential order, without blocking any threads, and let the read operations run in parallel.
### 13.4.1 Solution: applying multiple read/write operations to a shared thread-safe resource

The `ReaderWriterAgent` provides non-blocking read/write asynchronous semantics and maintains FIFO order for the operations. It reduces resource consumption and increases the performance of your application. In fact, a `ReaderWriterAgent` can accomplish a large amount of work with only a few threads: regardless of the number of operations made against the `ReaderWriterAgent`, only minimal resources are required.

In the following example, you want to send multiple read and write operations to a shared database. When processed, these operations give read threads higher priority than write threads, as shown in figure 13.7. The same concept can be applied to any other resource, such as a filesystem.



Figure 13.7 The `ReaderWriterAgent` acts as a gate agent to asynchronously synchronize access to a shared resource. In the top diagram, only one write operation executes at a time, while the read operations are queued, asynchronously waiting for the write operation to complete before proceeding. In the bottom diagram, multiple read operations are processed in parallel, according to the configured degree of parallelism.

Listing 13.5 shows the code implementing the `ReaderWriterAgent` using an F# `MailboxProcessor`. The F# `MailboxProcessor` was chosen because it makes it simple to define state machines, which facilitates implementing the read/write asynchronous coordinator. First, you need to define the message types representing the operations that the `ReaderWriterAgent` coordinates and synchronizes for reading and writing.
Listing 13.5 Message types used by the ReaderWriterAgent coordinator

```
type ReaderWriterMsg<'r,'w> =                   ①
    | Command of ReadWriteMessages<'r,'w>
    | CommandCompleted
and ReaderWriterGateState =                     ②
    | SendWrite
    | SendRead of count:int
    | Idle
and ReadWriteMessages<'r,'w> =                  ②
    | Read of r:'r
    | Write of w:'w
```
The `ReaderWriterMsg` message type represents the commands to either read from or write to the database, or the notification that an operation has completed. `ReaderWriterGateState` is a DU used to queue the read/write operations into the `ReaderWriterAgent`. Finally, the `ReadWriteMessages` DU identifies the cases of read/write operations queued internally in the `ReaderWriterAgent`.

This listing shows the implementation of the `ReaderWriterAgent` type.

Listing 13.6 The ReaderWriterAgent coordinates asynchronous operations
```
type ReaderWriterAgent<'r,'w>(workers:int,
       behavior: MailboxProcessor<ReadWriteMessages<'r,'w>> -> Async<unit>,
       ?errorHandler, ?cts:CancellationTokenSource) =                  ①

    let cts = defaultArg cts (new CancellationTokenSource())           ②
    let errorHandler = defaultArg errorHandler ignore                  ②
    let supervisor = MailboxProcessor<Exception>.Start(fun inbox -> async {
        while true do                                                  ③
            let! error = inbox.Receive()
            errorHandler error })

    let agent = MailboxProcessor<ReaderWriterMsg<'r,'w>>.Start((fun inbox ->
        let agents = Array.init workers (fun _ ->                      ④
            (new AgentDisposable<ReadWriteMessages<'r,'w>>(behavior, cts))
                .withSupervisor supervisor)                            ⑤
        cts.Token.Register(fun () ->                                   ⑥
            agents |> Array.iter (fun agent -> (agent :> IDisposable).Dispose()))
        |> ignore
        let writeQueue = Queue<_>()                                    ⑦
        let readQueue = Queue<_>()                                     ⑦
        let rec loop i state = async {
            let! msg = inbox.Receive()
            let next = (i + 1) % workers
            match msg with
            | Command(Read(req)) ->
                match state with                                       ⑧
                | Idle ->
                    agents.[i].Agent.Post(Read(req))
                    return! loop next (SendRead 1)
                | SendRead(n) when writeQueue.Count = 0 ->
                    agents.[i].Agent.Post(Read(req))
                    return! loop next (SendRead(n+1))
                | _ ->
                    readQueue.Enqueue(req)
                    return! loop i state
            | Command(Write(req)) ->                                   ⑨
                match state with
                | Idle ->
                    agents.[i].Agent.Post(Write(req))
                    return! loop next SendWrite
                | SendRead(_) | SendWrite ->
                    writeQueue.Enqueue(req)
                    return! loop i state
            | CommandCompleted ->                                      ⑫
                match state with
                | Idle -> failwith "Operation not possible"
                | SendRead(n) when n > 1 -> return! loop i (SendRead(n-1))
                | SendWrite | SendRead(_) ->
                    if writeQueue.Count > 0 then
                        let req = writeQueue.Dequeue()
                        agents.[i].Agent.Post(Write(req))
                        return! loop next SendWrite
                    elif readQueue.Count > 0 then
                        readQueue |> Seq.iteri (fun j req ->
                            agents.[(i+j) % workers].Agent.Post(Read(req)))
                        let count = readQueue.Count
                        readQueue.Clear()
                        return! loop ((i + count) % workers) (SendRead count)
                    else return! loop i Idle }
        loop 0 Idle), cts.Token)

    let postAndAsyncReply cmd createRequest =
        agent.PostAndAsyncReply(fun ch ->                              ⑬
            createRequest(AsyncReplyChannelWithAck(ch, fun () ->
                agent.Post(CommandCompleted)))
            |> cmd |> ReaderWriterMsg.Command)

    member this.Read(readRequest) = postAndAsyncReply Read readRequest
    member this.Write(writeRequest) = postAndAsyncReply Write writeRequest
```
In the `ReaderWriterAgent` type, the implementation of the underlying F# `MailboxProcessor` is a multi-state machine that coordinates exclusive write access and shared read access to a resource. The `ReaderWriterAgent` creates child agents to access the resource according to the `ReadWriteMessages` message type received. When the agent coordinator receives a `Read` command, pattern matching checks its current state to apply the exclusive-access logic:

* If the state is `Idle`, the `Read` command is sent to a child agent for processing. Because there's no active write operation, the state of the main agent becomes `SendRead`.
* If the state is `SendRead`, the `Read` operation is sent to a child agent for execution, provided there are no pending write operations.
* In all other cases, the `Read` operation is placed in the local read queue to be processed later.

When a `Write` command is sent to the agent coordinator, the message is pattern matched and processed according to the current state:

* If the state is `Idle`, the `Write` command is sent to a child agent's inbox for processing. The state of the main agent then becomes `SendWrite`.
* In all other cases, the `Write` operation is placed in the local write queue to be processed later.
Figure 13.8 shows the `ReaderWriterAgent` multi-state machine.



Figure 13.8 The `ReaderWriterAgent` works as a state machine, where each state aims to asynchronously synchronize access to a shared resource (in this case, a database).

The following code snippet is a simple example that uses the `ReaderWriterAgent`. For simplicity, you aren't accessing a database concurrently, but rather a local mutable dictionary, in a thread-safe and non-blocking fashion:
```
type Person = { id:int; firstName:string; lastName:string; age:int }

let myDB = Dictionary<int, Person>()

// Note: the Get and Add request cases, along with the 'people' sample
// list, are defined in the book's downloadable source code.
let agentSql connectionString =
    fun (inbox: MailboxProcessor<_>) ->
        let rec loop() = async {
            let! msg = inbox.Receive()
            match msg with
            | Read(Get(id, reply)) ->
                match myDB.TryGetValue(id) with
                | true, res -> reply.Reply(Some res)
                | _ -> reply.Reply(None)
            | Write(Add(person, reply)) ->
                let id = myDB.Count
                myDB.Add(id, { person with id = id })
                reply.Reply(Some id)
            return! loop() }
        loop()

let agent = ReaderWriterAgent(maxOpenConnection, agentSql connectionString)

let write person = async {
    let! id = agent.Write(fun ch -> Add(person, ch))
    do! Async.Sleep(100) }

let read personId = async {
    let! resp = agent.Read(fun ch -> Get(personId, ch))
    do! Async.Sleep(100) }

[ for person in people do
    yield write person
    yield read person.id
    yield write person
    yield read person.id
    yield read person.id ]
|> Async.Parallel
```

The code example creates the `agentSql` object, whose purpose is to simulate a database by accessing the local resource `myDB`. The `agent` instance of the `ReaderWriterAgent` class coordinates the parallel read and write operations, accessing the `myDB` dictionary in a concurrent and thread-safe manner without blocking. In a real-world scenario, the mutable collection `myDB` would be a database, a file, or any other kind of shared resource.
## 13.5 Thread-safe random number generator

Often, when working with multithreaded code, you need to generate random numbers for an operation in your program. For example, suppose you're writing a web server application that needs to send audio clips at random when users send requests. For performance reasons, the set of audio clips is loaded into the memory of the server, which is receiving a large number of requests simultaneously. For each request, an audio clip must be chosen at random and sent to the user to play.

In most cases, the `System.Random` class is a sufficiently fast solution for generating random values. But keeping a `Random` instance valid under parallel access becomes a challenging problem to solve in a high-performance style. When an instance of the `Random` class is used by multiple threads, its internal state can be compromised, and it may start returning zero on every call.
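Here's a small sketch (illustrative only) of the hazard just described: a single `Random` instance shared across threads with no synchronization:

```
// Illustrative only: unsynchronized access to one shared Random from many
// threads can corrupt its internal state; a known symptom is Next()
// degrading to returning 0 on every call.
var shared = new Random();
Parallel.For(0, 1_000_000, _ =>
{
    int n = shared.Next();
    if (n == 0) Console.WriteLine("suspicious zero: possibly corrupted state");
});
```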
### 13.5.1 Solution: using the ThreadLocal object

`ThreadLocal<T>` ensures that each thread receives its own instance of the `Random` class, guaranteeing fully thread-safe access even in a multithreaded program. The following listing shows the implementation of a thread-safe random number generator using the `ThreadLocal<T>` class, which provides a strongly typed, locally scoped way to create object instances that are kept separate for each thread.
Listing 13.7 Thread-safe random number generator

```
public class ThreadSafeRandom : Random
{
    private ThreadLocal<Random> random =
        new ThreadLocal<Random>(() => new Random(MakeRandomSeed()));   ①
    public override int Next() => random.Value.Next();                 ②
    public override int Next(int maxValue) =>
        random.Value.Next(maxValue);                                   ②
    public override int Next(int minValue, int maxValue) =>            ②
        random.Value.Next(minValue, maxValue);
    public override double NextDouble() => random.Value.NextDouble();
    public override void NextBytes(byte[] buffer) =>                   ②
        random.Value.NextBytes(buffer);
    static int MakeRandomSeed() =>
        Guid.NewGuid().ToString().GetHashCode();                       ③
}
```
`ThreadSafeRandom` represents a thread-safe pseudo-random number generator. This class subclasses the `Random` class and overrides the `Next`, `NextDouble`, and `NextBytes` methods. The `MakeRandomSeed` method provides a unique value for each instance of the underlying `Random` class that doesn't depend on the system clock.
The constructor for `ThreadLocal<T>` accepts a `Func<T>` delegate to create a thread-local instance of the `Random` class. The `ThreadLocal<T>.Value` property is used to access the underlying value. Here you access the `ThreadSafeRandom` instance from a parallel loop to simulate a concurrent environment. In this example, the parallel loop calls `ThreadSafeRandom` concurrently to obtain a random number for accessing the `clips` array:

```
var safeRandom = new ThreadSafeRandom();
string[] clips = new string[] { "1.mp3", "2.mp3", "3.mp3", "4.mp3" };
Parallel.For(0, 1000, (i) =>
{
    var clipIndex = safeRandom.Next(4);
    var clip = clips[clipIndex];
    Console.WriteLine($"clip to play {clip} - Thread Id {Thread.CurrentThread.ManagedThreadId}");
});
```

Here's the result, in print or on the console:

```
clip to play 2.mp3 - Thread Id 11
clip to play 2.mp3 - Thread Id 8
clip to play 1.mp3 - Thread Id 20
clip to play 2.mp3 - Thread Id 20
clip to play 4.mp3 - Thread Id 13
clip to play 1.mp3 - Thread Id 8
clip to play 4.mp3 - Thread Id 11
clip to play 3.mp3 - Thread Id 11
clip to play 2.mp3 - Thread Id 20
clip to play 3.mp3 - Thread Id 13
```

## 13.6 Polymorphic event aggregator

In this section, assume that you need a tool for a program that must raise several events of different types in the system, with a publish-subscribe mechanism that can access these events.

### 13.6.1 Solution: implementing a polymorphic publisher-subscriber pattern

Figure 13.9 illustrates how to manage events of different types. Listing 13.8 shows the `EventAggregator` implementation using Rx.



Figure 13.9 The `EventAggregator` manages events of different types. When the events are published, the `EventAggregator` matches and notifies the subscribers of events of the same type.

Listing 13.8 `EventAggregator` using Rx

```
type IEventAggregator =                                                ①
    inherit IDisposable                                                ①
    abstract GetEvent<'Event> : unit -> IObservable<'Event>
    abstract Publish<'Event> : eventToPublish:'Event -> unit

type internal EventAggregator() =
    let disposedErrorMessage = "The EventAggregator is already disposed."
    let subject = new Subject<obj>()                                   ②
    interface IEventAggregator with                                    ①
        member this.GetEvent<'Event>() : IObservable<'Event> =         ③
            if subject.IsDisposed then failwith disposedErrorMessage
            subject.OfType<'Event>().AsObservable<'Event>()            ③
                .SubscribeOn(TaskPoolScheduler.Default)                ④
        member this.Publish(eventToPublish: 'Event) : unit =           ⑤
            if subject.IsDisposed then failwith disposedErrorMessage
            subject.OnNext(eventToPublish)                             ⑤
        member this.Dispose() : unit = subject.Dispose()               ①
    static member Create() = new EventAggregator() :> IEventAggregator
```

The interface `IEventAggregator` helps to loosely couple the `EventAggregator` implementation. This means that the consuming code won't need to change (as long as the interface doesn't change), even if the inner workings of the class change. Notice that `IEventAggregator` inherits from `IDisposable` to clean up any resources that were allocated when an instance of `EventAggregator` was created. The methods `GetEvent` and `Publish` encapsulate an instance of the Rx `Subject` type, which behaves as a hub for events. `GetEvent` exposes an `IObservable` from the subject instance to allow a simple way to handle event subscriptions. By default, the Rx `Subject` type is single threaded, so you use the `SubscribeOn` extension method to ensure that `EventAggregator` runs concurrently and exploits the `TaskPoolScheduler`. The method `Publish` notifies all the subscribers to the `EventAggregator` concurrently.
The static member `Create` generates an instance of `EventAggregator` and exposes only the single interface `IEventAggregator`. The following code example shows how to subscribe to and publish events using the `EventAggregator`, and the output of running the program:

```
let evtAggregator = EventAggregator.Create()

type IncrementEvent = { Value: int }
type ResetEvent = { ResetTime: DateTime }

evtAggregator
    .GetEvent<ResetEvent>()
    .ObserveOn(Scheduler.CurrentThread)
    .Subscribe(fun evt ->
        printfn "Counter Reset at: %A - Thread Id %d"
            evt.ResetTime Thread.CurrentThread.ManagedThreadId)

evtAggregator
    .GetEvent<IncrementEvent>()
    .ObserveOn(Scheduler.CurrentThread)
    .Subscribe(fun evt ->
        printfn "Counter Incremented. Value: %d - Thread Id %d"
            evt.Value Thread.CurrentThread.ManagedThreadId)

for i in [0..10] do
    evtAggregator.Publish({ Value = i })
evtAggregator.Publish({ ResetTime = DateTime(2015, 10, 21) })
```

Here's the output:

```
Counter Incremented. Value: 0 - Thread Id 1
Counter Incremented. Value: 1 - Thread Id 1
Counter Incremented. Value: 2 - Thread Id 1
Counter Incremented. Value: 3 - Thread Id 1
Counter Incremented. Value: 4 - Thread Id 1
Counter Incremented. Value: 5 - Thread Id 1
Counter Incremented. Value: 6 - Thread Id 1
Counter Incremented. Value: 7 - Thread Id 1
Counter Incremented. Value: 8 - Thread Id 1
Counter Incremented. Value: 9 - Thread Id 1
Counter Incremented. Value: 10 - Thread Id 1
Counter Reset at: 10/21/2015 00:00:00 AM - Thread Id 1
```

The interesting idea of the `EventAggregator` is how it handles events of different types. In the example, the `EventAggregator` instance registers two different event types (`IncrementEvent` and `ResetEvent`), and the `Subscribe` function sends the notification by targeting only the subscribers for a specific event type.

## 13.7 Custom Rx scheduler to control the degree of parallelism

Let's imagine you need to implement a system for querying large volumes of event streams asynchronously, and it requires a level of concurrency control. A valid solution for composing asynchronous and event-based programs is Rx, which is based on observables to generate sequence data concurrently. But as discussed in chapter 6, Rx isn't multithreaded by default. To enable a concurrency model, it's necessary to configure Rx to use a scheduler that supports multithreading by invoking the `SubscribeOn` extension. For example, Rx provides a few scheduler options, including the `TaskPool` and `ThreadPool` types, which schedule all the actions to take place, potentially using a different thread.

But there's a problem, because both schedulers start with one thread by default and then have a time delay of about 500 ms before they'll increase the number of threads on demand. This behavior can have performance-critical consequences. For example, consider a computer with four cores where eight actions are scheduled. The Rx thread pool, by default, starts with one thread. If each action takes 2,000 ms, then three actions are queued up, waiting 500 ms before the Rx scheduler thread pool's size is increased. Consequently, instead of executing four actions in parallel right away, which would take 4 seconds in total for all eight actions, the work isn't completed for 5.5 seconds, because three of the tasks sit idle in the queue for 500 ms. Fortunately, the cost of expanding the thread pool is only a one-time penalty. In this case, you need a custom Rx scheduler that supports concurrency with fine control over the level of parallelism.
It should initialize the internal thread pool at startup time rather than on demand, to avoid that cost during time-critical computation. If you enable concurrency in Rx using one of the available schedulers, there's no option to configure the maximum degree of parallelism. This is a limitation, because in certain circumstances you want only a few threads concurrently processing the event stream.

### 13.7.1 Solution: implementing a scheduler with multiple concurrent agents

The Rx `SubscribeOn` extension method requires passing as an argument an object that implements the `IScheduler` interface. The interface defines the methods responsible for scheduling the action to be performed, either as soon as possible or at a point in the future. You can build a custom scheduler for Rx that supports the concurrency model with the option of configuring the degree of parallelism, shown in figure 13.10.



Figure 13.10 `ParallelAgentScheduler` is a custom scheduler that aims to tailor the concurrent behavior of Rx. The Rx scheduler uses an agent to coordinate and manage the parallelism. This is achieved by a pool of agent workers that push notifications to subscribers.

The following listing shows the implementation of the `ParallelAgentScheduler` scheduler for Rx, which uses the `parallelWorker` agent (shown in listing 11.5) to manage the degree of parallelism.

Listing 13.9 Rx custom scheduler for managing the degree of parallelism

```
type ScheduleMsg = ScheduleRequest * AsyncReplyChannel<IDisposable>    ①

let schedulerAgent (inbox: MailboxProcessor<ScheduleMsg>) =            ②
    let rec execute (queue: IPriorityQueue<ScheduleRequest>) = async {
        match queue |> PriorityQueue.tryPop with                       ③
        | None -> return! idle queue (-1)
        | Some(req, tail) ->
            let timeout = int <| (req.Due - DateTimeOffset.Now).TotalMilliseconds
            if timeout > 0 && (not req.IsCanceled) then
                return! idle queue timeout
            else
                if not req.IsCanceled then req.Action.Invoke()
                return! execute tail }
    and idle (queue: IPriorityQueue<_>) timeout = async {              ④
        let! msg = inbox.TryReceive(timeout)
        let queue =
            match msg with
            | None -> queue
            | Some(request, replyChannel) ->
                replyChannel.Reply(Disposable.Create(fun () ->         ⑤
                    request.IsCanceled <- true))
                queue |> PriorityQueue.insert request
        return! execute queue }
    idle (PriorityQueue.empty(false)) (-1)

type ParallelAgentScheduler(workers:int) =
    let agent = MailboxProcessor<ScheduleMsg>                          ⑥
                    .parallelWorker(workers, schedulerAgent)
    interface IScheduler with                                          ⑦
        member this.Schedule(state:'a, due:DateTimeOffset,
                             action:ScheduledAction<'a>) =
            agent.PostAndReply(fun repl ->                             ⑧
                let action () = action.Invoke(this :> IScheduler, state)
                let req = ScheduleRequest(due, Func<_>(action))
                req, repl)
        member this.Now = DateTimeOffset.Now
        member this.Schedule(state:'a, action) =
            let scheduler = this :> IScheduler
            let due = scheduler.Now
            scheduler.Schedule(state, due, action)
        member this.Schedule(state:'a, due:TimeSpan, action:ScheduledAction<'a>) =
            let scheduler = this :> IScheduler
            let due = scheduler.Now.Add(due)
            scheduler.Schedule(state, due, action)
```

`ParallelAgentScheduler` introduces a level of concurrency to schedule and perform the tasks pushed into a distributed pool of running agents (F# `MailboxProcessor`). Note that all actions sent to `ParallelAgentScheduler` can potentially run out of order. `ParallelAgentScheduler` can be used as an Rx scheduler by injecting a new instance into the `SubscribeOn` extension method.
The following code snippet is a simple example of using this custom scheduler:

```
let scheduler = ParallelAgentScheduler(4)
Observable.Interval(TimeSpan.FromSeconds(0.4))
    .SubscribeOn(scheduler)
    .Subscribe(fun _ ->
        printfn "ThreadId: %A" Thread.CurrentThread.ManagedThreadId)
```

The `scheduler` instance of the `ParallelAgentScheduler` object is set to have four concurrent agents running and ready to react when a new notification is pushed. In the example, the observable operator `Interval` sends a notification every 0.4 seconds, which is handled concurrently by the underlying agents of the `parallelWorker`. The benefits of using this custom `ParallelAgentScheduler` scheduler are that there's no downtime or delay in creating new threads, and that it provides fine control over the degree of parallelism. There are times, for example, when you'll want to limit the level of parallelism for analyzing an event stream, such as when events waiting to be processed are buffered in the internal queue of the underlying agents and consequently aren't lost.

## 13.8 Concurrent reactive scalable client/server

The challenge: you need to create a server that listens asynchronously on a given port for incoming requests from multiple TCP clients. Additionally, you want the server to be

* Reactive
* Able to manage a large number of concurrent connections
* Scalable
* Responsive
* Event driven

These requirements ensure that you can use functional high-order operations to compose the event-stream operations over the TCP socket connections in a declarative and non-blocking way. Next, the client requests need to be processed concurrently by the server, with the resulting responses sent back to the client. The Transmission Control Protocol (TCP) server connection can be either secured or unsecured. TCP is the most-used protocol on the internet today, used to provide accurate delivery that preserves the order of data packets from one endpoint to another. TCP can detect when packets are wrong or missing, and it manages the actions necessary for resending them. Connectivity is ultra-important in applications, and the .NET Framework provides a variety of different ways to help you support that need.

You also need a long-running client program that uses TCP sockets to connect to the server. After the connection is established, both the client and server endpoints can send and receive bytes asynchronously, and sometimes close the connection properly and reopen it at a later time. The client program that attempts to connect to the TCP server is asynchronous, non-blocking, and capable of maintaining the application's responsiveness, even under pressure (from sustaining a large number of data transfers).

For this example, the client/server socket-based application continually transfers volumes of packets at a high rate of speed as soon as the connection is established. The data is transmitted from the server to the client as a stream of chunks, where each chunk represents the historical stock prices on a particular date. This stream of data is generated by reading and parsing the comma-separated values (CSV) files in the solution. When the client receives the data, it begins to update a chart in real time. This scenario is applicable to any operations that use reactive programming based on streams. Examples you may encounter are remote binary listeners, socket programming, and any other unpredictable event-oriented application, such as when a video needs to be streamed across the network.
### 13.8.1 Solution: combining Rx and asynchronous programming

To build the client/server program shown in listing 13.10, the CLR `TcpListener` and `TcpClient` classes provide a convenient model for creating a socket server with a few lines of code. Used in combination with TAP and Rx, they increase the level of scalability and reliability of the program. But to work in the reactive style, the traditional application design must change. Specifically, to achieve the requirements of a high-performing TCP client/server program, you need to implement the TCP sockets in an asynchronous style. For this reason, consider using a combination of Rx and TAP. Reactive programming, in particular, fits this scenario because it can deal with source events from any stream regardless of its type (network, file, memory, and so on). Here's the Rx definition from Microsoft:

> *The Reactive Extensions (Rx) is a library for composing asynchronous and event-based programs using observable sequences and LINQ-style query operators, and parameterize the concurrency in the asynchronous data streams using Schedulers.*

To implement the server in a scalable way, the instance of the `TcpListener` class listens for incoming connections. When a connection is established, it's routed, as a `TcpClient`, from the listener handler to manage the `NetworkStream`. This stream is then used for reading and writing bytes for data sharing between client and server. Figure 13.11 shows the connection logic of the server program.



Figure 13.11 The `TcpListener` server accepts client connections asynchronously in a loop. When a connection is established, the event that carries the client stream is pushed through the `Observable` pipeline to be processed. Next, the connection handlers start reading the stock ticker symbol histories, serialize the data, and write it to the client `NetworkStream`.

Listing 13.10 Reactive `TcpListener` server program

```
static void ConnectServer(int port, string sslName = null)
{
    var cts = new CancellationTokenSource();
    string[] stockFiles = new string[] { "aapl.csv", "amzn.csv", "fb.csv",
                                         "goog.csv", "msft.csv" };      ①
    var formatter = new BinaryFormatter();                              ②

    TcpListener.Create(port)
        .ToAcceptTcpClientObservable()                                  ③
        .ObserveOn(TaskPoolScheduler.Default)                           ④
        .Subscribe(client =>
        {
            using (var stream = GetServerStream(client, sslName))       ⑤
            {
                stockFiles
                    .ObservableStreams(StockData.Parse)                 ⑥
                    .Subscribe(async stock =>
                    {
                        var data = Serialize(formatter, stock);         ⑦
                        await stream.WriteAsync(data, 0, data.Length, cts.Token); ⑦
                    });
            }
        },
        error => Console.WriteLine("Error: " + error.Message),          ⑧
        () => Console.WriteLine("OnCompleted"),                         ⑧
        cts.Token);
}
```

In the example, the server shows the implementation of a reactive TCP listener that acts as an observable of the stock ticker. The natural approach for a listener is to subscribe to an endpoint and receive clients as they connect. This is achieved by the extension method `ToAcceptTcpClientObservable`, which produces an `IObservable<TcpClient>`. The `ConnectServer` method uses the `TcpListener.Create` construct to generate a `TcpListener` on a given port number, on which the server listens asynchronously, plus an optional Secure Sockets Layer (SSL) name to establish a secure or regular connection. The custom observable extension method `ToAcceptTcpClientObservable` uses the given `TcpListener` instance to provide mid-level network services over an underlying socket object.
When a remote client becomes available and a connection is established, a `TcpClient` object is created to handle the new communication, which is then sent to a different long-running thread with the use of a `Task` object. Next, to guarantee the concurrent behavior of the socket handler, the scheduler is configured using the `ObserveOn` operator to subscribe and move the work to another scheduler, the `TaskPoolScheduler`. In this way, the `ToAcceptTcpClientObservable` operator can orchestrate a large number of `TcpClient`s concurrently as a sequence. Then, the internals of the observable `ToAcceptTcpClientObservable` fetch the `TcpClient` reference from the task and create the network stream used as a channel to send the packets of data generated by the `ObservableStreams` custom observable operator.

The `GetServerStream` method retrieves either a secure or a regular stream, according to the `sslName` value passed. This method determines whether the `sslName` value for an SSL connection has been set and, if so, creates an `SslStream` using `TcpClient.GetStream` and the configured server name to get the server certificate. Alternatively, if SSL isn't used, `GetServerStream` gets the `NetworkStream` from the client using the `TcpClient.GetStream` method. You can find the `GetServerStream` method in the source code.

When `ObservableStreams` materializes, the event stream that's generated flows into the `Subscribe` operator. The operator then asynchronously serializes the incoming data into chunks of byte arrays that are sent across the network through the client stream. For simplicity, the serializer is the .NET binary formatter, but you can replace it with one that better fits your needs. The data is sent across the network in the form of byte arrays, because it's the only reusable data message type that can contain any shape of object.

This listing shows the implementation of the core observable operator `ToAcceptTcpClientObservable`, used by the underlying `TcpListener` to listen for remote connections and react accordingly.

Listing 13.11 Asynchronous and reactive `ToAcceptTcpClientObservable`

```
static IObservable<TcpClient> ToAcceptTcpClientObservable(
    this TcpListener listener, int backlog = 5)
{
    listener.Start(backlog);                                            ①
    return Observable.Create<TcpClient>(async (observer, token) =>      ②
    {
        try
        {
            while (!token.IsCancellationRequested)                      ③
            {
                var client = await listener.AcceptTcpClientAsync();     ④
                Task.Factory.StartNew(_ => observer.OnNext(client),
                    token, TaskCreationOptions.LongRunning);            ⑤
            }
            observer.OnCompleted();
        }
        catch (OperationCanceledException) { observer.OnCompleted(); }  ⑥
        catch (Exception error) { observer.OnError(error); }            ⑥
        finally { listener.Stop(); }
        return Disposable.Create(() =>                                  ⑦
        {
            listener.Stop();
            listener.Server.Dispose();
        });
    });
}
```

`ToAcceptTcpClientObservable` takes an instance of `TcpListener`, which starts listening asynchronously for new incoming connection requests in a `while` loop, until the operation is canceled using a cancellation token. When a client successfully connects, a `TcpClient` reference flows out as a message within a sequence. This message executes in an asynchronous `Task` to service the client/server interaction, letting multiple clients connect concurrently to the same listener. Once a connection is accepted, another `Task` starts, repeating the procedure of listening for new connection requests.
Ultimately, when the observable is disposed, or the cancellation token requests a cancellation, the function passed into the `Disposable.Create` operator is triggered to stop and close the underlying server listener.

The data transferred is generated through the `ObservableStreams` extension method, which reads and parses a set of CSV files to extract the historical stock prices. This data is then pushed to the clients connected through the `NetworkStream`. This listing shows the implementation of `ObservableStreams`.

Listing 13.12 Custom `Observable` stream reader and parser

```
static IObservable<StockData> ObservableStreams(
    this IEnumerable<string> filePaths,
    Func<string, string, StockData> map, int delay = 50)                ①
{
    return filePaths
        .Select(key =>
            new FileLinesStream<StockData>(key, row => map(key, row)))  ②
        .Select(fsStock =>
        {
            var startData = new DateTime(2001, 1, 1);
            return Observable.Interval(TimeSpan.FromMilliseconds(delay)) ③
                .Zip(fsStock.ObserveLines(), (tick, stock) =>            ④
                {
                    stock.Date = startData + TimeSpan.FromDays(tick);
                    return stock;
                });
        })
        .Aggregate((o1, o2) => o1.Merge(o2));                            ⑤
}
```

`ObservableStreams` generates a series of observables of the `StockData` type, one for each of the `filePaths` passed. The class `FileLinesStream`, whose implementation is omitted for simplicity, opens the `FileStream` of a given file path. It then reads the text content from the stream as an observable and applies a projection to transform each line of text read into a `StockData` type. Ultimately, it pushes the results out as an observable. The most interesting part of the code is the application of the two `Observable` operators `Interval` and `Zip`, which are used together to apply an arbitrary delay, if specified, between messages. The `Zip` operator combines an element from each sequence in turn, which means that each `StockData` entry is paired with an element produced at every interval tick. In this case, the combination of a `StockData` with the interval tick ensures a delay for each notification. Ultimately, the combination of the `Aggregate` and `Merge` operators is used to merge the observables generated from each file:

```
.Aggregate((o1, o2) => o1.Merge(o2));
```

Next, to complete the client/server program, you implement the reactive client class, shown in figure 13.12. Listing 13.13 shows the implementation of the client side.



Figure 13.12 `TcpClient` requests a connection to the `TcpListener` server. When the connection is established, it triggers an event that carries the client stream, which is pushed through the observable pipeline. Next, a `NetworkStream` is created to start reading the data asynchronously in a loop from the server. The data read is then deserialized and analyzed through the observable pipeline to ultimately update the live chart.
Listing 13.13 Reactive `TcpClient` program

```
var endpoint = new IPEndPoint(IPAddress.Parse("127.0.0.1"), 8080);
var cts = new CancellationTokenSource();
var formatter = new BinaryFormatter();

endpoint.ToConnectClientObservable()                                    ①
    .Subscribe(client =>
    {
        GetClientStream(client, sslName)                                ②
            .ReadObservable(0x1000, cts.Token)                          ③
            .Select(rawData => Deserialize<StockData>(formatter, rawData))
            .GroupBy(item => item.Symbol)                               ④
            .SelectMany(group =>
                group.Throttle(TimeSpan.FromMilliseconds(20))           ⑤
                     .ObserveOn(TaskPoolScheduler.Default))             ⑥
            .ObserveOn(ctx)
            .Subscribe(stock =>                                         ⑦
                UpdateChart(chart, stock, sw.ElapsedMilliseconds));
    },
    error => Console.WriteLine("Error: " + error.Message),
    () => Console.WriteLine("OnCompleted"),
    cts.Token);
```

The code starts with an `IPEndPoint` instance, which targets the remote server endpoint to connect to. The observable operator `ToConnectClientObservable` creates an instance of a `TcpClient` object to initiate the connection. Now you can use the `Observable` operators to subscribe to the remote client connection. When the connection with the server is established, the `TcpClient` instance is passed as an observable to begin receiving the stream of data to process. In this implementation, the remote `NetworkStream` is accessed by calling the `GetClientStream` method. The stream of data flows into the observable pipeline through the `ReadObservable` operator, which routes the incoming messages from the underlying `TcpClient` sequence into another observable sequence of `ArraySegment` bytes.

As part of the stream-processing code, after the chunks of `rawData` received from the server are converted into `StockData`, the `GroupBy` operator partitions the stock tickers by symbol into multiple observables. At this point, each observable can have its own unique operations. Grouping allows throttling to act independently on each stock symbol, and only stocks with identical symbols will be filtered within the given throttle time span. A common problem when writing reactive code is events that arrive too quickly: a fast-moving stream of events can overwhelm your program's processing. In listing 13.13, because you have a number of UI updates, using the throttling operator helps deal with a massive flood of stream data without overwhelming the live updates. The operator after the throttling, `ObserveOn(TaskPoolScheduler.Default)`, starts a new thread for each partition originated by the `GroupBy`. The `Subscribe` method ultimately updates the live chart with the stock values.

Here's the implementation of the `ToConnectClientObservable` operator.

Listing 13.14 Custom observable `ToConnectClientObservable` operator

```
static IObservable<TcpClient> ToConnectClientObservable(
    this IPEndPoint endpoint)
{
    return Observable.Create<TcpClient>(async (observer, token) =>      ①
    {
        var client = new TcpClient();
        try
        {
            await client.ConnectAsync(endpoint.Address, endpoint.Port); ②
            token.ThrowIfCancellationRequested();                       ③
            observer.OnNext(client);                                    ④
        }
        catch (Exception error)
        {
            observer.OnError(error);
        }
        return Disposable.Create(() => client.Dispose());               ⑤
    });
}
```

`ToConnectClientObservable` creates an instance of `TcpClient` from the given `IPEndPoint` endpoint, and then it tries to connect asynchronously to the remote server. When the connection is established successfully, the `TcpClient` client reference is pushed out through the observer.
The last phase of the code to program is the `ReadObservable` observable operator, which is built to asynchronously and continuously read chunks of data from a stream. In this program, the stream is the `NetworkStream` produced as a result of the connection between the server and the client.

Listing 13.15 Observable stream reader

```
public static IObservable<ArraySegment<byte>> ReadObservable(
    this Stream stream, int bufferSize,
    CancellationToken token = default(CancellationToken))
{
    var buffer = new byte[bufferSize];
    var asyncRead = Observable.FromAsync<int>(async ct =>               ①
    {
        await stream.ReadAsync(buffer, 0, sizeof(int), ct);             ②
        var size = BitConverter.ToInt32(buffer, 0);                     ②
        await stream.ReadAsync(buffer, 0, size, ct);                    ②
        return size;
    });

    return Observable.While(                                            ③
            () => !token.IsCancellationRequested && stream.CanRead,
            Observable.Defer(() =>                                      ④
                !token.IsCancellationRequested && stream.CanRead
                    ? asyncRead
                    : Observable.Empty<int>())
            .Catch((Func<Exception, IObservable<int>>)(ex =>
                Observable.Empty<int>()))                                ⑤
            .TakeWhile(returnBuffer => returnBuffer > 0)
            .Select(readBytes =>
                new ArraySegment<byte>(buffer, 0, readBytes)))           ⑥
        .Finally(stream.Dispose);
}
```

One important note to consider when implementing `ReadObservable` is that the stream must be read in chunks to be reactive. That's why the `ReadObservable` operator takes a buffer size as an argument to define the size of the chunks. The purpose of the `ReadObservable` operator is to read a stream in chunks to facilitate working with data that's larger than the available memory, or that could be infinite with an unknown size, like streaming from the network. In addition, it promotes the compositional nature of Rx for applying multiple transformations to the stream itself, because reading chunks at a time allows data transformations while the stream is still in motion. At this point, you have an extension method that iterates over the bytes from a stream.

In the code, the `FromAsync` extension method allows you to convert a `Task<T>`, in this case `stream.ReadAsync`, into an `IObservable<T>` to treat the data as a flow of events and to enable programming with Rx. Underneath, `Observable.FromAsync` creates an observable that starts the operation independently every time it's subscribed to. Then the underlying stream is read in an `Observable` `While` loop until data is available or the operation is canceled. The `Observable` `Defer` operator waits until an observer subscribes to it, and then starts pushing the data as a stream. Next, during each iteration, a chunk of data is read from the stream. This data is then pushed into a buffer that takes the form of an `ArraySegment<byte>`, which slices the payload to the right length. `ReadObservable` returns an `IObservable` of `ArraySegment<byte>`, which is an efficient way to manage the byte arrays in a pool. The buffer size may be larger than the payload of bytes received, for example, so the use of `ArraySegment<byte>` holds the byte array and the payload length.

In conclusion, when receiving and processing data, .NET Rx allows shorter and cleaner code than traditional solutions. Furthermore, the complexity of building a TCP-based reactive client/server program is heavily reduced in comparison to a traditional model. In fact, you don't have to deal with low-level `TcpClient` and `TcpListener` objects, and the flow of bytes is handled through a high-level abstraction offered by observable operators.
## 13.9 Reusable custom high-performing parallel filter-map operator

You have a collection of data, and you need to perform the same operation on each element of the data that satisfies a given condition. This operation is CPU-bound and may take time. You decide to create a custom, reusable, high-performing operator to filter and map the elements of a given collection.

The combination of filtering and transforming the elements of a collection is a common operation for analyzing data structures. It's possible to achieve a solution using LINQ or PLINQ in parallel with the `Where` and `Select` operators, but a more optimal performance solution is available. As you saw in section 5.2.1, each call and repeated use of high-order operators such as map (`Select`), filter (`Where`), and other similar functions of a PLINQ (or LINQ) query, as shown in figure 13.13, generates intermediate sequences that unnecessarily increase memory allocation. This is due to the intrinsic functional nature of LINQ and PLINQ, where collections are transformed instead of mutated. When transforming large sequences, the penalty paid to the GC to free up memory becomes increasingly higher, with negative consequences for the performance of the program.



Figure 13.13 In this diagram, each number (first column) is first filtered by `IsPrime` (second column) to verify whether it's a prime number. Then the prime numbers are passed into the `ToPow` function (third column). For example, the first value, number 1, isn't a prime number, so the `ToPow` function doesn't run. In this example, you want to derive the sum of the squares of all the prime numbers among 100 million numbers.

### 13.9.1 Solution: combining filter and map parallel operations

The implementation of a custom, parallel filter-and-map operator with top performance requires attention to minimizing (or eliminating) unnecessary temporary data allocation, as shown in figure 13.14. This technique of reducing data allocation during data manipulation to increase the performance of the program is known as *deforestation*.



Figure 13.14 The left graph shows the operations `Where` and `Select` over a given source, done in separate steps, which introduces extra memory allocation and consequently more GC generations. The right graph shows that applying the `Where` and `Select` (filter and map) operations together in a single step avoids the extra allocation and reduces GC generations, increasing the speed of the program.

The next listing shows the code of the `ParallelFilterMap` function, which uses the `Parallel.ForEach` loop to eliminate intermediate data allocations by processing only one array, instead of creating one temporary collection for each operator.

Listing 13.16 `ParallelFilterMap` operator

```
static TOutput[] ParallelFilterMap<TInput, TOutput>(this IList<TInput> input,
    Func<TInput, Boolean> predicate,                                   ①
    Func<TInput, TOutput> transform, ParallelOptions parallelOptions = null)
{
    parallelOptions = parallelOptions ??
new ParallelOptions(); var atomResult = new Atom<ImmutableList<List<TOutput>>> ② (ImmutableList<List<TOutput>>.Empty); Parallel.ForEach(Partitioner.Create(0, input.Count), parallelOptions, () => new List<TOutput>(), ③ delegate (Tuple<int, int> range, ParallelLoopState state, List<TOutput> localList) { for (int j = range.Item1; j < range.Item2; j++) ④ { var item = input[j]; if (predicate(item)) ⑤ localList.Add(transform(item)); ⑤ } return localList; }, localList => atomResult.Swap(r => r.Add(localList))); ⑥ return atomResult.Value.SelectMany(id => id).ToArray(); ⑦ } ``` The parallel `ForEach` loop applies the `predicate` and `map` functions for each element of the input collection. In general, if the body of the parallel loop performs only a small amount of work, better performance results come from partitioning the iterations into larger units of work. The reason for this is the overhead when processing a loop, which involves the cost of managing worker threads and the cost of invoking a delegate method. Consequently, it’s good practice to partition the parallel iteration space by a certain constant using the `Partitioner`.`Create` constructor. Then each body invokes the filter and map functions for a certain range of elements, amortizing invocations of the loop body delegate. For each iteration of the `ForEach` loop, there’s an anonymous delegate invocation that causes a penalty in terms of memory allocation and, consequently, performance. One invocation occurs for the filter function, a second invocation occurs for the map function, and ultimately an invocation happens for the delegate passed into the parallel loop. The solution is to tailor the parallel loop specific to the filter and map operations to avoid extra invocations of the body delegate. The parallel `ForEach` operator forks off a set of threads, each of which calculates an intermediate result by performing the filter and map functions over its own partition of data and placing the value into its dedicated slot in the intermediate array. Each thread (task) governed by the parallel loop captures an isolated instance of a local `List<TOutput>` through the concept of local values. Local values are variables that exist locally within a parallel loop. The body of the loop can access the value directly, without having to worry about synchronization. Each partition will compute its own intermediate value that will then combine into a single final value. When the loop completes, and it’s ready to aggregate each of its local results, it does so with the `localFinally` delegate. But the delegate requires synchronization accessto the variable that holds the final result. An instance of the `ImmutableList` collection is used to overcome this limitation to merge the final results in a thread-safe manner. Note the `ImmutableList` is encapsulated in an `Atom` object, from chapter 3\. The `Atom` object uses a compare-and-swap (CAS) strategy to apply thread-safe writes and updates of objects without the need of locks and other forms of primitive synchronization. In this example, the `Atom` class holds a reference to the immutable list and updates it automatically. 
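As a refresher, a minimal version of the `Atom` class from chapter 3 might be sketched as follows (this sketch covers only the `Value` and `Swap` members used here; the book's implementation may differ in detail):

```
using System;
using System.Threading;

public sealed class Atom<T> where T : class
{
    private T value;

    public Atom(T value)
    {
        this.value = value;
    }

    public T Value => Volatile.Read(ref value);

    // Applies the update function in a CAS loop, retrying until no
    // other thread has changed the value in the meantime.
    public T Swap(Func<T, T> update)
    {
        T original, updated;
        do
        {
            original = Volatile.Read(ref value);
            updated = update(original);
        }
        while (Interlocked.CompareExchange(ref value, updated, original)
               != original);
        return updated;
    }
}
```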
The following code snippet tests the parallel sum of the squares of the prime numbers among the first 100 million numbers:

```
bool IsPrime(int n)
{
    if (n == 1) return false;
    if (n == 2) return true;
    var boundary = (int)Math.Floor(Math.Sqrt(n));
    for (int i = 2; i <= boundary; ++i)
        if (n % i == 0) return false;
    return true;
}

BigInteger ToPow(int n) => (BigInteger)Math.BigMul(n, n);

var nums = Enumerable.Range(0, 100000000).ToList();

BigInteger SeqOperation() =>
    nums.Where(IsPrime).Select(ToPow).Aggregate(BigInteger.Add);

BigInteger ParallelLinqOperation() =>
    nums.AsParallel().Where(IsPrime).Select(ToPow).Aggregate(BigInteger.Add);

BigInteger ParallelFilterMapInline() =>
    nums.ParallelFilterMap(IsPrime, ToPow).Aggregate(BigInteger.Add);
```

Figure 13.15 compares the sequential code (as baseline), the PLINQ version, and the custom `ParallelFilterMap` operator. The figure shows the result of the benchmark code running the sum over the first 100 million numbers. The benchmark was performed on a quad-core machine with 6 GB of RAM. The sequential code takes an average of 196.482 seconds to run and is used as the baseline. The PLINQ version of the code is faster and runs in 74.926 seconds, almost three times faster, which is expected on a quad-core computer. The custom `ParallelFilterMap` operator is the fastest, at approximately 52.566 seconds.

Figure 13.15 Benchmark chart comparing the sequential and Parallel LINQ versions of the code with the custom `ParallelFilterMap` operator. On a quad-core machine, the custom `ParallelFilterMap` operator is approximately 80% faster than the sequential version of the code, and 30% faster than the PLINQ version.

## 13.10 Non-blocking synchronous message-passing model

Let's imagine you need to build a scalable program capable of handling a large number of operations without blocking any threads. You need a program that loads, processes, and saves a large number of images, for example. These operations are handled with a few threads in a collaborative way, which optimizes resources without blocking any threads and without jeopardizing the performance of the program.

Similar to the Producer/Consumer pattern, there are two flows of data. One flow is the input, where the processing starts, followed by intermediate steps to transform the data, followed by the output with the final result of the operations. These processes, the producer and the consumer, share a common fixed-size buffer used as a queue. The queue is buffered to increase overall speed and throughput by allowing multiple consumers and producers. In fact, when the queue is safe to use by multiple consumers and producers, it's easy to change the level of concurrency for different parts of the pipeline at runtime. The producer can write into the queue as long as it isn't full, and it blocks when the queue is full. Conversely, the consumer can read from the queue as long as it isn't empty, and it blocks when the queue is empty. You want to implement a Producer/Consumer pattern based on message passing to avoid thread blocking and maximize the application's scalability.

### 13.10.1 Solution: coordinating the payload between operations using the agent programming model

There are two flavors of message-passing models for concurrent systems: synchronous and asynchronous. You're already familiar with asynchronous models such as the agent (and actor) model, explained in chapters 11 and 12, which are based on asynchronous message passing.
In this recipe, you'll use the synchronous version of message passing, also known as communicating sequential processes (CSP). CSP has much in common with the actor model, both being based on message passing. But CSP emphasizes the channels used for communication, rather than the entities between which communication takes place. In CSP, synchronous message passing is used for data exchange between channels, whose work can be scheduled across multiple threads and may run in parallel. Channels are similar to thread workers that communicate directly with each other by publishing messages, where other channels can listen for these messages without the sender knowing who's listening. You can imagine the channel as a thread-safe queue, where any task with a reference to the channel can add messages to one end, and any task with a reference to it can remove messages from the other end. Figure 13.16 illustrates the channel model.

Figure 13.16 The channel receives (`Recv`) a message and applies the subscribed behavior. The channels communicate by sending (`Send`) messages, often creating an interconnected system that's similar to the actor model.

Each channel contains a local queue of messages used to synchronize the communication with other channels without blocking. A channel doesn't need to know which channel will process a message later in the pipeline. It only has to know which channel to forward the messages to. On the other side, listeners on channels can subscribe and unsubscribe without affecting the channels sending the messages. This design promotes loose coupling between channels. The primary strength of CSP is its flexibility: channels are first-class and can be independently created, written to, read from, and passed between tasks.

The following listing shows the implementation of the channel in F#, which uses `MailboxProcessor` for the underlying message synchronization due to its close similarity to the agent-programming model. The same concepts apply to C#. You can find the full implementation in C# using TDF in the downloadable source code.

Listing 13.17 `ChannelAgent` for CSP implementation using `MailboxProcessor`

```
type internal ChannelMsg<'a> =                                ①
    | Recv of ('a -> unit) * AsyncReplyChannel<unit>
    | Send of 'a * (unit -> unit) * AsyncReplyChannel<unit>

type [<Sealed>] ChannelAgent<'a>() =
    let agent = MailboxProcessor<ChannelMsg<'a>>.Start(fun inbox ->
        let readers = Queue<'a -> unit>()                     ②
        let writers = Queue<'a * (unit -> unit)>()            ②
        let rec loop() = async {
            let! msg = inbox.Receive()
            match msg with
            | Recv(ok, reply) ->                              ③
                if writers.Count = 0 then
                    readers.Enqueue ok
                    reply.Reply( () )
                else
                    let (value, cont) = writers.Dequeue()
                    TaskPool.Spawn cont                       ④
                    reply.Reply( (ok value) )
                return! loop()
            | Send(x, ok, reply) ->                           ⑤
                if readers.Count = 0 then
                    writers.Enqueue(x, ok)
                    reply.Reply( () )
                else
                    let cont = readers.Dequeue()
                    TaskPool.Spawn ok                         ④
                    reply.Reply( (cont x) )
                return! loop() }
        loop())

    member this.Recv(ok: 'a -> unit) =                        ⑤
        agent.PostAndAsyncReply(fun ch -> Recv(ok, ch)) |> Async.Ignore

    member this.Send(value: 'a, ok: unit -> unit) =           ⑤
        agent.PostAndAsyncReply(fun ch -> Send(value, ok, ch)) |> Async.Ignore

    member this.Recv() =                                      ⑥
        Async.FromContinuations(fun (ok, _, _) ->
            agent.PostAndAsyncReply(fun ch -> Recv(ok, ch))
            |> Async.RunSynchronously)

    member this.Send(value: 'a) =                             ⑥
        Async.FromContinuations(fun (ok, _, _) ->
            agent.PostAndAsyncReply(fun ch -> Send(value, ok, ch))
            |> Async.RunSynchronously)

let run (action: Async<_>) = action |> Async.Ignore |> Async.Start   ⑦

let rec subscribe (chan: ChannelAgent<_>) (handler: 'a -> unit) =     ⑧
    chan.Recv(fun value ->
        handler value
        subscribe chan handler)
    |> run
```

The `ChannelMsg` DU represents the message type that `ChannelAgent` handles. When a message arrives, the `Recv` case is used to execute a behavior applied to the payload passed. The `Send` case is used to communicate a message to the channel. The underlying `MailboxProcessor` contains two generic queues, one for each operation, `Recv` or `Send`. As you can see, when a message is either received or sent, the behavior of the agent, in the `loop()` function, checks the count of pending messages to load balance and synchronize the communication without blocking any threads. `ChannelAgent` accepts continuation functions with its `Recv` and `Send` operations. If a match is available, the continuation is invoked immediately; otherwise, it's queued for later. Keep in mind that a synchronous channel eventually gives a result, so the call is logically blocking. But when using F# async workflows, no actual threads are blocked while waiting. The last two functions in the code help run a channel operation (usually `Send`), while the `subscribe` function is used to register and apply a handler to the messages received. This function runs recursively and asynchronously, waiting for messages from the channel.

The `TaskPool.Spawn` function has the signature `(unit -> unit) -> unit`; it forks the computation onto the pool's own scheduler. The next listing shows the implementation of `TaskPool`, which uses the concepts covered in chapter 7.

Listing 13.18 Dedicated `TaskPool` agent (`MailboxProcessor`)

```
type private Context = { cont: unit -> unit; context: ExecutionContext }  ①

type TaskPool private (numWorkers) =                          ②
    let worker (inbox: MailboxProcessor<Context>) =           ③
        let rec loop() = async {
            let! ctx = inbox.Receive()
            let ec = ctx.context.CreateCopy()                 ④
            ExecutionContext.Run(ec, (fun _ -> ctx.cont()), null)
            return! loop() }
        loop()

    let agent = MailboxProcessor<Context>.parallelWorker(numWorkers, worker)  ⑤

    static let self = TaskPool(2)

    member private this.Add(continuation: unit -> unit) =     ⑥
        let ctx = { cont = continuation
                    context = ExecutionContext.Capture() }    ⑦
        agent.Post(ctx)                                       ⑥

    static member Spawn(continuation: unit -> unit) = self.Add continuation
```

The `Context` record type is used to capture the `ExecutionContext` at the moment when the continuation function `cont` is passed to the pool. `TaskPool` initializes the `MailboxProcessor` `parallelWorker` type to handle multiple concurrent consumers and producers (refer to chapter 11 for the implementation and details of the `parallelWorker` agent). The purpose of `TaskPool` is to control how many tasks are scheduled and dedicated to running the continuation functions in a tight loop. In this example, it uses two workers, but you can configure any number.
`Add` enqueues the given continuation function, which will be executed when a thread on a channel offers communication and another thread offers a matching communication. Until such a match between channels is found, the caller waits asynchronously.

In this code snippet, the `ChannelAgent` implements a CSP pipeline, which loads an image, transforms it, and then saves the newly created image into the local `MyPictures` folder:

```
let rec subscribe (chan: ChannelAgent<_>) (handler: 'a -> unit) =
    chan.Recv(fun value ->
        handler value
        subscribe chan handler)
    |> run

let chanLoadImage = ChannelAgent<string>()
let chanApply3DEffect = ChannelAgent<ImageInfo>()
let chanSaveImage = ChannelAgent<ImageInfo>()

subscribe chanLoadImage (fun image ->
    let bitmap = new Bitmap(image)
    let imageInfo =
        { Path = Environment.GetFolderPath(
                     Environment.SpecialFolder.MyPictures)
          Name = Path.GetFileName(image)
          Image = bitmap }
    chanApply3DEffect.Send imageInfo |> run)

subscribe chanApply3DEffect (fun imageInfo ->
    let bitmap = convertImageTo3D imageInfo.Image
    let imageInfo = { imageInfo with Image = bitmap }
    chanSaveImage.Send imageInfo |> run)

subscribe chanSaveImage (fun imageInfo ->
    printfn "Saving image %s" imageInfo.Name
    let destination = Path.Combine(imageInfo.Path, imageInfo.Name)
    imageInfo.Image.Save(destination))

let loadImages() =
    let images = Directory.GetFiles(@".\Images")
    for image in images do
        chanLoadImage.Send image |> run

loadImages()
```

As you can see, implementing a CSP-based pipeline is simple. After you define the channels `chanLoadImage`, `chanApply3DEffect`, and `chanSaveImage`, you register the behaviors using the `subscribe` function. When a message is available to be processed, the behavior is applied.
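The downloadable source code contains the full C# counterpart built on TPL Dataflow (TDF). As a rough sketch of the idea (with simplified semantics and an assumed bounded capacity of 100 rather than a true CSP rendezvous), a `BufferBlock<T>` can play the role of the channel's internal queue:

```
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public class ChannelAgent<T>
{
    // The bounded capacity makes senders wait (asynchronously)
    // when the queue is full, like the producer described earlier.
    private readonly BufferBlock<T> buffer =
        new BufferBlock<T>(new DataflowBlockOptions { BoundedCapacity = 100 });

    // Completes when the message is accepted; no thread is blocked
    // while the queue is full.
    public Task Send(T value) => buffer.SendAsync(value);

    // Completes when a message becomes available.
    public Task<T> Recv() => buffer.ReceiveAsync();

    // Registers a handler applied to every message flowing through
    // the channel, mirroring the F# subscribe function.
    public IDisposable Subscribe(Action<T> handler) =>
        buffer.LinkTo(new ActionBlock<T>(handler));
}
```

With this shape, the three image-processing channels above translate directly: each `Subscribe` handler does its work and then calls `Send` on the next channel.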
## 13.11 Coordinating concurrent jobs using the agent programming model

The concepts of parallelism and asynchronicity were covered extensively earlier in this book. Chapter 9 shows how powerful and convenient the `Async.Parallel` operator is for running a large number of asynchronous operations in parallel. Often, however, you may need to map across a sequence of asynchronous operations and run functions on the elements in parallel. In this case, a feasible solution can be implemented:

```
let inline asyncFor (operations: #seq<'a> Async, map: 'a -> 'b) =
    Async.map (Seq.map map) operations
```

Now, how would you limit and tame the degree of parallelism used to process the elements, to balance resource consumption? This issue comes up surprisingly often when a program is doing CPU-heavy operations, and there's no reason to run more threads than the number of processors on the machine. When too many concurrent threads are running, contention and context switching make the program enormously inefficient, even for a few hundred tasks. This is a problem of throttling. How can you throttle asynchronous and CPU-bound computations, awaiting results without blocking? The challenge becomes even more difficult because these asynchronous operations are spawned at runtime, which makes the total number of asynchronous jobs to run unknown.

### 13.11.1 Solution: implementing an agent that runs jobs with a configured degree of parallelism

The solution is using an agent model to implement a job coordinator that lets you throttle the degree of parallelism by limiting the number of tasks that are processed in parallel, as shown in figure 13.17. In this case, the agent's only mission is to gate the number of concurrent tasks and send back the result of each operation without blocking. In addition, the agent conveniently exposes an observable channel where you can register to receive notifications when a new result is computed.

Figure 13.17 The `TamingAgent` runs the jobs in parallel, limited by the configured degree of parallelism. When an operation completes, the `Subscribe` operator notifies the registered handlers with the output of the job.

Let's define the agent that can tame the concurrent operations. The agent must receive a message, but it must also send back to the caller, or subscriber, a response with the computed result. In the following listing, the implementation of the `TamingAgent` runs asynchronous operations, efficiently throttling the degree of parallelism. When the number of concurrent operations exceeds this degree, they're queued and processed later.

Listing 13.19 `TamingAgent`

```
type JobRequest<'T, 'R> =                                     ①
    | Ask of 'T * AsyncReplyChannel<'R>
    | Completed
    | Quit

type TamingAgent<'T, 'R>(limit, operation: 'T -> Async<'R>) =
    let jobCompleted = new Event<'R>()                        ②

    let tamingAgent = Agent<JobRequest<'T, 'R>>.Start(fun agent ->
        let dispose() = (agent :> IDisposable).Dispose()      ③
        let rec running jobCount = async {                    ④
            let! msg = agent.Receive()
            match msg with
            | Quit -> dispose()
            | Completed -> return! running (jobCount - 1)     ⑤
            | Ask(job, reply) ->                              ⑥
                do! async {
                        try
                            let! result = operation job      ⑦
                            jobCompleted.Trigger result       ⑧
                            reply.Reply(result)               ⑨
                        finally agent.Post(Completed) }       ⑩
                    |> Async.StartChild |> Async.Ignore       ⑪
                if jobCount <= limit - 1 then return! running (jobCount + 1)
                else return! idle () }
        and idle () =                                         ⑫
            agent.Scan(function                               ⑬
                | Completed -> Some(running (limit - 1))
                | _ -> None)
        running 0)                                            ⑭

    member this.Ask(value) =                                  ⑮
        tamingAgent.PostAndAsyncReply(fun ch -> Ask(value, ch))
    member this.Stop() = tamingAgent.Post(Quit)
    member x.Subscribe(action) =
        jobCompleted.Publish |> Observable.subscribe(action)  ⑯
```

The `JobRequest` DU represents the message type for the `tamingAgent`. This message has an `Ask` case, which carries the value to compute and a reply channel for the result. The `Completed` case is used by the agent to note when a computation has terminated and the next available job can be processed. Ultimately, the `Quit` case (message) is sent to stop the agent when needed. The `TamingAgent` constructor takes two arguments: the concurrent execution limit and the asynchronous operation for each job. The body of the `TamingAgent` type relies on two mutually recursive functions to track the number of concurrently running operations. When the agent starts with zero operations, or when the number of running jobs doesn't exceed the imposed degree of parallelism, the `running` function waits for a new incoming message to process. Conversely, when the running jobs reach the enforced limit, the execution flow of the agent switches to the `idle` function. It uses the `Scan` operator to wait for a specific type of message, leaving the others in the queue. The `Scan` operator is used in the F# `MailboxProcessor` (agent) to process only a targeted subset of messages. The `Scan` operator takes a lambda function that returns an `Option` type. The messages you want matched during the scanning process should return `Some`, and the messages you want to ignore for now should return `None`. The signature of the operation passed into the constructor is `'T -> Async<'R>`, which resembles the `Async.map` function.
This function is applied to each job that's sent to the agent through the method member `Ask`, which takes a value that is passed to the agent to initiate, or queue, a new job. When the computation completes, the subscribers of the underlying event `jobCompleted` are notified with the new result, which is also replied asynchronously to the caller that sent the message, across the `AsyncReplyChannel`. As mentioned, the purpose of the `jobCompleted` event is to notify the subscribers that have registered a callback function through the method member `Subscribe`, which uses the `Observable` module for convenience and flexibility.

Here's how the `TamingAgent` is used to transform a set of images. This example is similar to the CSP `ChannelAgent` one, allowing you to compare the code between the different approaches.

Listing 13.20 `TamingAgent` in action for image transformation

```
let loadImage = (fun (imagePath: string) -> async {
    let bitmap = new Bitmap(imagePath)
    return { Path = Environment.GetFolderPath(
                        Environment.SpecialFolder.MyPictures)
             Name = Path.GetFileName(imagePath)
             Image = bitmap } })                              ①

let apply3D = (fun (imageInfo: ImageInfo) -> async {
    let bitmap = convertImageTo3D imageInfo.Image
    return { imageInfo with Image = bitmap } })               ②

let saveImage = (fun (imageInfo: ImageInfo) -> async {
    printfn "Saving image %s" imageInfo.Name
    let destination = Path.Combine(imageInfo.Path, imageInfo.Name)
    imageInfo.Image.Save(destination)
    return imageInfo.Name })                                  ③

let loadandApply3dImage (imagePath: string) =
    Async.retn imagePath >>= loadImage >>= apply3D >>= saveImage  ④

let loadandApply3dImageAgent =
    TamingAgent<string, string>(2, loadandApply3dImage)       ⑤

loadandApply3dImageAgent.Subscribe(fun imageName ->
    printfn "Saved image %s from subscriber" imageName)       ⑥

let transformImages() =                                       ⑦
    let images = Directory.GetFiles(@".\Images")
    for image in images do
        loadandApply3dImageAgent.Ask(image)
        |> run (fun imageName ->
            printfn "Saved image %s - from reply back" imageName)
```

The three asynchronous functions, `loadImage`, `apply3D`, and `saveImage`, are composed into the function `loadandApply3dImage` using the F# async `bind` infix operator `>>=` defined in chapter 9. As a refresher, here's the implementation:

```
let bind (operation: 'a -> Async<'b>) (xAsync: Async<'a>) = async {
    let! x = xAsync
    return! operation x }

let (>>=) (item: Async<'a>) (operation: 'a -> Async<'b>) = bind operation item
```

Then, the `loadandApply3dImageAgent` instance of the `TamingAgent` is defined by passing into the constructor the limit argument, which sets the agent's degree of parallelism, and the function `loadandApply3dImage`, which represents the behavior for the job computations. The `Subscribe` function registers a callback that runs when each job completes. In this example, it displays the name of the image of the completed job. The `transformImages()` function reads the image paths from the Images directory and, in a loop, sends the values to the `loadandApply3dImageAgent` `TamingAgent`. The `run` function uses CPS to execute a callback when the result is computed and replied back.

## 13.12 Composing monadic functions

You have functions that take a simple type and return an elevated type like `Task` or `Async`, and you need to compose those functions. You might think you need to get the first result, then apply it to the second function, and then repeat for all the functions. This process can be rather cumbersome. This is a case for employing the concept of function composition.
As a reminder, you can create a new function by composing two smaller ones. This usually works, as long as the functions have matching output and input types. This rule doesn't directly apply to monadic functions, because their input and output types don't match. For example, `Async`- and `Task`-returning functions can't be composed with ordinary composition, because `Task<T>` isn't the same as `T`. Here's the signature of the monadic `Bind` operator:

```
Bind : (T -> Async<R>) -> Async<T> -> Async<R>
Bind : (T -> Task<R>)  -> Task<T>  -> Task<R>
```

The `Bind` operator can pass elevated values into functions that handle the wrapped underlying value. How can you compose monadic functions effortlessly?

### 13.12.1 Solution: combining asynchronous operations using the Kleisli composition operator

The composition of monadic functions is named *Kleisli composition*, and in FP it's usually represented with the infix operator `>=>`, which can be constructed using the monadic `Bind` operator. The `Kleisli` operator essentially provides a composition construct over monadic functions: instead of composing regular functions like `a -> b` and `b -> c`, it composes `a -> M b` and `b -> M c`, where `M` is an elevated type. The signature of the `Kleisli` composition operator for elevated types, such as the `Async` and `Task` types, is

```
Kleisli (>=>) : ('T -> Async<'TR>) -> ('TR -> Async<'R>) -> 'T -> Async<'R>
Kleisli (>=>) : ('T -> Task<'TR>)  -> ('TR -> Task<'R>)  -> 'T -> Task<'R>
```

With this operator, two monadic functions can compose directly as follows:

```
(T -> Task<TR>)  >=> (TR -> Task<R>)
(T -> Async<TR>) >=> (TR -> Async<R>)
```

The result is a new monadic function:

```
T -> Task<R>
T -> Async<R>
```

The next code snippet shows the implementation of the `Kleisli` operator in C#, which uses the monadic `Bind` operator underneath. The `Bind` (or `SelectMany`) operator for the `Task` type was introduced in chapter 7:

```
static Func<T, Task<U>> Kleisli<T, R, U>(Func<T, Task<R>> task1,
    Func<R, Task<U>> task2) => async value => await task1(value).Bind(task2);
```

The equivalent function in F# can also be defined using the conventional `kleisli` infix operator `>=>`, in this case applied to the `Async` type:

```
let kleisli (f: 'a -> Async<'b>) (g: 'b -> Async<'c>) (x: 'a) = (f x) >>= g

let (>=>) (f: 'a -> Async<'b>) (g: 'b -> Async<'c>) (x: 'a) = (f x) >>= g
```

The `Async` `bind` function and its infix operator `>>=` were introduced in chapter 9. Here's the implementation as a reminder:

```
let bind (operation: 'a -> Async<'b>) (xAsync: Async<'a>) = async {
    let! x = xAsync
    return! operation x }

let (>>=) (item: Async<'a>) (operation: 'a -> Async<'b>) = bind operation item
```

Let's see where and how the `Kleisli` operator can help. Consider the case of multiple asynchronous operations that you want to compose effortlessly. These functions have the following signatures:

```
operationOne   : ('a -> Async<'b>)
operationTwo   : ('b -> Async<'c>)
operationThree : ('c -> Async<'d>)
```

Conceptually, the composed function would look like:

```
('a -> Async<'b>) -> ('b -> Async<'c>) -> ('c -> Async<'d>)
```

At a high level, you can think of this composition over monadic functions as a pipeline, where the result of the first function is piped into the next one, and so on until the last step. In general, when you think of piping, you can think of two approaches: applicative (`<*>`) and monadic (`>>=`). Because you need the result of the previous call in your next call, the monadic style (`>>=`) is the better choice.
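To make this concrete, here's a small hypothetical example of the C# `Kleisli` operator composing two `Task`-returning functions (the functions and the link-counting logic are illustrative only, and the `Kleisli` method with its `Bind` dependency shown above is assumed to be in scope):

```
using System;
using System.Net.Http;
using System.Threading.Tasks;

static class KleisliExample
{
    static readonly HttpClient http = new HttpClient();

    // Uri -> Task<string>
    static Task<string> DownloadPage(Uri address) =>
        http.GetStringAsync(address);

    // string -> Task<int>: a pretend analysis step.
    static Task<int> CountLinks(string html) =>
        Task.FromResult(
            html.Split(new[] { "<a " }, StringSplitOptions.None).Length - 1);

    // Kleisli composition yields a single function Uri -> Task<int>.
    static readonly Func<Uri, Task<int>> CountLinksAt =
        Kleisli<Uri, string, int>(DownloadPage, CountLinks);

    static async Task Demo() =>
        Console.WriteLine(await CountLinksAt(new Uri("http://example.com")));
}
```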
For this example, you use the `TamingAgent` from the previous recipe. The `TamingAgent` has the method member `Ask`, whose signature matches the scenario: it takes a generic argument `'T` and returns an `Async<'R>` type. At this point, you use the `Kleisli` operator to compose a set of `TamingAgent` instances into a pipeline of agents, as shown in figure 13.18. The result of each agent is computed independently and passed as input, in the form of a message, to the next agent, until the last node of the chain performs the final side effect. This technique of linking and composing agents can lead to robust designs and concurrent systems. When an agent returns (replies back) a result to the caller, it can be composed into a pipeline of agents.

Figure 13.18 The pipeline processing pattern is useful when you want to process data in multiple steps. The idea behind the pattern is that inputs are sent to the first agent in the pipeline. The main benefit of the pipeline processing pattern is that it provides a simple way to balance the tradeoff between overly sequential processing (which may reduce performance) and overly parallel processing (which may have a large overhead).

This listing shows the `TamingAgent` composition in action. The example is a rework of listing 13.20, which reuses the same functions for loading, transforming, and saving an image.

Listing 13.21 `TamingAgent` with the `Kleisli` operator

```
let pipe limit operation job : Async<_> =                     ①
    let agent = TamingAgent(limit, operation)
    agent.Ask(job)

let loadImageAgent = pipe 2 loadImage                         ②
let apply3DEffectAgent = pipe 2 apply3D                       ②
let saveImageAgent = pipe 2 saveImage                         ②

let pipeline = loadImageAgent >=> apply3DEffectAgent >=> saveImageAgent  ③

let transformImages() =                                       ④
    let images = Directory.GetFiles(@".\Images")
    for image in images do
        pipeline image |> run (fun imageName ->
            printfn "Saved image %s" imageName)
```

In this example, the program uses the `TamingAgent` to transform images differently than in listing 13.20. There, the three functions that load, transform, and save an image, in that order, are composed into a single new function, which a single instance of the `TamingAgent` type applies to all incoming messages. In this application (listing 13.21), an instance of `TamingAgent` is created for each function to run, and then the agents are composed through the underlying `Ask` method to form a pipeline. The `Ask` asynchronous function ensures a reply to the caller through the `AsyncReplyChannel` when the job completes. The composition of the agents is eased by the `Kleisli` operator. The purpose of the `pipe` function is to help create an instance of the `TamingAgent` and expose the `Ask` function, whose signature `'a -> Async<'b>` resembles the monadic `Bind` operator used for the composition with the other agents. After the definition of the three agents, `loadImageAgent`, `apply3DEffectAgent`, and `saveImageAgent`, using the `pipe` helper function, it becomes simple to create a pipeline by composing these agents using the `Kleisli` operator.

## Summary

* You should use a concurrent object pool to recycle instances of the same objects without blocking, to optimize the performance of a program. The number of GC generations can be dramatically reduced by using a pool of objects, which improves the speed of a program's execution.
* You can parallelize a set of dependent tasks with a constrained order of execution.
This process is useful because it maximizes parallelism as much as possible among the execution of multiple tasks, regardless of their dependencies.
* Multiple threads can coordinate access to shared resources for reader-writer types of operations without blocking, maintaining FIFO ordering. This coordination allows the read operations to run simultaneously, while asynchronously (without blocking) waiting for eventual write operations. This pattern increases the performance of an application due to the introduced parallelism and the reduced consumption of resources.
* An event aggregator acts similarly to the mediator design pattern, where all events go through a central aggregator and can be consumed from anywhere in the application. Rx allows you to implement an event aggregator that supports multithreading to handle multiple events concurrently.
* You can implement a custom Rx scheduler using the `IScheduler` interface, to allow the taming of incoming events with fine control over the degree of parallelism. Furthermore, by explicitly setting the level of parallelism, the Rx scheduler's internal thread pool isn't penalized with the downtime of expanding its number of threads when required.
* Even without built-in support for the CSP programming model in .NET, you can use either the F# `MailboxProcessor` or TDF to coordinate and balance the payload between asynchronous operations in a non-blocking, synchronous message-passing style.

# 14 Building a scalable mobile app with concurrent functional programming

**This chapter covers**

* Designing scalable, performant applications
* Using the CQRS pattern with WebSocket notifications
* Decoupling an ASP.NET Web API controller using Rx
* Implementing a message bus

Leading up to this chapter, you learned about and mastered concurrent functional techniques and patterns for building highly performant and scalable applications. This chapter is the culmination and practical application of those techniques, where you use your knowledge of TPL tasks, asynchronous workflows, message-passing programming, and reactive programming with Reactive Extensions to develop a fully concurrent application.

The application you're building in this chapter is based on a mobile interface that communicates with a Web API endpoint for real-time monitoring of the stock market. It includes the ability to send commands to buy and sell stocks and to maintain those orders using a long-running asynchronous operation on the server side. This operation reactively applies the trade actions when the stocks reach the desired price point. Discussion points include the architecture choice and an explanation of how the functional paradigm fits well on both the server and client sides of a system when designing a scalable and responsive application. By the end of this chapter, you'll know how to design optimal concurrent functional patterns and how to choose the most effective concurrent programming model.

## 14.1 Functional programming on the server in the real world

A server-side application must be designed to handle multiple requests concurrently. In general, conventional web applications can be thought of as embarrassingly parallel, because requests are entirely isolated and easy to execute independently. The more powerful the server running the application, the higher the number of requests it can handle. The program logic of modern, large-scale web applications is inherently concurrent.
Additionally, highly interactive modern web and real-time applications, such as multiplayer browser games, collaborative platforms, and mobile services, are a huge challenge in terms of concurrent programming. These applications use instant notifications and asynchronous messaging as building blocks to coordinate the different operations and to communicate between different concurrent requests that likely run in parallel. In these cases, it's no longer possible to write a simple application with a single sequential control flow; instead, you must plan for the synchronization of independent components in a holistic manner.

You might ask, why should you use FP when building a server-side application? In September 2013, Twitter published the paper "Your Server as a Function" (Marius Eriksen, [`monkey.org/~marius/funsrv.pdf`](https://monkey.org/~marius/funsrv.pdf)). Its purpose was to validate the architecture and programming model that Twitter adopted for building server-side software on a large scale, where systems exhibit a high degree of concurrency and environmental variability. The following is a quote from the paper:

> We present three abstractions around which we structure our server software at Twitter. They adhere to the style of functional programming—emphasizing immutability, the composition of first-class functions, and the isolation of side effects—and combine to present a large gain in flexibility, simplicity, ease of reasoning, and robustness.

The support provided for concurrent FP in .NET is key to making it a great tool for server-side programming. Support exists for running operations asynchronously in a declarative and compositional semantic style; additionally, you can use agents to develop thread-safe components. You can combine these core technologies for declarative processing of events and for efficient parallelism with the TPL.

Functional programming facilitates the implementation of a stateless server (figure 14.1), which is an important asset for scalability when architecting large web applications required to handle a huge number of requests concurrently, such as social networks or e-commerce sites. A program is stateless when its operations (such as functions, methods, and procedures) aren't sensitive to the state of the computation. Consequently, all the data used in an operation is passed as *inputs* to the operation, and all the data produced by the invoked operations is passed back as *outputs*. A stateless design never stores application or user data for later computational needs.

The stateless design eases concurrency, because it's easy for each stage of the application to run on a different thread. The stateless design is also what lets the application scale out according to Amdahl's Law. In practice, a stateless program can be effortlessly parallelized and distributed among computers and processes to scale out performance. You don't need to know where the computation runs, because no part of the program will modify any data structures, which avoids data races. Also, the computation can run in different processes or on different computers without being constrained to a specific environment.

Figure 14.1 A server with state (stateful) compared to a server without state (stateless). The stateful server must keep the state between requests, which limits the scalability of the system, requiring more resources to run. The stateless server can auto-scale because there's no sharing of state.
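The stateless principle can be made concrete with a small C# sketch (the types and pricing logic are illustrative only, not from the book's application):

```
// Illustrative only: a stateless operation receives all the data it
// needs as input and returns all its results as output. No shared
// state is read or mutated, so any server (or thread) can run it.
public sealed class Order
{
    public Order(string symbol, int quantity)
    {
        Symbol = symbol;
        Quantity = quantity;
    }
    public string Symbol { get; }
    public int Quantity { get; }
}

public static class Pricing
{
    // The current price arrives as an argument instead of being read
    // from a session, a cache, or a mutable field.
    public static decimal PriceOrder(Order order, decimal currentPrice) =>
        order.Quantity * currentPrice;
}
```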
In front of stateless servers, you can place a load balancer that distributes the incoming requests; each request can be routed to any machine, without worrying about hitting a particular server. Using FP techniques, you can build sophisticated, fully asynchronous and adaptive systems that auto-scale using the same level of abstraction, with the same semantics, across all dimensions of scale, from CPU cores to data centers.

## 14.2 How to design a successful performant application

When processing hundreds of thousands of requests per second in a large-scale setting, you need a high degree of concurrency and efficiency in handling I/O and synchronization to ensure maximum throughput and CPU use in server software. Efficiency, safety, and robustness are paramount goals that have traditionally conflicted with code modularity, reusability, and flexibility. The functional paradigm emphasizes a declarative programming style, which forces asynchronous programs to be structured as a set of components whose data dependencies are witnessed by the various asynchronous combinators.

When implementing a program, you should bake performance goals into the design up front. Performance is an aspect of software design that cannot be an afterthought; it must be included as an explicit goal from the start. It's not *impossible* to redesign an existing application from the ground up, but it's far more expensive than designing it correctly in the first place.

### 14.2.1 The secret sauce: ACD

You want a system capable of flexing to an increase (or decrease) in requests with a commensurate boost in parallel speedup as resources are added. The secret ingredients for designing and implementing such a system are asynchronicity, caching, and distribution (ACD):

* *Asynchronicity* refers to an operation that completes in the future rather than in real time. You can interpret asynchronicity as an architectural design: queuing work that can be completed later to smooth out the processing load, for example. It's important to decouple operations so you do the minimal amount of work in performance-critical paths. Similarly, you can use asynchronous programming to schedule requests for nightly processes.
* *Caching* aims to avoid repeating work. For example, caching saves the results of earlier work so they can be used again later, without repeating the work performed to get those results. Usually, caching is applied in front of time-consuming operations that are frequently repeated and whose output doesn't change often.
* *Distribution* aims to partition requests across multiple systems to scale out processing. It's easier to implement distribution in a stateless system: the less state the server holds, the easier it is to distribute work.

ACD is the main ingredient for writing scalable and responsive applications that can maintain high throughput under a heavy workload. That's a task that's becoming increasingly vital.

### 14.2.2 A different asynchronous pattern: queuing work for later execution

At this point, you should have a clear idea of what asynchronous programming means. Asynchronicity, as you recall, means you dispatch a job that will complete in the *future*. This can be achieved using two patterns. The first is based on continuation-passing style (CPS), or callbacks, discussed in chapters 8 and 9. The second is based on asynchronous message passing, covered in chapters 11 and 12. As mentioned in the previous section, asynchronicity can also be the result (behavior) of a design decision in an application.
The pattern in figure 14.2 implements asynchronous systems at a design level, aiming to smooth the workload of the program by sending the operations, or requests for work, to a service that queues tasks to be completed in the future. The service can be a remote hardware device, a remote server in a cloud service, or a different process on a local machine. In the latter case, the execution thread sends the request in a fire-and-forget fashion, which frees it for further work in the meantime. An example of a task that uses this design is scheduling a message to be sent to a mailing list.

Figure 14.2 The work is passed to a queue, and the remote worker picks up the message and performs the requested action later in the future. When the operation completes, it can send a notification to the origin (sender) of the request with details of the outcome.

Figure 14.2 shows six steps:

1. The execution thread sends the job or request to the service, which queues it. The task is picked up and stored to be performed in the future.
2. At some point, the service grabs the task from the queue and dispatches the work to be processed. The processing server is responsible for scheduling a thread to run the operation.
3. The scheduled thread runs the operation, likely using a different thread per task.
4. Optionally, when the work is completed, the service notifies the origin (sender) that the work is completed.
5. While the request is processed in the background, the execution thread is free to perform other work.
6. If something goes wrong, the task is rescheduled (re-queued) for later execution.

Initially, online companies invested in more powerful hardware to accommodate the increased volume of requests. This proved to be a pricey option, considering the associated costs. In recent years, Twitter, Facebook, StackOverflow.com, and other companies have proven that it's possible to have a quick, responsive system with fewer machines through the use of good software design and patterns such as ACD.

## 14.3 Choosing the right concurrent programming model

Increasing the performance of a program using concurrency and parallelism has been at the center of discussion and research for many years. The result of this research has been the emergence of several concurrency programming models, each with its own strengths and weaknesses. The common theme is a shared ambition to perform and offer characteristics that enable faster code. In addition to these concurrency programming models, companies have developed tools to assist such programming: Microsoft created the TPL and Intel incorporated Threading Building Blocks (TBB) to produce high-quality and efficient libraries that help professional developers build parallel programs.

There are many concurrency programming models that vary in their task interaction mechanisms, task granularities, flexibility, scalability, and modularity. After years of experience in building high-scalability systems, I'm convinced that the right programming model is a combination of programming models tailored to each part of your system. You might consider using the actor model for message-passing systems, and PLINQ for data-parallel computation in each of your nodes, while downloading data for pre-computation analysis by using non-blocking asynchronous I/O processing. The key is finding the right tool or combination of tools for the job.
The following list represents my choice for concurrent technology based on common cases: * In the presence of pure functions and operations with well-defined control dependencies, where the data can be partitioned or operate in a recursive style, consider using TPL to establish a dynamic task parallel computation in the form of either the Fork/Join or Divide and Conquer pattern. * If a parallel computation requires preserving the order of the operations, or the algorithm depends on logical flow, then consider using a DAG with either the TPL task primitive or the agent model (see chapter 13). * In the case of a sequential loop, where each iteration is independent and there are no dependencies among the steps, the TPL Parallel Loop can speed up performance by computing the data in simultaneous operations running in separate tasks. * In the case of processing data in the form of a combination operator, for example by filtering and aggregating the input elements, Parallel LINQ (PLINQ) is likely a good solution to speed up computation. Consider a parallel reducer (also called a fold or aggregate), such as the parallel `Aggregator` function, for merging the results and using the `Map-Reduce` pattern. * If the application is designed to perform a sequence of operations as a workflow, and if the order of execution for a set of tasks is relevant and must be respected, then use either the Pipeline or Producer/Consumer pattern; these are great solutions for parallelizing the operations effortlessly. You can easily implement these patterns using either the TPL Dataflow or F# `MailboxProcessor`. Keep in mind when building deterministic parallel programs that you can build them from the bottom up by composing deterministic parallel patterns of computation and data access. It’s recommended that parallel patterns should provide control over the granularity of their execution, expanding and contracting the parallelism based on the resources available. In this section, you’ll build an application that simulates an online stock market service (figure 14.3). This service periodically updates stock prices and pushes the updates to all connected clients in real time. This high-performance application can handle huge numbers of simultaneous connections inside a web server.  Figure 14.3 UI of the mobile (Apple iPad) stock market example. The panel on the left side provides stock price updates in real time. The panel on the right is used to manage the portfolio and set trade orders for buying and selling stocks. The client is a mobile application, an iOS app for iPhone built using Xamarin and Xamarin.Forms. In the mobile client, the values change in real time in response to notifications from the server. Users of the application can manage their own portfolio by setting orders to buy and/or sell a specific stock when it reaches a predetermined price. In addition to the mobile application, a WPF version of the client program is provided in the downloadable source code. As you build your application, you’ll take a closer look at how to apply functional concurrency to such an application. You'll combine this knowledge with concurrent functional techniques and patterns presented in previous chapters. You’ll use the Command and Query Responsibility Segregation (CQRS) pattern, Rx, and asynchronous programming to handle parallel requests. You’ll include event sourcing based on functional persistence (that is, an event store using the agent-programming model), and more. 
I explain these patterns later, alongside the pertinent parts of the application. The web server application is an ASP.NET Web API that uses Rx to push the messages originated by the incoming requests from the controller to the other components of the application. These components are implemented using agents (F# `MailboxProcessor`) that spawn a new agent for each established and active user connection. In this way, the application maintains isolated state per user and provides an easy opportunity for scalability. The mobile application is built in C#, which is, in general, a good choice for client-side development in combination with the TAP model and Rx. Instead of C#, for the web-server code you'll use F#; but you can find the C# version of the program in the source code of this book. The primary reason for choosing F# for the server-side code is immutability as a default construct, which fits perfectly in the stateless architecture used in the stock market example. Also, the built-in support for the agent programming model with the F# `MailboxProcessor` can encapsulate and maintain state effortlessly in a thread-safe manner. Furthermore, as you'll see shortly, F# represents a less-verbose solution compared to C# for implementing the CQRS pattern, making the code explicit and capturing what happens in a function without hidden side effects. The application uses ASP.NET SignalR to provide server broadcast functionality for real-time updates. *Server broadcast* refers to a communication initiated by the server and then sent to clients.

### 14.3.1 Real-time communication with SignalR

Microsoft's SignalR library provides an abstraction over the transports that are required to push server-side content to the connected clients as it happens, in real time. This means that servers and their clients can push data back and forth in real time, establishing a bidirectional communication channel. SignalR takes advantage of several transports, automatically selecting the best available transport given the client and server. The connection starts as HTTP and is then promoted to a WebSocket connection if available. WebSocket is the ideal transport for SignalR, since it makes the most efficient use of server memory, has the lowest latency, and has the greatest number of underlying features. If these requirements aren't met, SignalR falls back, attempting to use other transports to make its connections, such as Ajax long polling. SignalR will always try to use the most efficient transport and will keep falling back until it selects the best one that's compatible with the context. This decision is made automatically during an initial stage in the communication between the client and the server, known as *negotiation*.

## 14.4 Real-time trading: stock market high-level architecture

Before diving into the code implementation of the stock market application, let's review the high-level architecture of the application so you have a good handle on what you're developing. The architecture is based on the CQRS pattern, which enforces the separation between domain layers and the use of separate models for reading and writing. The key tenet of CQRS is to separate *commands*, which are operations that cause state change (side effects in the system), from *query* requests that provide data for read-only activities without changing the state of any object, as shown in figure 14.4.
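To make the command/query split concrete before walking through the architecture, a minimal sketch of the two sides might look like the following C# interfaces (these types are illustrative and aren't taken from the book's source code):

```
using System.Threading.Tasks;

// Write side: a command changes state and returns no data.
public interface ICommandHandler<TCommand>
{
    Task Handle(TCommand command);
}

// Read side: a query returns data and causes no state change.
public interface IQueryHandler<TQuery, TResult>
{
    Task<TResult> Handle(TQuery query);
}

// A hypothetical command shape for the trading domain.
public class BuyStockCommand
{
    public string Symbol { get; set; }
    public int Quantity { get; set; }
    public decimal Price { get; set; }
}
```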
The CQRS pattern is also based on the *separation of concerns*, which is important in all aspects of software development and for solutions built on message-based architectures.

Figure 14.4 The CQRS pattern enforces the separation between domain layers and the use of separate models for reading and writing.

To maximize the performance of the read operations, the application can benefit from separate data storage optimized specifically for queries. Often, such storage might be a NoSQL database. The synchronization between the read and write storage instances is performed asynchronously in the background, and can take some time. Such data storages are considered to be eventually consistent.

The benefits of using the CQRS pattern include the ability to manage more business complexity while making your system easier to scale, the ability to write optimized queries, and a simpler introduction of a caching mechanism by wrapping the read portion of the API. Employing CQRS in systems with a massive disparity between the workload of writes and reads allows you to drastically scale the read portion.

Figure 14.5 shows the diagram of the stock market web-server application based on the CQRS pattern. You can think about this functional architecture as a dataflow architecture. Inside the application, data flows through various stages. In each step, the data is filtered, enriched, transformed, buffered, broadcast, persisted, or processed in any number of ways. The steps of the flow shown in figure 14.5 are as follows:

1. The user sends a request to the server. The request is shaped as a command to set a trading order to buy or sell a given stock. The ASP.NET Web API controller implements the `IObservable` interface to expose the `Subscribe` method, which registers observers that listen to the incoming requests. This design transforms the controller into a message publisher, which sends the command to the subscribers. In this example, there's only one subscriber, an agent (`MailboxProcessor`) that acts as a message bus. But there could be any number of subscribers, for example for logging and performance metrics.
2. The incoming requests to the Web API actions are validated and transformed into a system command, which is wrapped into an envelope that enriches it with metadata such as a timestamp and a unique ID. This unique ID, which usually is represented by the SignalR connection ID, is used later to store the events aggregated by a unique, user-specific identifier, which simplifies the targeting and execution of potential queries and the replaying of event histories.

Figure 14.5 A representative model of the stock market web-server application, which is based on the CQRS pattern. The commands (writes) are pushed across the application pipeline to perform the trading operations in a different channel than the queries (reads). In this design, the queries (reads) are performed automatically by the system in the form of server notifications, which are broadcast to the clients through SignalR connections. Imagine SignalR as the channel that allows the client to receive the notifications generated from the server. The callouts specify the technologies used to implement each component.

3. The command is passed into a command handler, which pushes the message to subscribers through a message bus.
The subscriber of the command handler is the `StockTicker`, an object implemented using an agent to maintain the state, as the name implies, of the stock market tickers.
4. The `StockTicker` and `StockMarket` types establish a bidirectional communication, which is used to notify about stock price updates. In this case, Rx is used to randomly and constantly update the stock prices, which are sent to the `StockMarket` and then flow to the `StockTicker`. The SignalR hub then broadcasts the updates to all the active client connections.
5. The `StockTicker` sends the notification to the `TradingCoordinator` object, which is an agent that maintains a list of active users. When a user registers with the application, the `TradingCoordinator` receives a notification and spawns a new agent if the user is new. The application server creates a new agent instance for each incoming request that represents a new client connection. The `TradingCoordinator` object implements the `IObservable` interface, which is used to establish a reactive publisher-subscriber channel with Rx to send the messages to the registered observers, the `TradingAgent`s.
6. The `TradingCoordinator` receives the commands for trading operations and dispatches them to the associated agent (user), verifying the unique client connection identifier. The `TradingAgent` type is an agent that implements the `IObserver` interface, which is registered to receive the notifications from the `IObservable` `TradingCoordinator`. There's a `TradingAgent` for each user, and its main purpose is to maintain the state of a portfolio and the trading orders to buy and sell stocks. This object continuously receives stock market updates to verify whether any of the orders in its state satisfy the criteria to trigger a trading operation.
7. The application implements event sourcing to store the trading events. The events are grouped by user and ordered by timestamp. Potentially, the history of each user can be replayed.
8. When a trade is triggered, the `TradingAgent` notifies the client's mobile application through SignalR.

The application's objective is to have the client send the trade orders and wait asynchronously for a notification when each operation is completed. The application diagram in figure 14.5 is based on the CQRS pattern, with a clear separation between the reads and the writes. It's interesting to note that real-time notifications are enabled for the query side (reads), so the user doesn't need to send a request to retrieve updates.

Going back to the CQRS pattern diagram from figure 14.4, which is repeated in figure 14.6, you can see that there are two separate storages: one for the reads and one for the writes. Designing storage separation in this way, using the CQRS pattern, is recommended to maximize the performance of the read operations. In the case of two detached storages, the write side must update the read side. This synchronization is performed asynchronously in the background and can take time, so the read data storage is considered to be eventually consistent.

Figure 14.6 The CQRS pattern

*Eventual consistency* is a consistency model used in distributed computing to achieve high availability, guaranteeing that eventually all accesses to an item will return the last updated value. In the stock market application, however, the eventual consistency is automatically handled by the system. The users will receive the updates and latest values through real-time notifications when the data changes.
This is possible due to the SignalR bidirectional communication between the server and the clients, which is a convenient mechanism because users don't have to ask for updates; the server provides them automatically.

## 14.5 Essential elements for the stock market application

Several essential elements for the stock market application haven't been covered yet, because it's assumed you've already encountered the topics. I'll briefly review these items and point out where you can continue your study as needed.

The first essential element is F#. If you have a shallow background in F#, see appendix B for information and summaries that you might find useful. The server-side application is based on the ASP.NET Web API, which requires knowledge of that technology. For the client side, the mobile application uses Xamarin and Xamarin.Forms with the *Model-View-ViewModel* (MVVM) pattern for data binding, but you don't need any particular knowledge of these frameworks.

Throughout the rest of this chapter, you're going to use the following:

* Reactive Extensions for .NET
* Task Parallel Library
* F# `MailboxProcessor`
* Asynchronous workflows

The same concepts applied in the following code examples are relevant for all the .NET programming languages.

## 14.6 Let's code the stock market trading application

This section covers the code examples to implement the real-time mobile stock market application with trading capabilities, as shown in figure 14.7. The parts of the program that aren't relevant or strictly important to the objective of the chapter are intentionally omitted, but you can find the fully functional implementation in the downloadable source code.

Figure 14.7 The architecture diagram of the stock market web-server application. This is a high-level diagram compared to figure 14.5, which aims to clarify the components of the application. Note that each component, other than `Validation` and `Command`, is implemented using a combination of Rx, the `IObservable` and `IObserver` interfaces, and the agent-programming model.

Let's start with the server Web API controller, where the client mobile application sends the requests to perform the trading operations. Note that the controller represents the write domain of the CQRS pattern; in fact, the actions are HTTP POST only, as shown here.
Listing 14.1 Web API trading controller

```
[<RoutePrefix("api/trading")>]
type TradingController() =
    inherit ApiController()

    let subject = new Subject<CommandWrapper>()               ①

    let publish connectionId cmd =                            ⑦
        match cmd with
        | Result.Ok(cmd) ->                                   ②
            CommandWrapper.Create connectionId cmd            ③
            |> subject.OnNext                                 ①
        | Result.Error(e) -> subject.OnError(exn (e))         ②
        cmd

    let toResponse (request : HttpRequestMessage) result =
        match result with
        | Ok(_) -> request.CreateResponse(HttpStatusCode.OK)
        | _ -> request.CreateResponse(HttpStatusCode.BadRequest)  ④

    [<Route("sell"); HttpPost>]
    member this.PostSell([<FromBody>] tr : TradingRequest) =
        async {
            let connectionId = tr.ConnectionID                ⑤
            return
                { Symbol = tr.Symbol.ToUpper()
                  Quantity = tr.Quantity
                  Price = tr.Price
                  Trading = TradingType.Sell }
                |> tradingValidation                          ⑥
                |> publish connectionId                       ⑦
                |> toResponse this.Request                    ④
        } |> Async.StartAsTask                                ⑧

    interface IObservable<CommandWrapper> with
        member this.Subscribe observer = subject.Subscribe observer  ①

    override this.Dispose disposing =
        if disposing then subject.Dispose()
        base.Dispose disposing                                ⑨
```

The Web API controller `TradingController` exposes the sell (`PostSell`) and buy (`PostBuy`) actions. These actions have identical code implementations but different purposes; only one is presented in the listing, to avoid repetition. Each action is built around two core functions, validate and publish. `tradingValidation` is responsible for validating the messages as they're received from the client. `publish` is responsible for publishing the messages to the controller's subscribers for the core processing.

The `PostSell` action validates the incoming request through the `tradingValidation` function, which returns either `Result.Ok` or `Result.Error` according to the validity of its input. The output from the validation function is then wrapped into a command object using the `CommandWrapper.Create` function and published to the subscribed observers with `subject.OnNext`. The `TradingController` uses an instance of the `Subject` type, from the Rx library, to act as an observable by implementing the `IObservable` interface. In this way, the controller is loosely coupled and behaves as a Publisher/Subscriber pattern, sending the commands to the registered observers. The registration of this controller as an `Observable` is plugged into the Web API framework using a class that implements the `IHttpControllerActivator` interface, as shown here.

Listing 14.2 Registering a Web API controller as `Observable`

```
type ControlActivatorPublisher(requestObserver : IObserver<CommandWrapper>) =
    interface IHttpControllerActivator with                   ①
        member this.Create(request, controllerDescriptor, controllerType) =
            if controllerType = typeof<TradingController> then   ②
                let obsController =
                    let tradingCtrl = new TradingController()
                    tradingCtrl
                    |> Observable.subscribeObserver requestObserver  ②
                    |> request.RegisterForDispose
                    tradingCtrl
                obsController :> IHttpController              ②
            else
                raise (ArgumentException("Unknown controller type requested"))
```

The `ControlActivatorPublisher` type implements the `IHttpControllerActivator` interface, which injects a custom controller activator into the Web API framework. In this case, when a request matches the type of the `TradingController`, the `ControlActivatorPublisher` transforms the controller into an `Observable` publisher and then registers the controller with the command dispatcher.
The `requestObserver` observer, passed into the `ControlActivatorPublisher` constructor, is used as the subscription for the `TradingController` controller, which can now dispatch messages from the actions to the subscribers in a reactive and decoupled manner. Ultimately, the subscription produced by subscribing the observer `requestObserver` must be registered for disposal together with the `TradingController` instance `tradingCtrl`, using the `request.RegisterForDispose` method.

This listing shows the next step, the subscriber of the `TradingController` observable controller.

Listing 14.3 Configuring the SignalR hub and agent message bus

```
type Startup() =
    let agent = new Agent<CommandWrapper>(fun inbox ->        ①
        let rec loop () =
            async {
                let! (cmd : CommandWrapper) = inbox.Receive()
                do! cmd |> AsyncHandle                        ②
                return! loop()
            }
        loop())
    do agent.Start()                                          ①

    member this.Configuration(builder : IAppBuilder) =
        let config =
            let config = new HttpConfiguration()
            config.MapHttpAttributeRoutes()
            config.Services.Replace(typeof<IHttpControllerActivator>,  ③
                ControlActivatorPublisher(Observer.Create(fun x -> agent.Post(x))))  ④
            config
        let configSignalR = new HubConfiguration(EnableDetailedErrors = true)  ⑤
        Owin.CorsExtensions.UseCors(builder, Cors.CorsOptions.AllowAll)
        builder.MapSignalR(configSignalR) |> ignore
        builder.UseWebApi(config) |> ignore
```

The `Startup` type is executed when the web application begins, to apply the configuration settings. This is where the `ControlActivatorPublisher` class (defined in listing 14.2) is plugged in, replacing the default `IHttpControllerActivator` with the new instance. The subscriber type passed into the `ControlActivatorPublisher` constructor is an observer, which posts the messages that arrive from the `TradingController` actions to the `MailboxProcessor` agent instance. The `TradingController` publisher sends the messages through the `OnNext` method of the observer interface to all the subscribers, in this case the agent, which depends only on the `IObserver` implementation and therefore reduces the dependencies. The `MailboxProcessor`'s `Post` method, `agent.Post`, publishes the message, wrapped into a `Command` type, using Rx. Note that the controller itself implements the `IObservable` interface, so it can be thought of as a message endpoint, command wrapper, and publisher.

The subscriber `MailboxProcessor` agent asynchronously handles the incoming messages like a message bus, but at a smaller and more focused level (figure 14.8). A message bus provides a number of advantages, ranging from scalability to a naturally decoupled system to multiplatform interoperability. Message-based architectures that use a message bus focus on common message contracts and message passing. The rest of the configuration method enables the SignalR hubs in the application through the provided `IAppBuilder`.

Figure 14.8 The command and command handler are implemented in listing 14.4.

This listing shows the implementation of the `AsyncHandle` function, which handles the agent messages in the form of CQRS commands.
Listing 14.4 Command handler with async retry logic

```
module CommandHandler =

    let retryPublish = RetryAsyncBuilder(10, 250)              ①
    let tradingCoordinator = TradingCoordinator.Instance()     ②
    let Storage = new EventStorage()                           ③

    let AsyncHandle (commandWrapper : CommandWrapper) =        ④
        let connectionId = commandWrapper.ConnectionId
        retryPublish {                                         ①
            tradingCoordinator.PublishCommand(
                PublishCommand(connectionId, commandWrapper))  ⑤
            let event =
                let cmd = commandWrapper.Command
                match cmd with                                 ⑥
                | BuyStockCommand(connId, trading) ->
                    StocksBuyedEvent(commandWrapper.Id, trading)
                | SellStockCommand(connId, trading) ->
                    StocksSoldEvent(commandWrapper.Id, trading)  ⑥
            let eventDescriptor = Event.Create(commandWrapper.Id, event)
            Storage.SaveEvent (Guid(connectionId)) eventDescriptor  ⑦
        }
```

`retryPublish` is an instance of the custom `RetryAsyncBuilder` computation expression defined in listing 9.4. This computation expression runs operations asynchronously and retries the computation, with an applied delay, in case something goes wrong. `AsyncHandle` is a command handler responsible for executing the command behaviors on the domain. The commands are represented as trading operations to either buy or sell stocks. In general, commands are directives to perform an action on the domain (behaviors). The purpose of `AsyncHandle` is to publish the commands received to the `TradingCoordinator` instance, the next step of the application pipeline, in a message-passing style. The command is the message received by the `MailboxProcessor` agent defined during the application `Startup` (listing 14.3). This message-driven programming model leads to an event-driven style of architecture, where the recipients in a message-driven system await the arrival of messages and react to them, otherwise lying dormant. In an event-driven system, notification listeners are attached to the sources of events and are invoked when an event is emitted.

The `AsyncHandle` handler is also responsible for transforming each command received into an `Event` type, which is then persisted in the event storage (figure 14.9). The event storage is part of the event-sourcing strategy that stores the current state of the application, implemented in listing 14.5.

Figure 14.9 The event storage is implemented in listing 14.5.

Listing 14.5 `EventBus` implementation using an agent

```
module EventBus =
    let public EventPublisher = new Event<Event>()                   ①

    let public Subscribe (eventHandle : Events.Event -> unit) =
        EventPublisher.Publish |> Observable.subscribe(eventHandle)  ①

    let public Notify (event : Event) = EventPublisher.Trigger event ①

module EventStorage =

    type EventStorageMessage =                                       ②
        | SaveEvent of id:Guid * event:EventDescriptor
        | GetEventsHistory of Guid * AsyncReplyChannel<Event list option>

    type EventStorage() =                                            ③
        let eventstorage = MailboxProcessor.Start(fun inbox ->
            let rec loop (history : Dictionary<Guid, EventDescriptor list>) = async {  ④
                let! msg = inbox.Receive()
                match msg with
                | SaveEvent(id, event) ->                            ⑤
                    EventBus.Notify event.EventData                  ①
                    match history.TryGetValue(id) with
                    | true, events -> history.[id] <- (event :: events)
                    | false, _ -> history.Add(id, [event])
                | GetEventsHistory(id, reply) ->                     ⑥
                    match history.TryGetValue(id) with
                    | true, events ->
                        events
                        |> List.map (fun i -> i.EventData)
                        |> Some
                        |> reply.Reply
                    | false, _ -> reply.Reply(None)
                return! loop history }
            loop (Dictionary<Guid, EventDescriptor list>()))         ④

        member this.SaveEvent (id : Guid) (event : EventDescriptor) =  ⑤
            eventstorage.Post(SaveEvent(id, event))

        member this.GetEventsHistory (id : Guid) =                   ⑥
            eventstorage.PostAndReply(fun rep -> GetEventsHistory(id, rep))
            |> Option.map(List.iter)                                 ⑦
```

The `EventBus` type is a simple implementation of a Publisher/Subscriber pattern over events. Internally, the `Subscribe` function uses Rx to register any given event handler, which is notified when the `EventPublisher` is triggered through the `Notify` function. The `EventBus` type is a convenient way to signal different parts of the application when a notification is emitted by a component upon reaching a given state. Events are the result of an action that has already happened, typically the output of executing a command.

The `EventStorage` type is an in-memory storage supporting the concept of event sourcing, which is the idea of persisting the sequence of state-changing events of the application rather than storing the current state of an entity. In this way, the application is capable of reconstructing an entity's current state at any given time by replaying the events. Because saving an event is a single operation, it's inherently atomic. The `EventStorage` implementation is based on the F# agent `MailboxProcessor`, which guarantees thread safety for accessing the underlying event data structure, the history `Dictionary<Guid, EventDescriptor list>`. The `EventStorageMessage` DU defines two operations to run against the event storage:

* `SaveEvent` adds an `EventDescriptor` to the internal state of the event-storage agent under the given unique ID. If the ID exists, the event is appended.
* `GetEventsHistory` retrieves the event history, ordered by time, for the given unique ID. In general, the event history is replayed using a given function action, as in listing 14.5.

The implementation uses an agent because it's a convenient way to abstract away the basics of an event store. With that in place, you can easily create different types of event stores by changing only the `SaveEvent` and `GetEventsHistory` functions.

Let's look at the `StockMarket` object shown in figure 14.10. Listing 14.6 shows the core implementation of the application, the `StockMarket` object.

Figure 14.10 The `StockMarket` object is implemented in listing 14.6.

Listing 14.6 `StockMarket` type to coordinate the user connections

```
type StockMarket(initStocks : Stock array) =                       ①
    let subject = new Subject<Trading>()                           ②
    static let instanceStockMarket =
        Lazy.Create(fun () -> StockMarket(Stock.InitialStocks()))

    let stockMarketAgent =
        Agent<StockTickerMessage>.Start(fun inbox ->
            let rec marketIsOpen (stocks : Stock array)            ⑥
                                 (stockTicker : IDisposable) = async {  ③
                let! msg = inbox.Receive()
                match msg with                                     ④
                | GetMarketState(c, reply) ->
                    reply.Reply(MarketState.Open)
                    return! marketIsOpen stocks stockTicker
                | GetAllStocks(c, reply) ->
                    reply.Reply(stocks |> Seq.toList)
                    return! marketIsOpen stocks stockTicker
                | UpdateStockPrices ->                             ⑤
                    stocks
                    |> PSeq.iter(fun stock ->                      ⑧
                        let isStockChanged = updateStocks stock stocks
                        isStockChanged
                        |> Option.iter(fun _ ->
                            subject.OnNext(Trading.UpdateStock(stock))))
                    return! marketIsOpen stocks stockTicker
                | CloseMarket(c) ->
                    stockTicker.Dispose()
                    return! marketIsClosed stocks
                | _ -> return! marketIsOpen stocks stockTicker }
            and marketIsClosed (stocks : Stock array) = async {    ⑥
                let! msg = inbox.Receive()
                match msg with                                     ④
                | GetMarketState(c, reply) ->
                    reply.Reply(MarketState.Closed)
                    return! marketIsClosed stocks
                | GetAllStocks(c, reply) ->
                    reply.Reply(stocks |> Seq.toList)
                    return! marketIsClosed stocks
                | OpenMarket(c) ->
                    return! marketIsOpen stocks (startStockTicker inbox)
                | _ -> return! marketIsClosed stocks }
            marketIsClosed (initStocks))

    member this.GetAllStocks(connId) =
        stockMarketAgent.PostAndReply(fun ch -> GetAllStocks(connId, ch))
    member this.GetMarketState(connId) =
        stockMarketAgent.PostAndReply(fun ch -> GetMarketState(connId, ch))
    member this.OpenMarket(connId) =
        stockMarketAgent.Post(OpenMarket(connId))
    member this.CloseMarket(connId) =
        stockMarketAgent.Post(CloseMarket(connId))
    member this.AsObservable() =
        subject.AsObservable().SubscribeOn(TaskPoolScheduler.Default)  ⑦
    static member Instance() = instanceStockMarket.Value
```

The `StockMarket` type is responsible for simulating the stock market in the application. It uses operations such as `OpenMarket` and `CloseMarket` to either start or stop broadcasting notifications of stock updates, and `GetAllStocks` retrieves the stock tickers to monitor and manage for the users. The `StockMarket` type implementation is based on the agent model, using the `MailboxProcessor` to take advantage of the intrinsic thread safety and the convenient concurrent asynchronous message-passing semantics that are at the core of building highly performant and reactive (event-driven) systems. The `StockTicker` price updates are simulated by sending high-rate random requests to the `stockMarketAgent` `MailboxProcessor` using `UpdateStockPrices`, which then notifies all the active client subscribers.

The `AsObservable` member exposes the `StockMarket` type as a stream of events through the `IObservable` interface. In this way, the `StockMarket` type can notify the `IObserver`s subscribed to the `IObservable` interface of the stock updates, which are generated when the message `UpdateStock` is received. The function that updates the stocks uses an Rx timer to push random values for each of the registered stock tickers, increasing or decreasing the prices by a small percentage, as shown here.

Listing 14.7 Function to update the stock ticker prices at a given interval

```
let startStockTicker (stockAgent : Agent<StockTickerMessage>) =
    Observable.Interval(TimeSpan.FromMilliseconds 50.0)
    |> Observable.subscribe(fun _ -> stockAgent.Post UpdateStockPrices)
```

`startStockTicker` is a fake service provider that tells the `StockTicker` every 50 ms that it's time to update the prices.

The purpose of the `TradingCoordinator` type (figure 14.11) is to manage the underlying SignalR active connections and the `TradingAgent` subscribers, which act as observers, through the `MailboxProcessor` `coordinatorAgent`. Listing 14.8 shows the implementation.

Figure 14.11 The trading coordinator is implemented in listing 14.8.
Listing 14.8 `TradingCoordinator` agent to handle active trading children agents

```
type CoordinatorMessage =                                          ⑫
    | Subscribe of id : string * initialAmount : float *
                   caller : IHubCallerConnectionContext<IStockTickerHubClient>
    | Unsubscribe of id : string
    | PublishCommand of connId : string * CommandWrapper

type TradingCoordinator() =                                        ①
    // Listing 6.6 Reactive Publisher Subscriber in C#
    let subject = new RxPubSub<Trading>()                          ②
    static let tradingCoordinator =
        Lazy.Create(fun () -> new TradingCoordinator())            ③

    let coordinatorAgent =
        Agent<CoordinatorMessage>.Start(fun inbox ->
            let rec loop (agents : Map<string,
                              (IObserver<Trading> * IDisposable)>) = async {
                let! msg = inbox.Receive()
                match msg with
                | Subscribe(id, amount, caller) ->                 ④
                    let observer = TradingAgent(id, amount, caller)  ⑤
                    let dispObserver = subject.Subscribe(observer)
                    observer.Agent |> reportErrorsTo id supervisor |> startAgent  ⑥
                    caller.Client(id).SetInitialAsset(amount)      ⑦
                    return! loop (Map.add id
                        (observer :> IObserver<Trading>, dispObserver) agents)
                | Unsubscribe(id) ->
                    match Map.tryFind id agents with
                    | Some(_, disposable) ->                       ⑧
                        disposable.Dispose()
                        return! loop (Map.remove id agents)
                    | None -> return! loop agents
                | PublishCommand(id, command) ->                   ⑨
                    match command.Command with
                    | TradingCommand.BuyStockCommand(id, trading) ->
                        match Map.tryFind id agents with
                        | Some(a, _) ->
                            let tradingInfo =
                                { Quantity = trading.Quantity
                                  Price = trading.Price
                                  TradingType = TradingType.Buy }
                            a.OnNext(Trading.Buy(trading.Symbol, tradingInfo))
                            return! loop agents
                        | None -> return! loop agents
                    | TradingCommand.SellStockCommand(id, trading) ->
                        match Map.tryFind id agents with
                        | Some(a, _) ->
                            let tradingInfo =
                                { Quantity = trading.Quantity
                                  Price = trading.Price
                                  TradingType = TradingType.Sell }
                            a.OnNext(Trading.Sell(trading.Symbol, tradingInfo))
                            return! loop agents
                        | None -> return! loop agents }
            loop (Map.empty))

    member this.Subscribe(id : string, initialAmount : float,
            caller : IHubCallerConnectionContext<IStockTickerHubClient>) =
        coordinatorAgent.Post(Subscribe(id, initialAmount, caller))  ④
    member this.Unsubscribe(id : string) =
        coordinatorAgent.Post(Unsubscribe(id))
    member this.PublishCommand(command) = coordinatorAgent.Post(command)  ⑩
    member this.AddPublisher(observable : IObservable<Trading>) =
        subject.AddPublisher(observable)                           ⑪
    static member Instance() = tradingCoordinator.Value            ③

    interface IDisposable with                                     ⑬
        member x.Dispose() = subject.Dispose()
```

The `CoordinatorMessage` discriminated union defines the messages for the `coordinatorAgent`. These message types are used to coordinate the operations of the underlying `TradingAgent`s subscribed for update notifications. You can think of the `coordinatorAgent` as the agent responsible for maintaining the active clients. It either subscribes or unsubscribes them, according to whether they're connecting to the application or disconnecting from it, and then it dispatches operational commands to the active ones. In this case, the SignalR hub notifies the `TradingCoordinator` when a new connection is established or an existing one is dropped, so it can register or unregister the client accordingly. The application uses the agent model to generate a new agent for each incoming request. To parallelize request operations, the `TradingCoordinator` agent spawns new agents and assigns work via messages. This enables parallel I/O-bound operations as well as parallel computations.
The `TradingCoordinator` exposes the `IObservable` interface through an instance of the `RxPubSub` type, which is defined in listing 6.6. `RxPubSub` is used here to implement a high-performance reactive Publisher/Subscriber, where the `TradingAgent` observers can register to receive notifications when a stock ticker price is updated. In other words, the `TradingCoordinator` is an `Observable` that the `TradingAgent` observers can subscribe to, implementing a reactive Publisher/Subscriber pattern to receive notifications. The member method `AddPublisher` registers any type that implements the `IObservable` interface, which is responsible for updating all the subscribed `TradingAgent`s. In this implementation, the `IObservable` type registered as `Publisher` in the `TradingCoordinator` is the `StockMarket` type.

The member methods `Subscribe` and `Unsubscribe` are used to register or unregister the client connections received from the `StockTicker` SignalR hub. The requests to subscribe or unsubscribe are passed directly to the underlying `coordinatorAgent` observable type. The subscription operation triggered by the `Subscribe` message checks whether a `TradingAgent` (figure 14.12) exists in the local observer state, verifying the unique connection ID. If the `TradingAgent` doesn't exist, a new instance is created, and it's subscribed to the subject instance by implementing the `IObserver` interface. Then the supervision strategy `reportErrorsTo` (to report and handle errors) is applied to the newly created `TradingAgent` observer. This supervision strategy was discussed in section 11.5.5.

Figure 14.12 The `TradingAgent` represents an agent-based portfolio for each user connected to the system. This agent keeps the user portfolio up to date and coordinates the operations of buying and selling a stock. The `TradingAgent` is implemented in listing 14.9.

Note that the `TradingAgent` constructor takes a reference to the underlying SignalR channel, which is used to enable direct communication with the client, in this case a mobile device receiving real-time notifications. The trading operations `Buy` and `Sell` are dispatched to the related `TradingAgent`, which is identified using the unique ID in the local observer's state. The dispatch operation is performed using the `OnNext` semantics of the `Observer` type. As mentioned, the `TradingCoordinator`'s responsibility is to coordinate the operations of the `TradingAgent`, whose implementation is shown in listing 14.9.

Listing 14.9 `TradingAgent` that represents an active user

```
type TradingAgent(connId : string, initialAmount : float,
        caller : IHubCallerConnectionContext<IStockTickerHubClient>) =  ①
    let agent = new Agent<Trading>(fun inbox ->
        let rec loop cash (portfolio : Portfolio)
                     (buyOrders : Treads) (sellOrders : Treads) = async {  ②
            let! msg = inbox.Receive()
            match msg with
            | Kill(reply) -> reply.Reply()                     ③
            | Error(exn) -> raise exn                          ③
            | Trading.Buy(symbol, trading) ->                  ④
                let items = setOrder buyOrders symbol trading
                let buyOrders = createOrder symbol trading TradingType.Buy
                caller.Client(connId).UpdateOrderBuy(buyOrders)
                return! loop cash portfolio items sellOrders
            | Trading.Sell(symbol, trading) ->                 ④
                let items = setOrder sellOrders symbol trading
                let sellOrder = createOrder symbol trading TradingType.Sell
                caller.Client(connId).UpdateOrderSell(sellOrder)
                return! loop cash portfolio buyOrders items
            | Trading.UpdateStock(stock) ->                    ⑤
                caller.Client(connId).UpdateStockPrice stock
                let cash, portfolio, sellOrders =
                    updatePortfolio cash stock portfolio sellOrders TradingType.Sell
                let cash, portfolio, buyOrders =
                    updatePortfolio cash stock portfolio buyOrders TradingType.Buy
                let asset = getUpdatedAsset portfolio sellOrders buyOrders cash  ⑥
                caller.Client(connId).UpdateAsset(asset)       ⑦
                return! loop cash portfolio buyOrders sellOrders }
        loop initialAmount (Portfolio(HashIdentity.Structural))
             (Treads(HashIdentity.Structural)) (Treads(HashIdentity.Structural)))

    member this.Agent = agent

    interface IObserver<Trading> with                          ⑧
        member this.OnNext(msg) = agent.Post(msg : Trading)
        member this.OnError(exn) = agent.Post(Error exn)       ③
        member this.OnCompleted() = agent.PostAndReply(Kill)   ③
```

The `TradingAgent` type is an agent-based object that implements the `IObserver` interface to allow sending messages to the underlying agent using reactive semantics. Furthermore, because the `TradingAgent` type is an `Observer`, it can be subscribed to the `TradingCoordinator`, and it consequently receives notifications automatically in the form of message passing. This is a convenient design for decoupling parts of the application, which can communicate by flowing messages in a reactive and independent manner. The `TradingAgent` represents a single active client, which means there's an instance of this agent for each connected user. As mentioned in chapter 11, having thousands of running agents (`MailboxProcessor`s) doesn't penalize the system.

The local state of the `TradingAgent` maintains and manages the current client portfolio, including the trading orders for buying and selling stocks. When either a `Trading.Buy` or `Trading.Sell` message is received, the `TradingAgent` validates the trade request, adds the operation to the local state, and then sends a notification to the client, which updates the local state of the transaction and the related UI. The `Trading.UpdateStock` message is the most critical. The `TradingAgent` could potentially receive a high volume of these messages, whose purpose is to update the `Portfolio` with a new stock price. More importantly, because the price of a stock could change with the update, the functionality triggered by the `UpdateStock` message checks whether any of the existing (in-progress) trading operations, `buyOrders` and `sellOrders`, are satisfied by the new value. If any of the trades in progress are performed, the portfolio is updated accordingly, and the client receives a notification for each update. As mentioned, the `TradingAgent` entity keeps the channel reference of the connection to the client for communicating eventual updates, which is established during the `OnConnected` event in the SignalR hub (figure 14.13 and listing 14.10).

Figure 14.13 The `StockTicker` SignalR hub is implemented in listing 14.10.
Listing 14.10 `StockTicker` SignalR hub

```
[<HubName("stockTicker")>]                                     ①
type StockTickerHub() as this =
    inherit Hub<IStockTickerHubClient>()                       ②

    let stockMarket : StockMarket = StockMarket.Instance()     ③
    let tradingCoordinator : TradingCoordinator =
        TradingCoordinator.Instance()                          ③

    override x.OnConnected() =                                 ④
        let connId = x.Context.ConnectionId
        stockMarket.Subscribe(connId, 1000., this.Clients)     ⑤
        base.OnConnected()

    override x.OnDisconnected(stopCalled) =                    ④
        let connId = x.Context.ConnectionId
        stockMarket.Unsubscribe(connId)                        ⑤
        base.OnDisconnected(stopCalled)

    member x.GetAllStocks() =                                  ⑥
        let connId = x.Context.ConnectionId
        let stocks = stockMarket.GetAllStocks(connId)
        for stock in stocks do
            this.Clients.Caller.SetStock stock

    member x.OpenMarket() =                                    ⑥
        let connId = x.Context.ConnectionId
        stockMarket.OpenMarket(connId)
        this.Clients.All.SetMarketState(MarketState.Open.ToString())

    member x.CloseMarket() =                                   ⑥
        let connId = x.Context.ConnectionId
        stockMarket.CloseMarket(connId)
        this.Clients.All.SetMarketState(MarketState.Closed.ToString())

    member x.GetMarketState() =                                ⑥
        let connId = x.Context.ConnectionId
        stockMarket.GetMarketState(connId).ToString()
```

The `StockTickerHub` class derives from the SignalR `Hub` class, which is designed to handle connections, bidirectional interaction, and calls from clients. A SignalR `Hub` class instance is created for each operation on the hub, such as connections and calls from the client to the server. If you put state in the SignalR `Hub` class, you'd lose it, because the hub instances are transient. This is the reason you use the agents to manage the mechanism that keeps the stock data, updates the prices, and broadcasts the price updates. The Singleton pattern is a common option for keeping an instance object alive inside a SignalR hub. In this case, you create a singleton instance of the `StockMarket` type; and because its implementation is agent-based, there are no thread-race issues or performance penalties, as explained in section 3.1. The SignalR base methods `OnConnected` and `OnDisconnected` are raised each time a new connection is established or dropped, and a `TradingAgent` instance is either created and registered or unregistered and destroyed accordingly. The other methods handle the stock market operations, such as opening and closing the market. For each of those operations, the underlying SignalR channel notifies the active clients immediately, as shown in the following listing.

Listing 14.11 Client `StockTicker` interface to receive notifications using SignalR

```
interface IStockTickerHub
{
    Task Init(string serverUrl, IStockTickerHubClient client);
    string ConnectionId { get; }

    Task GetAllStocks();
    Task<string> GetMarketState();
    Task OpenMarket();
    Task CloseMarket();
}
```

The `IStockTickerHub` interface is used on the client side to define the methods of the SignalR `Hub` class that clients can call. To expose a method on the hub that you want to be callable from the client, declare a public method. Note that the methods defined in the interface can be long-running, so they return a `Task` (or `Task<T>`) type designed to run asynchronously, to avoid blocking the connection when the `WebSocket` transport is used. When a method returns a `Task` object, SignalR waits for the task to complete and then sends the unwrapped result back to the client. You use a Portable Class Library (PCL) to share the same functionality among the different platforms.
The purpose of the `IStockTickerHub` interface is to establish an ad hoc platform-specific contract for the SignalR hub implementation. In this way, each platform has to satisfy a precise definition of this interface, injected at runtime using the `DependencyService` class provider ([`mng.bz/vFc3`](http://mng.bz/vFc3)):

```
IStockTickerHub stockTickerHub = DependencyService.Get<IStockTickerHub>();
```

Having defined the `IStockTickerHub` contract to establish the way the client and server communicate, listing 14.12 shows the implementation of the mobile application, in particular the `ViewModel` class, which represents the core functionality. Several of the properties have been removed from the original source code, because repetitive logic could distract from the main objective of the example.

Listing 14.12 Client-side mobile application using Xamarin.Forms

```
public class MainPageViewModel : ModelObject, IStockTickerHubClient
{
    public MainPageViewModel(Page page)
    {
        Stocks = new ObservableCollection<StockModelObject>();
        Portfolio = new ObservableCollection<Models.OrderRecord>();
        BuyOrders = new ObservableCollection<Models.OrderRecord>();
        SellOrders = new ObservableCollection<Models.OrderRecord>();   ①

        SendBuyRequestCommand = new Command(async () => await SendBuyRequest());
        SendSellRequestCommand = new Command(async () => await SendSellRequest());  ②

        stockTickerHub = DependencyService.Get<IStockTickerHub>();     ③
        hostPage = page;
        var hostBase = "http://localhost:8735/";
        stockTickerHub                                                 ③
            .Init(hostBase, this)
            .ContinueWith(async x =>
            {
                var state = await stockTickerHub.GetMarketState();
                isMarketOpen = state == "Open";
                OnPropertyChanged(nameof(IsMarketOpen));
                OnPropertyChanged(nameof(MarketStatusMessage));
                await stockTickerHub.GetAllStocks();
            }, TaskScheduler.FromCurrentSynchronizationContext());     ④

        client = new HttpClient();
        client.BaseAddress = new Uri(hostBase);
        client.DefaultRequestHeaders.Accept.Clear();
        client.DefaultRequestHeaders.Accept.Add(
            new MediaTypeWithQualityHeaderValue("application/json"));  ⑤
    }

    private IStockTickerHub stockTickerHub;
    private HttpClient client;
    private Page hostPage;

    public Command SendBuyRequestCommand { get; }
    public Command SendSellRequestCommand { get; }                     ②

    private double price;
    public double Price                                                ⑥
    {
        get => price;
        set
        {
            if (price == value) return;
            price = value;
            OnPropertyChanged();
        }
    }

    private async Task SendTradingRequest(string url)                  ⑦
    {
        if (await Validate())
        {
            var request = new TradingRequest(
                stockTickerHub.ConnectionId, Symbol, Price, Amount);
            var response = await client.PostAsJsonAsync(url, request);
            response.EnsureSuccessStatusCode();
        }
    }

    private async Task SendBuyRequest() =>
        await SendTradingRequest("/api/trading/buy");                  ⑦
    private async Task SendSellRequest() =>
        await SendTradingRequest("/api/trading/sell");                 ⑦

    public ObservableCollection<Models.OrderRecord> Portfolio { get; }
    public ObservableCollection<Models.OrderRecord> BuyOrders { get; }
    public ObservableCollection<Models.OrderRecord> SellOrders { get; }
    public ObservableCollection<StockModelObject> Stocks { get; }      ①

    public void UpdateOrderBuy(Models.OrderRecord value) => BuyOrders.Add(value);    ⑧
    public void UpdateOrderSell(Models.OrderRecord value) => SellOrders.Add(value);  ⑧
}
```

The `MainPageViewModel` class is the ViewModel component of the mobile client application, which is based on the MVVM pattern (http://mng.bz/qfbR) to enable communication and data binding between the UI (the View) and the ViewModel.
In this way, the UI and the presentation logic have separate responsibilities, providing a clear separation of concerns in the application. Note that the `MainPageViewModel` class implements the `IStockTickerHubClient` interface, which permits the notifications from the SignalR channel after the connection is established. The `IStockTickerHubClient` interface is defined in the StockTicker.Core project, and it represents the contract for the client that the server relies on. This code snippet shows the definition of this interface:

```
type IStockTickerHubClient =
    abstract SetMarketState : string -> unit
    abstract UpdateStockPrice : Stock -> unit
    abstract SetStock : Stock -> unit
    abstract UpdateOrderBuy : OrderRecord -> unit
    abstract UpdateOrderSell : OrderRecord -> unit
    abstract UpdateAsset : Asset -> unit
    abstract SetInitialAsset : float -> unit
```

These notifications flow automatically from the server side into the mobile application, updating the UI controls in real time. In listing 14.12, the observable collections defined at the top of the class are used to communicate bidirectionally with the UI. When one of these collections is updated, the changes are propagated to the bound UI controllers to reflect the state (http://mng.bz/nvma). The `Command`s of the ViewModel define user operations, data-bound to buttons, that send requests asynchronously to the web server to perform a trade for the stock defined in the UI. The request is executed by launching the `SendTradingRequest` method, which buys or sells a stock according to the targeted API endpoint.

The SignalR connection is established through the initialization of the `stockTickerHub` interface, whose instance is created by calling the `DependencyService.Get<IStockTickerHub>` method. After the creation of the `stockTickerHub` instance, the application initialization is performed by calling the `Init` method, which calls the remote server to load the stocks locally with the `stockTickerHub.GetAllStocks` method and the current state of the market with the `stockTickerHub.GetMarketState` method, to update the UI. The application initialization is performed asynchronously using the `FromCurrentSynchronizationContext` `TaskScheduler`, which provides functionality for propagating updates to the UI controllers from the main UI thread, without the need for any thread-marshaling operation. Ultimately, the application receives the notifications from the SignalR channel, which is connected to the stock market server, through the invocation of the methods defined in the `IStockTickerHubClient` interface. These methods are `UpdateOrderBuy`, `UpdatePortfolio`, and `UpdateOrderSell`, which are responsible for updating the UI controllers by changing the related observable collections.

### 14.6.1 Benchmark to measure the scalability of the stock ticker application

The stock ticker application was deployed on the Microsoft Azure cloud with a medium configuration (two cores and 3.5 GB of RAM) and stress-tested using an online tool to simulate 5,000 concurrent connections, each generating hundreds of HTTP requests. This test aimed to verify the web server's performance under excessive loads, to ensure that critical information and services are available at the speeds end users expect. The result was green, validating that the web server application can sustain many concurrent active users and cope with excessive loads of HTTP requests.
## Summary

* Conventional web applications can be thought of as embarrassingly parallel, because requests are entirely isolated and easy to execute independently. The more powerful the server running the application, the more requests it can handle.
* You can effortlessly parallelize and distribute a stateless program among computers and processes to scale out performance. There's no need to maintain any state where the computation runs, because no part of the program modifies any data structures, avoiding data races.
* Asynchronicity, caching, and distribution (ACD) are the secret ingredients for designing and implementing a system capable of flexing to an increase (or decrease) in requests, with a commensurate parallel speedup as resources are added.
* You can use Rx to decouple an ASP.NET Web API and push the messages originating from the incoming requests from the controller to other components of the application. These components can be implemented using the agent programming model, which spawns a new agent for each established and active user connection. In this way, the application can maintain isolated state per user and provide an easy opportunity for scalability.
* The support for concurrent FP in .NET is key to making it a great tool for server-side programming. Support exists for running operations asynchronously in a declarative and compositional semantic style; additionally, agents can be used to develop thread-safe components. These core technologies can be combined for declarative processing of events and for efficient parallelism with the TPL.
* Event-driven architecture (EDA) is an application design style built on the fundamental aspects of event notifications to facilitate immediate information dissemination and reactive business-process execution. In an EDA, information is propagated in real time throughout a highly distributed environment, enabling the different components of the application that receive a notification to respond proactively to business activities. EDA promotes low latency and a highly reactive system. The difference between event-driven and message-driven systems is that event-driven systems focus on addressable event sources, while message-driven systems concentrate on addressable recipients.
A
Functional programming

The claim that learning FP makes you a better programmer is anecdotal. It's true that FP offers an alternative, and often simpler, way to think about problems. In addition, many FP techniques can be applied successfully in other languages. Whatever language you use, programming in a functional style brings benefits.

FP is less a specific set of tools or languages than a mindset. Familiarity with different programming paradigms makes you a better programmer, and a multi-paradigm programmer is more powerful than a multi-language programmer. Therefore…

Because the book's chapters already cover the technical background, this appendix doesn't address the aspects of FP that apply to concurrency, such as immutability, referential transparency, side-effect-free functions, and lazy evaluation. Instead, it covers basic information about what FP is and why you should care about it.

What is functional programming?

FP means different things to different people. It's a programming paradigm that treats computation as the evaluation of expressions. In science, a paradigm describes distinct concepts or thought patterns.

FP solves domain problems while avoiding state and mutable data, and it's based on the lambda calculus. Consequently, functions are first-class values.

FP is a style of programming that reasons in terms of evaluating expressions rather than executing statements. The term expression comes from mathematics; an expression always returns a result (a value) without changing the program's state. A statement returns nothing and can change the program's state:

- Statement execution refers to expressing a program as a series of commands or statements. Commands specify how to achieve the final result by creating objects and manipulating them.

- Expression evaluation refers to a program specifying the properties of the object you want as the result. You don't specify the steps needed to build the object, and you can't accidentally use the object before it's created.
The benefits of functional programming

Here's a list of the benefits of FP:

- Composability and modularity: By introducing pure functions, you can compose functions and create higher-level abstractions from simple ones. With modules, programs can be organized in a better way. Composability is the most powerful tool for defeating complexity; it lets you define and build solutions to complex problems.

- Expressiveness: You can express complex ideas in a succinct, declarative format, improving the clarity of intent and the ability to reason about the program, and reducing code complexity.

- Reliability and testing: Functions have no side effects; a function only evaluates and returns a value that depends on its arguments. Consequently, you can examine a function by focusing only on its arguments, which allows better testing to easily verify the correctness of the code.

- Easier concurrent implementation: FP encourages referential transparency and immutability, which are the main keys to effectively running correct, lock-free concurrent applications on multiple cores.

- Lazy evaluation: You can retrieve the result of a function on demand. Suppose you have a large stream of data to analyze. Thanks to LINQ, you can process your data analysis on demand (only when needed), using deferred execution and lazy evaluation.

- Productivity: This is a huge benefit: you can implement the same functionality with fewer lines of code than in other paradigms. Productivity reduces the time required to develop a program, which can translate into a larger profit margin.

- Correctness: You write less code, which naturally reduces the number of possible bugs.

- Maintainability: This benefit follows from the others, such as code that is composable, modular, expressive, and correct.

Learning functional programming leads to code that is more modular, expression-oriented, and conceptually simple. The combination of these FP assets lets you understand what your code is doing, regardless of how many threads it's executing.
The principles of functional programming

FP has four main principles, which lead to a composable and declarative programming style:

- Higher-order functions (HOFs) and functions as first-class values

- Immutability

- Pure functions, also known as side-effect-free functions

- A declarative programming style
The clash of program paradigms: from imperative to object-oriented to functional programming

Object-oriented programming makes code understandable by encapsulating moving parts. Functional programming makes code understandable by minimizing moving parts.

(Michael Feathers, author of Working Effectively with Legacy Code, via Twitter)

This section describes three programming paradigms:

- Imperative programming describes computation by changing program state and defining the order in which commands execute. The imperative paradigm, therefore, is a style that changes state by computing a sequence of statements.

- Functional programming structures programs by treating computation as the evaluation of expressions; FP therefore promotes immutability and avoids state.

- Object-oriented programming (OOP) organizes objects rather than actions, and its data structures contain data rather than logic. The main programming paradigms can be classified as imperative or functional. OOP is orthogonal to imperative and functional programming, meaning it can be combined with either. You don't have to prefer one paradigm at the expense of the other; you can write software in an OOP style using functional or imperative concepts.

OOP has been around for almost two decades, and its design principles have been adopted by languages such as Java, C#, and VB.NET. OOP achieved enormous success because of its ability to represent and model the user domain, raising the level of abstraction. The main idea behind introducing OOP languages was code reusability, but this idea is often distorted by the modifications and customizations required for specific scenarios and ad hoc objects. An OOP program developed with low coupling and good code reusability is like a complex labyrinth, with many secret and intricate passages, that reduces code readability.

To mitigate this hard-to-achieve code reusability, developers began creating design patterns to cope with the cumbersome nature of OOP. Design patterns encourage developers to tailor software around a pattern, making the codebase more complex, harder to understand, and in some cases maintainable but still far from reusable. In OOP, design patterns are useful for defining solutions to recurring design problems, but they can be seen as deficiencies in the abstractions of the language itself.

In FP, design patterns have a different meaning; indeed, most OOP-specific design patterns are unnecessary in functional languages, because of the higher level of abstraction and the HOFs used as building blocks. The higher level of abstraction of the FP style, and the reduced effort spent on low-level details, have the advantage of producing shorter programs. When a program is small, it's easier to understand, improve, and verify. FP has excellent support for code reuse and for reducing duplicated code, which is the most effective way to write less-buggy code.
Higher-order functions for raising the level of abstraction

The HOF principle means that functions can be passed as arguments to other functions, and that functions can return other functions as their results. .NET has the concept of generic delegates, such as Action<T> and Func<T, TResult>, which can be used as HOFs to pass functions as arguments with lambda support. Here's an example of using the generic delegate Func<T, R> in C#:
Func<int, double> fCos = n => Math.Cos( (double)n );
double x = fCos(5);
IEnumerable<double> values = Enumerable.Range(1, 10).Select(fCos);
The corresponding code can be expressed with function semantics in F#, without explicitly using the Func<T, TResult> delegate:
let fCos = fun n -> Math.Cos( double n )
let x = fCos 5
let values = [1..10] |> List.map fCos
HOFs are at the core of exploiting the power of FP. HOFs provide the following advantages:

- Composition and modularity

- Code reusability

- The ability to create highly dynamic and adaptable systems

In FP, functions are treated as first-class values, which means a function can be named by a variable, assigned to variables, and appear anywhere any other language construct can. If you come from a purely OOP background, this concept lets you use functions in non-canonical ways, such as applying relatively generic operations to standard data structures. HOFs let you focus on results rather than steps, which is a fundamental and powerful shift when working with functional languages. Different functional techniques let you achieve functional composition:

- Composition

- Currying

- Partially applied functions, or partial application

The power of delegates can be used to express functionality in terms of behavior engines: methods that don't merely perform a single operation but can be enhanced, reused, and extended. This style of programming is the foundation of the functional paradigm, and it has the advantage of reducing the amount of code refactoring: instead of having several specialized and rigid methods, a program can be expressed with fewer, but more generic and reusable, methods that can be extended to handle multiple, different scenarios.
HOFs and lambda expressions for code reuse

One of the many useful reasons to use lambda expressions is to refactor code and reduce redundancy. In a memory-managed language such as C#, it's good practice to release resources deterministically whenever possible. Consider the following example:
string text;
using (var stream = new StreamReader(path))
{
    text = stream.ReadToEnd();
}
In this code, the StreamReader resource is disposed through the using keyword. This is a well-known pattern, but it has limitations. Because the disposable variable is declared within the using scope, the pattern isn't reusable: the resource can't be used again once disposed, and an exception is thrown if a disposed object is accessed. Refactoring this code in a classic OOP style isn't trivial. The Template Method pattern could be used, but that solution introduces more complexity, requiring a new base class and an implementation for each derived class. A better and more elegant solution uses a lambda expression (an anonymous delegate). Here's the code that implements a static helper method and its use:

static R Using<T, R>(this T item, Func<T, R> func) where T : IDisposable
{
    using (item)
        return func(item);
}

string text = new StreamReader(path).Using(stream => stream.ReadToEnd());

This code implements a flexible and reusable pattern for cleaning up disposable resources. The only constraint here is that the generic type T must be a type that implements IDisposable.
Lambda expressions and anonymous functions

The term lambda, or lambda expression, most often refers to an anonymous function. The purpose of a lambda expression is to express computation based on functions, using variable binding and substitution. In simpler words, a lambda expression is a nameless method that substitutes for a delegate instance, introducing the concept of anonymous functions.

Lambda expressions raise the level of abstraction, simplifying the programming experience. Functional languages based on the lambda calculus, such as F#, express computation in terms of function abstraction; lambda expressions are therefore part of FP languages. In C#, however, the main motivation for introducing lambdas was to facilitate stream-based abstractions, which enable declaration-based APIs. This abstraction provides a natural path to multicore parallel processing, making lambda expressions a valuable tool in the current computing landscape.

To create a lambda expression, you specify the input parameters (if any) on the left side of the lambda operator => (pronounced "goes to"), and you put the expression or statement block on the right side. For example, the lambda expression (x, y) => x + y specifies two parameters, x and y, and returns the sum of these values.

Every lambda expression has three parts:

- (x, y): A set of parameters.

- =>: The "goes to" operator (=>), separating the parameter list from the result expression.

- x + y: A set of statements that perform the operation or return a value. In this example, the lambda expression returns the sum of x and y.

Here's how to implement three lambda expressions with the same behavior:
Func<int, int, int> add = delegate(int x, int y) { return x + y; };
Func<int, int, int> add = (int x, int y) => { return x + y; };
Func<int, int, int> add = (x, y) => x + y;
The Func<int, int, int> part defines a function that takes two integers and returns a new integer.

In F#, the strong type system can bind a name, or label, to a function without an explicit declaration. F# functions are primitive values, much like integers and strings. The preceding functions can be translated into equivalent F# syntax as follows:
let add = (fun x y -> x + y)
let add = (+)
In F#, the *plus* (+) operator is a function with the same signature as *add*; it takes two numbers and returns their sum as the result.

Lambda expressions are a simple and effective solution for assigning and executing an inline block of code, especially when the block serves a specific purpose and doesn't need to be defined as a method. Introducing lambda expressions into your code has many advantages. Here's a short list:

- You don't need to parameterize types explicitly; the compiler can determine the parameter types.

- Concise inline coding (the function lives inline) avoids the distraction of having to look up functionality defined elsewhere.

- Captured variables limit the exposure of class-level variables.

- Lambda expressions make the flow of the code easy to read and understand.
Currying
The term currying derives from Haskell Curry, a mathematician who was an important influence on the development of FP. Currying is a technique that lets you modularize functions and reuse code. The basic idea is to transform the evaluation of a function that takes multiple arguments into the evaluation of a sequence of functions, each with a single argument. Functional languages are closely tied to the mathematical notion that a function can have only one parameter. F# follows this notion: a function with multiple parameters is declared as a series of new functions, each with only one parameter.

In practice, other .NET languages have functions with multiple parameters; from an OOP perspective, the compiler raises an error if you don't pass a function all the arguments it expects. In FP, by contrast, it's easy to write a curried function that returns a function for whatever arguments you give it. As mentioned earlier, lambda expressions provide an excellent syntax for creating anonymous delegates, which makes implementing curried functions simple. Moreover, currying can be implemented in any programming language that supports closures, an interesting point, because the technique reduces lambda expressions to functions with a single argument.

The currying technique makes it possible to treat every function with one or more arguments as if it takes only one argument, regardless of the number of arguments required for execution. This creates a chain of functions, each of which consumes a single argument.

At the end of this chain of functions, all the arguments are available at once, which lets the original function execute. Additionally, currying lets you create groups of specialized functions generated by fixing base-function arguments. For example, when you curry a function with two arguments and apply it to the first argument, the functionality is limited to one dimension. This isn't a limitation but a powerful technique, because you can then apply the new function to the second argument to calculate a specific value.

In mathematical notation, there's an important difference between these two functions:

Add(x, y, z)

Add x y z

The difference is that the first function takes a single argument of tuple type (composed of the three items x, y, and z), while the second takes the input item x and returns a function that takes the input item y, which in turn returns a function that takes the item z and then returns the result of the final computation. In simpler words, the equivalent function can be rewritten as

(((Add x) y) z)

It's important to mention that function application is left-associative and consumes one argument at a time. The previous function, Add, is applied to x, and the result is then applied to y. The result of this application, ((Add x) y), is then applied to z. Because each intermediate step produces a function, it's perfectly feasible to define a function as follows

Plus2 = Add 2

This function is equivalent to Add x. In this case, you can expect the function Plus2 to take the two remaining input arguments, always passing 2 as the fixed argument. For clarity, the preceding function can be rewritten as follows:

Plus2 x = Add 2 x

The process of producing intermediate functions, each taking a single input argument, is called currying. Let's see currying in action. Consider the following simple C# function that uses a lambda expression:
Func<int,int,int> add = (x,y) => x + y;
Func<int,Func<int,int>> curriedAdd = x => y => x + y;
This code defines a function Func<int, int, int> add, which takes two integers as arguments and returns an integer. When this function is invoked, the compiler expects both arguments x and y. But the curried version of the add function, curriedAdd, results in a delegate with the special signature Func<int, Func<int, int>>.

In general, any delegate of type Func<A, B, R> can be converted into a delegate of type Func<A, Func<B, R>>. This curried function takes only one argument and returns a function that takes the second argument of the original function and then returns a value of type R. The curried function curriedAdd can be used to create powerful specialized functions. For example, you can define an increment function by fixing the value 1:

Func<int, int> increment = curriedAdd(1);

Now you can use this function to define other functions that perform various forms of addition:

int a = curriedAdd(30)(12);
int b = increment(41);
Func<int, int> add30 = curriedAdd(30);
int c = add30(12);

One benefit of curried functions is that specialized functions are easier to reuse; but the real power is that curried functions introduce a useful concept called partially applied functions, covered in the next section. Other benefits of the currying technique include reduced function arguments and abstracted functions that are easy to reuse.
Automatic currying in C#

Using extension methods, you can automate the currying technique in C# and raise its level of abstraction. In this example, the purpose of the Curry extension method is to hide the currying implementation by introducing syntactic sugar:
static Func<A, Func<B, R>> Curry<A, B, R>(this Func<A, B, R> function)
{
    return a => b => function(a, b);
}
Here's the code refactored with the helper extension method:
Func<int,int,int> add = (x,y) => x + y;
Func<int,Func<int,int>> curriedAdd = add.Curry();
This syntax looks much cleaner. It's important to note that the compiler can infer all the types used in the functions, which helps considerably; in fact, although Curry is a generic function, there's no need to pass the generic arguments explicitly. Using this currying technique gives you a different syntax, one more amenable to building libraries of complex composite functions from simple functions. The source code you can download as part of this book's resources includes a library with the full implementation of the helper methods, including the extension methods for automatic currying.

Uncurrying

Just as you can apply the currying technique to a function, you can reverse a curried function by using a HOF to uncurry it. Obviously, uncurrying is the opposite transformation of currying. Think of uncurrying as a technique for undoing currying by applying a generic uncurry function.

In the following example, a curried function with the signature Func<A, Func<B, R>> is converted back into a multiple-argument function:
public static Func<A, B, R> Uncurry<A, B, R>(Func<A, Func<B, R>> function)
    => (x, y) => function(x)(y);
The main purpose of uncurrying a function is to convert the signature of a curried function back into a more OOP-friendly style.
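For comparison, here's what uncurrying looks like in F#, where functions are curried by default. This is a sketch, not code from the book's source:

```fsharp
// uncurry turns a curried function into one that takes a tuple
let uncurry f (x, y) = f x y

let addTuple = uncurry (+)   // (int * int) -> int
let seven = addTuple (3, 4)  // 7
```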
Currying in F#

In F#, function declarations are curried by default. Although the compiler does this for you automatically, it's helpful to understand how F# handles curried functions.

The following example shows two F# functions that multiply two values. If you're not familiar with F#, the functions may look equivalent, or at least similar, but they aren't the same:
let multiplyOne (x,y) = x * y
let multiplyTwo x y = x * y
let resultOne = multiplyOne(7, 8)
let resultTwo = multiplyTwo 7 8
let values = (7,8)
let resultThree = multiplyOne values
Apart from the syntax, there's no obvious difference between these functions, but they behave differently. The first function has only one parameter, a tuple containing the required values; the second function has two distinct parameters, x and y.

The difference becomes evident when you look at the signatures of these function declarations:
val multiplyOne : (int * int) -> int
val multiplyTwo : int -> int -> int
Now it's clear that the functions are different. The first function takes a tuple as its input argument and returns an integer. The second takes an integer as its first input and returns a function that takes an integer as input and returns an integer. This second function, with its two arguments, is automatically converted by the compiler into a series of functions, each with one input argument.

This example shows the equivalent curried functions, which is how the compiler interprets them for you:

let multiplyOne x y = x * y
let multiplyTwo = fun x -> fun y -> x * y
let resultOne = multiplyOne 7 8
let resultTwo = multiplyTwo 7 8
let resultThree =
    let tempMultiplyBy7 = multiplyOne 7
    tempMultiplyBy7 8

In F#, the implementations of these functions are equivalent because, as mentioned, functions are curried by default. The main purpose of currying is to optimize functions for easier partial application.
Partially applied functions

Partially applied functions (or partial function application) is the technique of fixing a number of arguments to a function, producing another function of smaller arity (the arity of a function is its number of arguments). In this way, a partial function provides a function with fewer arguments than expected, producing a specialized function for the given values. Besides function composition, partially applied functions make function modularization possible.

More simply, partial function application is the process of binding values to arguments, meaning a partially applied function reduces the number of a function's arguments by using fixed (default) values. If you have a function with N arguments, you can create a function with N-1 arguments that calls the original function with the fixed argument. Because partial application relies on currying, the two techniques go hand in hand. The difference between partial application and currying is that partial application binds a number of arguments to values, so to evaluate the rest of the function, you apply the remaining arguments.

In general, partial application transforms a generic function into a new, specialized function. Let's take the curried C# function as an example:
Func<int,int,int> add = (x,y) => x + y;
How do you create a new function with a single argument?

This is where partial function application becomes useful: you can partially apply a function as a HOF by applying a default value to the first argument of the original function. Here's an extension method that can be used to partially apply a function:
static Func<B, R> Partial<A, B, R>(this Func<A, B, R> function, A argument)
=> argument2 => function(argument, argument2);
Here's an example that exercises this technique:
Func<int, int, int> max = Math.Max;
Func<int, int> max5 = max.Partial(5);
int a = max5(8);
int b = max5(2);
int c = max5(12);
Math.Max(int, int) is an example of a function that can be extended through partial function application. Introducing partial application here, with the default argument 5 fixed, creates a new specialized function max5 that evaluates the maximum of two numbers, one of which defaults to 5. Thanks to partial application, you've created a new, more specific function from an existing one.

From an OOP perspective, you can think of partial function application as a way of overriding a function. You can also use this technique to extend, on the fly, the functionality of a third-party library that isn't itself extensible.

As mentioned previously, functions in F# are curried by default, which makes partial application easier than in C#, as the short sketch after the following list shows. Partial function application has many benefits, including the following:

- It allows functions to be composed without friction.

- It alleviates the need to pass a set of separate parameters, avoiding the construction of unnecessary classes containing overloaded versions of the same method with different numbers of inputs.

- It enables developers to write highly generic functions by parameterizing their behavior.
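Because F# functions are curried by default, partial application requires no helper methods at all. Here's a minimal sketch (not code from the book's source):

```fsharp
// Supplying fewer arguments than a function expects yields
// a new, specialized function
let add x y = x + y    // add : int -> int -> int
let add5 = add 5       // add5 : int -> int
let result = add5 37   // 42

// The same idea applied to an existing function
let max5 = max 5       // max5 : int -> int
let a = max5 8         // 8
```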
The practical benefit of using partial function application is that functions built by supplying only part of the arguments favor code reuse, functional extensibility, and composition. Moreover, partially applied functions simplify a programming style that uses HOFs. Partial function application can also be deferred to improve performance, as covered in section 2.6.
The power of partial function application and currying in C#

Let's consider a more complete example of partial function application and currying that covers a real-world use case. Retry, in listing A.1, is an extension method for the Func<T> delegate of any function that takes no arguments and has return type T. The purpose of this method is to execute the given function in a try-catch block; if an exception is thrown during execution, the function retries the operation, up to three times.

Listing A.1 The Retry extension method
public static T Retry<T>(this Func<T> function) ①
{
    int retry = 0; ②
    T result = default(T); ③
    bool success = false;
    do {
        try {
            result = function(); ④
            success = true; ⑤
        }
        catch {
            retry++; ⑥
        }
    } while (!success && retry < 3); ⑥
    return result;
}
Suppose this method tries to read text from a file. In the following code, the method ReadText takes a file path as input and returns the text in the file. To execute the functionality with the Retry behavior attached, so it can fall back and recover if something goes wrong, you can use a closure as follows:
static string ReadText(string filePath) => File.ReadAllText(filePath);
string filePath = "TextFile.txt";
Func<string> readText = () => ReadText(filePath);
string text = readText.Retry();
You use a lambda expression to capture the local variable filePath and pass it to the ReadText method. This process lets you create a Func<string> that matches the signature of the Retry extension method and attach the behavior. If the file is locked or owned by another process, an error is thrown, and the Retry functionality kicks in as expected. If the first call fails, the method retries a second and a third time; if all attempts fail, it ultimately returns the default value of type T.

This works, but you may wonder what happens if you want to retry a function that requires a string argument. The solution is to partially apply the function. The following code implements a function that takes a string argument, the path of the file to read the text from, which is then passed to the ReadText method. Because the Retry behavior applies only to functions that take no arguments, this code doesn't compile:
Func<string, string> readText = (path) => ReadText(path);
string text = readText.Retry();
string text = readText(filePath).Retry();
The Retry behavior doesn't work with this version of readText. One possible solution is to write another version of the Retry method that takes an additional generic type parameter specifying the type of the argument to pass at invocation time. This isn't the best option, because you'd have to figure out how to share this new Retry logic with all the methods that use it, each with different arguments or implementations.

A better option is to use and combine currying and partial function application. In the following listing, the helper methods Curry and Partial are defined as extension methods.

Listing A.2 Retry helper extensions in C#
static class RetryExtensions
{
    public static Func<R> Partial<T, R>(this Func<T, R> function, T arg) {
        return () => function(arg);
    }

    public static Func<T, Func<R>> Curry<T, R>(this Func<T, R> function) {
        return arg => () => function(arg);
    }
}

Func<string, string> readText = (path) => ReadText(path);
string text = readText.Partial("TextFile.txt").Retry();

Func<string, Func<string>> curriedReadText = readText.Curry();
string text = curriedReadText("TextFile.txt").Retry();
This approach lets you inject the file path and smoothly use the Retry function. It works because the helper functions Partial and Curry both transform the readText function into a function that takes no arguments, ultimately matching the signature that Retry expects.
B
F# overview

This appendix explores the basic syntax of F#, an established general-purpose, functional-first language that also supports object-oriented programming (OOP). In fact, F# embraces the .NET Common Language Infrastructure (CLI) object model, which allows the declaration of interfaces, classes, and abstract classes. F# is also a statically and strongly typed language, which means the compiler can detect the data types of variables and functions at compile time. The F# syntax differs from C-style languages such as C# in that code blocks aren't delimited with curly braces; instead, whitespace rather than commas, together with indentation, separates arguments and delimits the scope of function bodies. In addition, F# is a cross-platform programming language that runs inside and outside the .NET ecosystem.

The let binding

In F#, let is one of the most important keywords: it binds an identifier to a value, which means giving a name to a value (or binding a value to a name). It's defined as let <identifier> = <value>.

let bindings are immutable by default. Here are some code examples:
let myInt = 42
let myFloat = 3.14
let myString = "hello functional programming"
let myFunction = fun number -> number * number
As you can see from the last line, you can give a function a name by binding the identifier myFunction to the lambda expression fun number -> number * number.

The fun keyword defines a lambda expression (anonymous function) with the syntax fun args -> body. Interestingly, you don't need to define the types in the code, because the F# compiler understands them natively thanks to its powerful built-in type-inference system. For example, in the preceding code, the compiler infers that the argument of the myFunction function is a number because of the presence of the multiplication (*) operator.

Understanding function signatures in F#

In F#, as in most functional languages, function signatures are defined with arrow notation, read from left to right. Functions are expressions that always have an output, so the rightmost arrow always points to the return type. For example, when you see typeA -> typeB, you can interpret it as a function that takes an input value of type typeA and produces a value of type typeB. The same principle applies to functions that take more than two arguments. When a function's signature is typeA -> typeB -> typeC, you read the arrows from left to right, which creates two functions. The first function is typeA -> (typeB -> typeC), which takes an input of type typeA and produces a function of type typeB -> typeC.

Here's the signature of the add function:
val add : x:int -> y:int -> int
It takes an argument x:int and returns a function that takes y:int as input and returns a result of type int. The arrow notation is intrinsically related to currying and anonymous functions.

Creating mutable types: mutable and ref

One of the main concepts in FP is immutability. F# is a functional-first programming language, but the explicit use of the mutable keyword lets you create variable-like mutable types, as in this example:
let mutable myNumber = 42
The value of myNumber can now be changed using the assignment (<-) operator:
myNumber <- 51
Another option for defining a mutable type is a reference cell, which defines a storage location that lets you create a mutable value with reference semantics. The ref operator declares a new reference cell that encapsulates a value; the value can then be changed with the := operator and accessed with the ! (bang) operator:
let myRefVar = ref 42
myRefVar := 53
printfn "%d" !myRefVar
The first line declares the reference cell myRefVar with the value 42, and the second line changes its value to 53. The last line of code accesses and prints the underlying value.

Mutable variables and reference cells can be used in nearly the same situations, but mutable types are preferred, except where the compiler doesn't allow them and a reference cell can be used instead. For example, in an expression that generates a closure requiring mutable state, the compiler reports that a mutable variable can't be used; in that case, a reference cell solves the problem, as the following sketch shows.
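Here's a minimal sketch of that situation (an illustration, not code from the book's source):

```fsharp
// let makeCounter () =
//     let mutable count = 0
//     fun () -> count <- count + 1; count   // compiler error: mutable
//                                           // variables can't be captured

// A ref cell works, because the closure captures the cell, not the variable
let makeCounter () =
    let count = ref 0
    fun () ->
        count := !count + 1
        !count

let next = makeCounter ()
let one = next ()   // 1
let two = next ()   // 2
```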
Functions as first-class types

In F#, functions are first-class data types; they can be declared using the let keyword and used like any other value:
let square x = x * x
let plusOne x = x + 1
let isEven x = x % 2 = 0
Functions always return a value, even without an explicit return keyword. The value of the last statement executed in the function is its return value.

Composition: pipe and composition operators

The pipe (|>) and composition (>>) operators are used to chain functions and arguments, improving code readability. These operators let you establish pipelines of functions in a flexible manner. Their definitions are simple:
let inline (|>) x f = f x
let inline (>>) f g x = g(f x)
The following example shows how to exploit these operators to build a functional pipeline:
let squarePlusOne x = x |> square |> plusOne
let plusOneIsEven = plusOne >> isEven
In the last line of code, the composition (>>) operator lets you avoid defining the input argument explicitly. The F# compiler understands that the function plusOneIsEven expects an integer as input. Functions that don't need their arguments defined are called point-free functions.

The main difference between the pipe (|>) and composition (>>) operators lies in their signatures and use. The pipe operator takes a function and an argument, while composition combines functions.

Delegates

In .NET, a delegate is a pointer to a function: a variable that holds a reference to a function with a matching signature. In F#, function values are used in place of delegates, but F# provides delegate support for interoperating with .NET APIs. This is the syntax for defining a delegate in F#:

type delegate-typename = delegate of typeA -> typeB

The following code shows the syntax for creating a delegate whose signature represents an add operation:
type MyDelegate = delegate of (int * int) -> int
let add (a, b) = a + b
let addDelegate = MyDelegate(add)
let result = addDelegate.Invoke(33, 9)
In the example, the F# function add is passed directly as an argument to the delegate constructor MyDelegate. Delegates can be attached to F# function values, static methods, or instance methods. The Invoke method on the delegate instance addDelegate invokes the underlying function add.

Comments

F# uses three types of comments: block comments placed between the (* and *) symbols, line comments that start with // and continue to the end of the line, and XML doc comments that follow the /// symbol and let you use XML tags to produce code documentation from files generated by the compiler. Here are examples of these comments:
(* This is block comment *)
// Single line comments use a double forward slash
/// This comment can be used to generate documentation.
The open statement

You can open a namespace or module using the open keyword, similar to the using statement in C#. This code opens the System namespace: open System.

Basic data types

Table B.1 shows the list of F# primitive types.

Table B.1 Basic data types
| F# type | .NET type | Size in bytes | Range | Example |
|---|---|---|---|---|
| sbyte | System.SByte | 1 | -128 to 127 | 42y |
| byte | System.Byte | 1 | 0 to 255 | 42uy |
| int16 | System.Int16 | 2 | -32,768 to 32,767 | 42s |
| uint16 | System.UInt16 | 2 | 0 to 65,535 | 42us |
| int / int32 | System.Int32 | 4 | -2,147,483,648 to 2,147,483,647 | 42 |
| uint32 | System.UInt32 | 4 | 0 to 4,294,967,295 | 42u |
| int64 | System.Int64 | 8 | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 | 42L |
| uint64 | System.UInt64 | 8 | 0 to 18,446,744,073,709,551,615 | 42UL |
| float32 | System.Single | 4 | ±1.5e-45 to ±3.4e38 | 42.0F |
| float | System.Double | 8 | ±5.0e-324 to ±1.7e308 | 42.0 |
| decimal | System.Decimal | 16 | ±1.0e-28 to ±7.9e28 | 42.0M |
| char | System.Char | 2 | U+0000 to U+ffff | 'x' |
| string | System.String | 20 + (2 * size of string) | 0 to about 2 billion characters | "Hello World" |
| bool | System.Boolean | 1 | Only two possible values: true or false | true |
特殊字符串定义
在 F# 中,字符串类型是 System.String 的别名;除了传统的 .NET 语义外,还可以使用三重引号(triple-quoted)的方式声明字符串。这种写法允许你在字符串中直接包含特殊字符(如双引号)而无需转义。以下示例分别用逐字(verbatim)字符串(需将内部双引号写两次)和三重引号字符串定义同一段 HTML:
let verbatimHtml = @"<input type=""submit"" value=""Submit"">"
let tripleHTML = """<input type="submit" value="Submit">"""
元组
元组 是一组无名称且有序的值,可以是不同类型。元组对于创建临时数据结构很有用,并且是函数返回多个值的便捷方式。元组定义为值的逗号分隔集合。以下是构造元组的方法:
let tuple = (1, "Hello")
let tripleTuple = ("one", "two", "three")
元组也可以被解构。在这里,元组值 1 和 "Hello" 分别绑定到标识符 a 和 b,函数 swap 交换给定元组 (a, b) 中两个值的顺序:
let (a, b) = tuple
let swap (a, b) = (b, a)
元组默认编译为引用类型(对象),但也可以定义为值类型的结构体元组,如下所示:
let tupleStruct = struct (1, "Hello")
注意,F# 类型推断会自动将 swap 这样的函数泛化为泛型函数,这意味着它可以作用于任意类型的元组。可以使用 fst 和 snd 函数访问元组的第一个和第二个元素:
let one = fst tuple
let hello = snd tuple
记录类型
记录类型类似于元组,除了字段有名称,并定义为分号分隔的列表。虽然元组提供了一种在单个容器中存储可能异构数据的方法,但当存在多个元素时,解释元素的目的可能会变得困难。在这种情况下,记录类型通过使用名称标记其定义来帮助解释数据的目的。记录类型通过使用type关键字显式定义,并编译为不可变、公共和密封的 .NET 类。此外,编译器自动生成结构相等性和比较功能,并提供一个默认构造函数,该构造函数填充记录中包含的所有字段。
此示例展示了如何定义和实例化一个新的记录类型:
type Person = { FirstName : string; LastName : string; Age : int }
let fred = { FirstName = "Fred"; LastName = "Flintstone"; Age = 42 }
记录可以通过属性和方法进行扩展:
type Person with
member this.FullName = sprintf "%s %s" this.FirstName this.LastName
记录是不可变类型,这意味着记录的实例不能被修改。但你可以通过使用with克隆语义方便地克隆记录:
let olderFred = { fred with Age = fred.Age + 1 }
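此外,编译器自动生成的结构相等性意味着两个记录按字段值比较,而非按引用比较;一个简短的演示(fredTwin 为演示用名称):

```fsharp
let fredTwin = { FirstName = "Fred"; LastName = "Flintstone"; Age = 42 }
printfn "%b" (fred = fredTwin)    // true:逐字段比较
printfn "%b" (fred = olderFred)   // false:Age 不同
```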
记录类型也可以用 [<Struct>] 属性表示为值类型。这在性能至关重要的场景中很有帮助,代价是放弃引用类型的部分灵活性:
[<Struct>]
type Person = { FirstName : string; LastName : string; Age : int }
区分联合
区分联合(DUs)是一种类型,它表示一组值,这些值可以是几个定义良好的情况之一,每个情况可能具有不同的值和类型。在面向对象范式中,DUs 可以被视为从同一基类继承的一组类。通常,DUs 是构建复杂数据结构、建模领域和表示递归结构(如Tree数据类型)的工具。
以下代码显示了扑克牌的花色和点数:
type Suit = Hearts | Clubs | Diamonds | Spades
type Rank =
| Value of int
| Ace
| King
| Queen
| Jack
static member GetAllRanks() =
[ yield Ace
for i in 2 .. 10 do yield Value i
yield Jack
yield Queen
yield King ]
如您所见,DUs 可以通过属性和方法进行扩展。结合一个将花色与点数组合在一起的记录类型 Card,就可以计算出表示牌组中所有牌的列表:
type Card = { Suit : Suit; Rank : Rank }
let fullDeck =
    [ for suit in [ Hearts; Diamonds; Clubs; Spades ] do
        for rank in Rank.GetAllRanks() do
            yield { Suit = suit; Rank = rank } ]
此外,DUs 也可以通过具有[<Struct>]属性的结构体来表示。
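下面是一个最小的递归示例(Tree 与 sumTree 为演示用名称),对应正文提到的用 DU 建模递归结构;其中用到的 match 表达式将在下一节详细介绍:

```fsharp
type Tree =
    | Leaf of int
    | Node of Tree * Tree

// 递归遍历树并对所有叶子求和
let rec sumTree tree =
    match tree with
    | Leaf v -> v
    | Node (left, right) -> sumTree left + sumTree right

let total = sumTree (Node (Leaf 1, Node (Leaf 2, Leaf 3)))   // 6
```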
模式匹配
模式 匹配 是一种语言结构,它使编译器能够解释数据类型的定义并对其应用一系列条件。通过这种方式,编译器强制你编写模式匹配结构,通过覆盖所有可能的案例来匹配给定的值。这被称为 穷举 模式匹配。模式匹配结构用于控制流。它们在概念上类似于一系列 if/then 或 case/switch 语句,但功能更强大。它们允许你在每次匹配期间将数据结构分解为其底层组件,然后对这些值执行某些计算。在所有编程语言中,控制流 指的是代码中做出的决策,这些决策影响应用程序中语句执行的顺序。
通常,最常见的模式涉及代数数据类型,例如区分联合、记录类型和集合。以下代码示例展示了 Fizz-Buzz (en.wikipedia.org/wiki/Fizz_buzz) 游戏的两个实现。第一个实现对辅助函数 divisibleBy 求值结果(真/假)的各种组合逐一进行匹配;第二个实现使用称为守卫(guard)的 when 子句,为匹配某个模式指定必须额外满足的测试:
let fizzBuzz n =
let divisibleBy m = n % m = 0
match divisibleBy 3, divisibleBy 5 with
| true, false -> "Fizz"
| false, true -> "Buzz"
| true, true -> "FizzBuzz"
| false, false -> sprintf "%d" n
let fizzBuzz n =
match n with
| _ when (n % 15) = 0 -> "FizzBuzz"
| _ when (n % 3) = 0 -> "Fizz"
| _ when (n % 5) = 0 -> "Buzz"
| _ -> sprintf "%d" n
[1..20] |> List.iter (fizzBuzz >> printfn "%s")
当模式匹配结构被求值时,表达式被传入 match <expression>,然后依次与每个模式进行匹配,直到找到第一个匹配项,并求值对应的主体。下划线字符(_)称为通配符(wildcard),它总能匹配成功,通常用作最后的兜底子句,为其余所有情况应用共同的行为。
活跃模式
活跃模式是扩展模式匹配功能的结构,允许对给定的数据结构进行分区和分解,提取其底层值,并使分解结果可用于进一步的模式匹配,从而让代码更易于阅读。
此外,活跃模式允许你将任意值包裹在 DU 数据结构中,以便于模式匹配。可以使用活跃模式包裹对象,这样你就可以像使用任何其他联合类型一样轻松地在模式匹配中使用这些对象。
有时活跃模式不会为所有输入生成值;这种情况下它们被称为部分活跃模式,其结果是一个选项(option)类型。要定义部分活跃模式,需在由括号和管道字符组成的“香蕉夹”(| |)内部,在模式名列表末尾加上下划线通配符(_)。以下是一个典型的部分活跃模式:
let (|DivisibleBy|_|) divideBy n =
if n % divideBy = 0 then Some DivisibleBy else None
在此部分活跃模式中,如果值 n 能被 divideBy 整除,则返回 Some,表示活跃模式匹配成功;否则返回 None,表示模式失败,继续尝试下一个匹配表达式。部分活跃模式用于划分并匹配输入空间的一部分。以下代码演示了如何对部分活跃模式进行模式匹配:
let fizzBuzz n =
match n with
| DivisibleBy 3 & DivisibleBy 5 -> "FizzBuzz"
| DivisibleBy 3 -> "Fizz"
| DivisibleBy 5 -> "Buzz"
| _ -> sprintf "%d" n
[1..20] |> List.iter (fizzBuzz >> printfn "%s")
此函数使用部分活跃模式 (|DivisibleBy|_|) 来测试输入值 n。如果它能同时被 3 和 5 整除,则第一个情况成功;如果只能被 3 整除,则第二个情况成功,依此类推。请注意,& 运算符允许你对同一个输入同时应用多个模式。
另一种活跃模式是参数化活跃模式,它与部分活跃模式类似,但接受一个或多个额外的输入参数。
更有趣的是多情况活跃模式,它把整个输入空间划分为形如 DU 的不同情况。以下是使用多情况活跃模式实现的 FizzBuzz 示例:
let (|Fizz|Buzz|FizzBuzz|Val|) n =
match n % 3, n % 5 with
| 0, 0 -> FizzBuzz
| 0, _ -> Fizz
| _, 0 -> Buzz
| _ -> Val n
由于活跃模式可以将数据从一种类型转换为另一种类型,它们非常适合数据转换和验证。活跃模式共有四种形式:单情况、部分、多情况和参数化(部分)。有关活跃模式的更多细节,请参阅 MSDN 文档 (mng.bz/Itmw) 和 Isaac Abraham 的 Get Programming with F# (Manning, 2018)。
集合
F# 支持标准 .NET 集合,如数组和序列(IEnumerable)。此外,它还提供了一套不可变的函数式集合:列表(List)、集合(Set)和映射(Map)。
数组
数组是零基、可变集合,具有固定大小的相同类型元素。由于它们被编译为连续的内存块,因此它们支持快速、随机的元素访问。以下是创建、过滤和投影数组的不同方法:
let emptyArray = Array.empty
let emptyArray = [| |]
let arrayOfFiveElements = [| 1; 2; 3; 4; 5 |]
let arrayFromTwoToTen = [| 2..10 |]
let appendTwoArrays = emptyArray |> Array.append arrayFromTwoToTen
let evenNumbers = arrayFromTwoToTen |> Array.filter(fun n -> n % 2 = 0)
let squareNumbers = evenNumbers |> Array.map(fun n -> n * n)
可以通过使用点操作符 (.) 和方括号 [ ] 来访问和更新数组的元素:
let arr = Array.init 10 (fun i -> i * i)
arr.[1] <- 42
arr.[7] <- 91
数组也可以使用 Array 模块中的函数以各种其他语法创建:
let arrOfBytes = Array.create 42 0uy
let arrOfSquare = Array.init 42 (fun i -> i * i)
let arrOfIntegers = Array.zeroCreate<int> 42
序列(seq)
序列(seq)是同一类型元素构成的逻辑序列。与 List 类型不同,序列是惰性求值的,这意味着元素只在需要时才被计算。在不需要全部元素的场景下,序列往往比列表性能更好。以下是创建、过滤和投影序列的一些方法:
let emptySeq = Seq.empty
let seqFromTwoToFive = seq { yield 2; yield 3; yield 4; yield 5 }
let seqOfFiveElements = seq { 1 .. 5 }
let concatenateTwoSeqs = emptySeq |> Seq.append seqOfFiveElements
let oddNumbers = seqFromTwoToFive |> Seq.filter(fun n -> n % 2 <> 0)
let doubleNumbers = oddNumbers |> Seq.map(fun n -> n + n)
序列可以使用 yield 关键字惰性地产生属于序列的值。
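下面的最小示例直观地展示了惰性求值:无限序列只有在被枚举时才逐个计算元素(naturals 为演示用名称):

```fsharp
let naturals = Seq.initInfinite id    // 概念上无限的序列
let firstEvens =
    naturals
    |> Seq.filter (fun n -> n % 2 = 0)
    |> Seq.take 5
    |> Seq.toList                     // [0; 2; 4; 6; 8]
```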
列表
在 F#中,List集合是一个不可变的、元素类型相同的单链表。一般来说,列表是枚举的好选择,但在性能关键时,不建议用于随机访问和连接。列表使用[ ... ]语法定义。以下是一些创建、过滤和映射列表的示例:
let emptyList = List.empty
let emptyList = [ ]
let listOfFiveElements = [ 1; 2; 3; 4; 5 ]
let listFromTwoToTen = [ 2..10 ]
let appendOneToEmptyList = 1::emptyList
let concatenateTwoLists = listOfFiveElements @ listFromTwoToTen
let evenNumbers = listOfFiveElements |> List.filter(fun n -> n % 2 = 0)
let squareNumbers = evenNumbers |> List.map(fun n -> n * n)
列表使用方括号([ ])和分号(;)分隔多个元素,使用 cons 运算符(::)将单个元素添加到列表头部,使用 at 运算符(@)连接两个给定的列表。
集合
集合(Set)是基于二叉树实现的不可变集合,元素类型相同;它不保留插入顺序,也不允许重复元素,对其元素的任何更新操作都会创建一个新的集合。以下是创建集合的几种方法:
let emptySet = Set.empty<int>
let setWithOneItem = emptySet.Add 8
let setFromList = [ 1..10 ] |> Set.ofList
映射
映射(Map)是一个不可变的键值对集合,它将值与键关联起来;与 Set 类型类似,它不允许重复的键,也不保留插入顺序。以下示例展示了实例化映射的不同方式:
let emptyMap = Map.empty<int, string>
let mapWithOneItem = emptyMap.Add(42, "the answer to the meaning of life")
let mapFromList = [ (1, "Hello"); (2, "World") ] |> Map.ofSeq
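映射支持按键访问与安全查找;沿用上面的 mapFromList,一个简短的示例:

```fsharp
let greeting = mapFromList.[1]           // "Hello"
let maybeWorld = mapFromList.TryFind 2   // Some "World"
let missing = mapFromList.TryFind 9      // None:键不存在时安全返回
```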
循环
F#支持循环结构来遍历如列表、数组、序列、映射等可枚举集合。while...do表达式在指定的条件为真时执行迭代:
let mutable a = 10
while (a < 20) do
printfn "value of a: %d" a
a <- a + 1
for...to表达式在循环中遍历一个循环变量的值集合:
for i = 1 to 10 do
printf "%d " i
for...in表达式在循环中遍历值集合中的每个元素:
for i in [1..10] do
printfn "%d" i
类和继承
如前所述,F#支持像其他.NET 编程语言一样的 OOP 结构。实际上,可以定义类对象来模拟现实世界的领域。在 F#中用于声明类的type关键字可以公开属性、方法和字段。以下代码展示了从Person类继承的子类Student的定义:
type Person(firstName, lastName, age) =
member this.FirstName = firstName
member this.LastName = lastName
member this.Age = age
member this.UpdateAge(n:int) =
Person(firstName, lastName, age + n)
override this.ToString() =
sprintf "%s %s" firstName lastName
type Student(firstName, lastName, age, grade) =
inherit Person(firstName, lastName, age)
member this.Grade = grade
属性FirstName、LastName和Age作为字段公开;方法UpdateAge返回一个新的Person对象,其Age已修改。可以使用override关键字更改从基类继承的方法的默认行为。在示例中,ToString基方法被重写以返回全名。
对象Student是通过inherit关键字定义的子类,它从基类Person继承其成员,并添加了自己的成员Grade。
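一个简短的使用示例(parent、student 为演示用名称),可以看到 UpdateAge 返回新实例而非修改原对象,且重写的 ToString 输出全名:

```fsharp
let parent = Person("Fred", "Flintstone", 42)
let olderParent = parent.UpdateAge(1)
printfn "%O is %d" olderParent olderParent.Age   // "Fred Flintstone is 43"

let student = Student("Pebbles", "Flintstone", 1, "A")
printfn "%O grade: %s" student student.Grade     // 继承并重写的 ToString 输出全名
```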
抽象类和继承
抽象类是为类定义提供模板的对象类型。通常它公开一个或多个未给出实现的方法或属性,并要求子类填充这些实现;也可以定义能被覆盖的默认行为。在下面的例子中,抽象类 Shape 是 Rectangle 和 Circle 类的基类:
[<AbstractClass>]
type Shape(weight :float, height :float) =
member this.Weight = weight
member this.Height = height
abstract member Area : unit -> float
default this.Area() = weight * height
type Rectangle(weight :float, height :float) =
inherit Shape(weight, height)
type Circle(radius :float) =
inherit Shape(radius, radius)
override this.Area() = radius * radius * Math.PI
AbstractClass属性通知编译器该类有抽象成员。Rectangle类使用Area方法的默认实现,而Circle类则使用自定义行为覆盖它。
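下面的小例子(shapes 为演示用名称)展示了多态调用:Rectangle 使用 Area 的默认实现,而 Circle 使用重写后的版本:

```fsharp
let shapes = [ Rectangle(2.0, 3.0) :> Shape; Circle(2.0) :> Shape ]
for s in shapes do
    printfn "Area: %.2f" (s.Area())   // 6.00 与 12.57
```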
接口
接口代表一个契约,规定了实现类必须提供的成员;接口声明中的成员本身没有实现。接口提供了一种抽象方式来引用其公开的成员和函数。在 F# 中定义接口时,成员使用 abstract 关键字声明,后跟其类型签名:
type IPerson =
abstract FirstName : string
abstract LastName : string
abstract FullName : unit -> string
类实现的接口方法是通过接口定义而不是通过类的实例来访问的。因此,要调用接口方法,需要先对类实例应用 :>(向上转换)运算符:
type Person(firstName : string, lastName : string) =
interface IPerson with
member this.FirstName = firstName
member this.LastName = lastName
member this.FullName() = sprintf "%s %s" firstName lastName
let fred = Person("Fred", "Flintstone")
(fred :> IPerson).FullName()
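作为补充,也可以在绑定时一次性向上转换为接口类型,之后直接通过该绑定调用接口成员(一个演示用的小片段):

```fsharp
let person : IPerson = upcast Person("Barney", "Rubble")
printfn "%s" (person.FullName())
```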
对象表达式
接口代表可以在程序各部分之间共享的有用抽象。但为了实现某个接口而专门创建一个新类,有时过于繁琐。一种解决方案是使用对象表达式,它允许你通过匿名类即时实现接口。以下示例创建一个实现 IDisposable 接口的对象,将颜色应用到控制台,并在释放时恢复原始颜色:
let print color =
    let current = Console.ForegroundColor
    Console.ForegroundColor <- color
    { new IDisposable with
        member x.Dispose() =
            Console.ForegroundColor <- current }
using(print ConsoleColor.Red) (fun _ -> printf "Hello in red!!")
using(print ConsoleColor.Blue) (fun _ -> printf "Hello in blue!!")
类型转换
将原始值转换为对象类型的过程称为装箱,它通过box函数应用。这个函数将任何类型向上转换为.NET 的System.Object类型,在 F#中用名称 obj 缩写。
向上转换应用于类和接口层次结构,将表达式沿继承链向上转换(例如从派生类到基类)。语法是 expr :> type,转换的正确性在编译时检查。
向下转换用于沿类或接口层次结构“向下”的转换:例如,从接口到实现类。语法是 expr :?> type,运算符中的问号表示该操作可能失败并抛出 InvalidCastException。在应用向下转换之前,最好先安全地测试类型。这可以通过类型测试模式 :? 实现,它相当于 C# 中的 is 运算符:如果值匹配给定类型,则该分支匹配成功;否则进入下一个分支:
let testPersonType (o:obj) =
    match o with
    | :? IPerson -> printfn "this object is an IPerson"
    | _ -> printfn "this is not an IPerson"
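下面的小片段把装箱、类型测试与向下转换结合起来(沿用上面的 fred 与 testPersonType;boxed、answer 为演示用名称):

```fsharp
let boxed : obj = box 42
let answer = boxed :?> int   // 向下转换成功,值为 42
testPersonType (box fred)    // 输出 "this object is an IPerson"
testPersonType boxed         // 输出 "this is not an IPerson"
```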
度量单位
度量单位(Units of Measure,UoM)是 F# 类型系统的独特特性,它允许为数值字面量附加静态类型的单位元数据。这是一种操作表示特定度量单位(如米、秒、磅等)的数字的便捷方式。F# 类型系统会在编译期检查 UoM 是否被正确使用,从而消除一类运行时错误:例如,如果编译器期望 float<m>,而你提供了 float<m/sec>,就会报错。此外,还可以定义直接作用于带单位数值(而非普通数值字面量)的函数。下面的代码展示了如何定义米(m)和秒(sec)两个度量单位,并执行一个计算速度的操作:
[<Measure>]
type m
[<Measure>]
type sec
let distance = 25.0<m>
let time = 10.0<sec>
let speed = distance / time
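作为补充说明,speed 的推断类型是 float<m/sec>;下面的注释演示了单位不匹配会在编译期被拒绝,以及如何显式去掉单位(plainValue 为演示用名称):

```fsharp
// speed : float<m/sec>,值为 2.5
// 下面这行无法通过编译,因为单位不匹配:
// let wrong : float<m> = speed
let plainValue = float speed   // 用 float 转换显式去掉单位,得到普通 float
```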
事件模块 API 参考
事件(Event)模块提供了管理事件流的函数。表 B.2 列出了来自在线 MSDN 文档的 API 参考(mng.bz/a0hG)。
表 B.2 API 参考

| 函数 | 描述 |
|---|---|
| `add : ('T -> unit) -> Event<'Del,'T> -> unit` | 每次事件触发时运行给定函数。 |
| `choose : ('T -> 'U option) -> IEvent<'Del,'T> -> IEvent<'U>` | 返回一个新事件,该事件在原始事件的被选中消息上触发。选择函数将原始消息转换为可选的新消息。 |
| `filter : ('T -> bool) -> IEvent<'Del,'T> -> IEvent<'T>` | 返回一个监听原始事件的新事件,仅当事件参数通过给定谓词时才触发结果事件。 |
| `map : ('T -> 'U) -> IEvent<'Del,'T> -> IEvent<'U>` | 返回一个新事件,该事件传递经给定函数转换后的值。 |
| `merge : IEvent<'Del1,'T> -> IEvent<'Del2,'T> -> IEvent<'T>` | 当任一输入事件触发时,触发输出事件。 |
| `pairwise : IEvent<'Del,'T> -> IEvent<'T * 'T>` | 返回一个新事件,该事件在输入事件第二次及后续触发时触发;第 N 次触发时,将第 N-1 次和第 N 次的参数作为一对传递。 |
| `partition : ('T -> bool) -> IEvent<'Del,'T> -> IEvent<'T> * IEvent<'T>` | 返回一对监听原始事件的事件。当原始事件触发时,根据谓词的结果,触发这对中的第一个或第二个事件。 |
| `scan : ('U -> 'T -> 'U) -> 'U -> IEvent<'Del,'T> -> IEvent<'U>` | 返回一个新事件,它由将给定累积函数依次应用于输入事件触发值所得的中间结果组成。内部状态项记录状态参数的当前值;累积函数执行期间内部状态不会加锁,因此应确保输入 IEvent 不会被多个线程同时触发。 |
| `split : ('T -> Choice<'U1,'U2>) -> IEvent<'Del,'T> -> IEvent<'U1> * IEvent<'U2>` | 返回一对监听原始事件的新事件:若将函数应用于事件参数返回 Choice1Of2,则触发第一个事件;若返回 Choice2Of2,则触发第二个事件。 |
## **了解更多**
关于学习 F# 的更多信息,我推荐 Isaac Abraham 的 *《用 F# 编程:.NET 开发者指南》*(Manning,2018,[www.manning.com/books/get-programming-with-f-sharp](http://www.manning.com/books/get-programming-with-f-sharp))。
# C
F# 异步工作流与 .NET Task 之间的互操作性
尽管 C# 和 F# 提供的异步编程模型之间存在相似之处,它们的互操作并非易事。F# 程序更倾向于使用异步计算表达式(`Async`)而非 .NET `Task`。这两种类型相似,但存在语义差异,如第七章和第八章所示。例如,`Task` 在创建后立即开始执行,而 F# 的 `Async` 必须被显式启动。
你如何实现 F# 异步计算表达式与 .NET Task 之间的互操作性?可以使用 F# 函数,如 `Async.StartAsTask<T>` 和 `Async.AwaitTask<T>`,与返回或等待 `Task` 类型的 C# 库进行互操作。
反过来,C# 端没有内置的等效方法把 F# 的 `Async` 当作 `Task` 来等待;而能够在 C# 中消费 F# 内置的 `Async.Parallel` 这类计算会非常有用。在下面重复自第九章的列表中,F# 的 `downloadMediaAsyncParallel` 函数以异步并行的方式从 Azure Blob 存储下载图像。
列表 C.1 `Async` 并行函数,用于从 Azure Blob 存储下载图像
let downloadMediaAsyncParallel containerName = async {
    let storageAccount = CloudStorageAccount.Parse(azureConnection)
    let blobClient = storageAccount.CreateCloudBlobClient()
    let container = blobClient.GetContainerReference(containerName)
    let computations =
        container.ListBlobs()
        |> Seq.map(fun blobMedia -> async {
            let blobName = blobMedia.Uri.Segments.[blobMedia.Uri.Segments.Length - 1]
            let blockBlob = container.GetBlockBlobReference(blobName)
            use stream = new MemoryStream()
            do! blockBlob.DownloadToStreamAsync(stream)
            let image = System.Drawing.Bitmap.FromStream(stream)
            return image })
    return! Async.Parallel computations }   ①
`downloadMediaAsyncParallel` 的返回类型是 `Async<Image[]>`。如前所述,F# 的 `Async` 类型通常难以直接从 C# 代码中以任务(`async/await`)的方式执行。在下面的代码片段中,C# 代码借助扩展方法把 F# 的 `downloadMediaAsyncParallel` 函数作为 `Task` 运行:
var cts = new CancellationToken();
var images = await downloadMediaAsyncParallel("MyMedia").AsTask(cts);
在 `AsTask` 扩展方法的帮助下,代码互操作性变得毫不费力。互操作性解决方案是实现一个工具 F# 模块,该模块公开了一组扩展方法,这些方法可以被其他 .NET 语言消费。
列表 C.2 用于互操作 `Task` 和异步工作流的辅助扩展方法
open System
open System.Threading
open System.Threading.Tasks

module private AsyncInterop =
    let asTask(async: Async<'T>, token: CancellationToken option) =
        let tcs = TaskCompletionSource<'T>()   ①
        let token = defaultArg token Async.DefaultCancellationToken   ②
        Async.StartWithContinuations(async,   ③
            tcs.SetResult, tcs.SetException,
            tcs.SetException, token)
        tcs.Task   ④

    let asAsync(task: Task, token: CancellationToken option) =
        Async.FromContinuations(   ⑤
            fun (completed, caught, canceled) ->
                let token = defaultArg token Async.DefaultCancellationToken   ②
                task.ContinueWith(new Action<Task>(fun _ ->   ⑥
                    if task.IsFaulted then caught(task.Exception)
                    else if task.IsCanceled then
                        canceled(OperationCanceledException(token))
                    else completed()), token)   ⑦
                |> ignore)

    let asAsyncT(task: Task<'T>, token: CancellationToken option) =
        Async.FromContinuations(   ⑤
            fun (completed, caught, canceled) ->
                let token = defaultArg token Async.DefaultCancellationToken   ②
                task.ContinueWith(new Action<Task<'T>>(fun _ ->   ⑥
                    if task.IsFaulted then caught(task.Exception)
                    else if task.IsCanceled then
                        canceled(OperationCanceledException(token))
                    else completed(task.Result)), token)   ⑦
                |> ignore)
[<Extension>]
type AsyncInteropExtensions =
    [<Extension>]
    static member AsAsync (task: Task<'T>) =
        AsyncInterop.asAsyncT (task, None)   ⑧

    [<Extension>]
    static member AsAsync (task: Task<'T>, token: CancellationToken) =
        AsyncInterop.asAsyncT (task, Some token)   ⑧

    [<Extension>]
    static member AsTask (async: Async<'T>) =
        AsyncInterop.asTask (async, None)   ⑧

    [<Extension>]
    static member AsTask (async: Async<'T>, token: CancellationToken) =
        AsyncInterop.asTask (async, Some token)   ⑧
`AsyncInterop` 模块是私有的,但其中实现 F# `Async` 与 C# `Task` 互操作的核心函数通过 `AsyncInteropExtensions` 类型公开。`Extension` 属性将方法提升为扩展方法,使其可被其他 .NET 编程语言调用。
`asTask` 方法将 F# 的 `Async` 类型转换为 `Task` 类型,并通过 `Async.StartWithContinuations` 函数启动异步操作。内部,此函数使用 `TaskCompletionSource` 返回一个 `Task` 实例,该实例维护操作的状态。当操作完成时,返回的状态可以是取消、异常,或者如果成功,则是实际的结果。
函数 `asAsync` 的目的是将 `Task` 转换为 F# 的 `Async` 类型。此函数使用 `Async.FromContinuations` 创建异步计算,该计算提供回调,该回调将执行给定的成功、异常或取消的其中一个后续操作。
所有这些函数都接受一个可选的 `CancellationToken` 作为第二个参数,可用于停止当前操作。如果没有提供令牌,则默认使用当前上下文的 `Async.DefaultCancellationToken`。
这些函数提供了 .NET TPL 的基于任务的异步模式 (TAP) 与 F# 异步编程模型之间的互操作性。
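作为补充,下面是一个在 F# 端消费 `Task` 的最小示例(假设性示例,URL 与函数名均为演示所用),它使用本附录开头提到的内置 `Async.AwaitTask`,而不是列表 C.2 的扩展方法:

```fsharp
open System.Net.Http

// 把 BCL 返回的 Task<string> 转换为 F# Async,并在异步工作流中等待
let fetchLengthAsync (url: string) = async {
    use client = new HttpClient()
    let! html = Async.AwaitTask(client.GetStringAsync url)
    return html.Length }
```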


浙公网安备 33010602011771号