0924

{done}GTD190023:【翻译】Dynamic Programming: First Principles

http://www.flawlessrhetoric.com/Dynamic-Programming-First-Principles

 

Dynamic Programming: First Principles

AlgorithmOct 24, 2017

Dynamic Programming

Introduction

Dynamic Programming is a mathematical tool for finding the optimal algorithm of a problem, often employed in the realms of computer science. 

During the autumn of 1950, Richard Bellman, a tenured professor from Stanford University began working for RAND (Research and Development) Corp, whom suggested he begin work on multistage decision processes. During the 1950s, Charles Erwin Wilson was the Secretary of Defence for the United States, and as RAND was employed by the U.S Air Force, Wilson was virtually their boss. Wilson, originally an electrical engineer, ironically had a hatred for the word ‘research’, and so Bellman had to come up with an alternative name to hide the fact that he was purely researching mathematics. To convey that Bellman’s research was about planning and decision making, he used the word ‘programming’, and prefixed it with ‘dynamic’, to convey that it was multistage and had time variances [1]. Bellman, due to the aforementioned recommendations by RAND, developed an approach that would break down a large or complex problem into a series of smaller problems. Through solving these smaller problems, the optimal solution to the overall problem is discovered. 

The aim of Dynamic Programming is the use of past values for the procurement of a solution to a problem. This is done to avoid unnecessary computation of the same calculation that problems may contain, optimising, enabling computer algorithms to run as efficiently as possible. It aims to find the optimal substructure of a problem (if it exists), and to eliminate any occurrences of overlapping sub problems. 

Dynamic Programming works in contrast to Linear Programming. Whilst it is concerned with functional relations and multi-stage decision processes, Linear Programming is a method used to achieve the best result based upon linear relationships [2].

介绍

动态编程是一种数学工具,用于找到常用于计算机科学领域的问题的最优算法。

1950年秋季,斯坦福大学终身教授理查德·贝尔曼(Richard Bellman)开始为兰德(研发)公司工作,他建议他开始多阶段决策过程工作。五十年代期间,查尔斯·埃尔温·威尔逊(Charles Erwin Wilson)是美国的国防部长,当兰德被美国空军聘用时,威尔逊实际上是他们的老板。威尔逊,原来是一名电气工程师,讽刺地对“研究”一词抱有仇恨,所以贝尔曼不得不提出一个替代名字,以隐藏他纯粹研究数学的事实。为了表明贝尔曼的研究是关于规划和决策的,他使用了“编程”一词,并以“动态”为前缀,表达了它是多层次的,有时间差异[1]。贝尔曼,由于兰德提出的上述建议,开发出一种将大型或复杂问题分解为一系列较小问题的方法。通过解决这些较小的问题,发现了整体问题的最优解。

动态规划的目的是使用过去的价值来采购解决问题的方法。这样做是为了避免不必要的计算问题可能包含的相同计算,优化,使计算机算法尽可能高效地运行。它旨在找到问题的最佳子结构(如果存在),并消除任何重叠的子问题的发生。

动态编程与线性规划相反。虽然它涉及功能关系和多阶段决策过程,但线性规划是一种基于线性关系实现最佳结果的方法[2]。

Optimal Substructures

When Dynamic Programming is applied, breaking up a problem and reconstructing it, we often find the optimal substructure of a problem. A solution is said to have an optimal substructure if it can be defined based upon optimal solutions of its sub problems. 

For example, the shortest path problem has the optimal substructure property. If we have three sequentially placed locations, X, Y, and Z, and we wish to find the optimal distance of X to Z. The solution can be defined as the sum of the optimal solutions between X to Y, and Y to Z, therefore X to Z can be defined based upon the optimal solutions of it’s sub problems. 

最优子结构

当应用动态规划时,分解问题并进行重构,我们经常会发现问题的最优子结构。 如果可以根据其子问题的最佳解决方案来定义一个解决方案,则具有最佳子结构。

例如,最短路径问题具有最佳子结构属性。 如果我们有三个顺序放置的位置X,Y和Z,并且我们希望找到X到Z的最佳距离。解可以定义为X到Y和Y到Z之间的最优解的和, 因此X到Z可以根据其次问题的最优解决定。

Overlapping Sub-Problems

A problem or function contains overlapping problems if it can be broken down into a series of smaller problems, and some are duplicates. The need for the problem to compute the same calculation many times over can cause large increases in the running time of the problem. Dynamic Programming aims to remove the need to compute the same calculations multiple times. 

重叠子问题

问题或功能如果可以分解成一系列较小的问题,则会包含重叠的问题,有些则是重复的。 问题需要多次计算相同的计算可能会导致问题的运行时间的大幅度增加。 动态编程旨在消除多次计算相同计算的需求。

Memoisation

Memoisation , also known as caching, is a technique used in Computer Science, which stores the results of functions and calculations, and uses them if the calculations are needed again. Due to the fact that memory is finite, memoisation is not feasible in all situations, and thus, needs to be used appropriately and sparingly, especially when it comes to large applications. Tail recursion [3], a variant of traditional recursion implements memoisation, which uses memoisation very economically. 

缓存

Memoisation也称为缓存,是计算机科学中使用的一种技术,它存储函数和计算结果,并在需要再次计算的情况下使用它们。 由于记忆是有限的,记忆在所有情况下都是不可行的,因此需要适当和谨慎地使用,特别是在涉及大量应用时。 尾递归[3],传统递归的变体实现了记忆,非常经济地使用记忆。

Fibonacci: An elementary use of Dynamic Programming

One of the simplest problems that can be optimised with a Dynamic Programming approach is the Fibonacci number. The Fibonacci number (also known as the Fibonacci sequence) is a series of numbers where the leading number is the sum of the previous two numbers (modern interpretation) [4]. 

0,1,1,2,3,5,8,13,21,34…

斐波纳契:动态规划的基本使用

使用动态编程方法可以优化的最简单的问题之一就是斐波纳契数字。 斐波纳契数(也称为斐波那契序列)是一系列数字,其中前导数是前两个数字(现代解释)的总和[4]。

It is named after Italian mathematician Leonardo of Pisa (commonly known as Fibonacci), whom described the sequence in his book Liber Abaci during 1202 AD. It had previously been described in Indian Mathematics, but had not yet encountered the western world [5]. 

Fibonacci can be described as follows; 

它以意大利数学家莱昂纳多比萨(俗称斐波纳契)命名,他在公元1202年在他的书“自由阿巴奇”中描述了序列。 以前曾在印度数学中描述过,但尚未遇到西方世界[5]。

斐波那契可以描述如下;

F0 = 0
F1 = 1
Fn = F(n-1) + F(n-2),

For example; 

F3 = F(2)  + F(1)
F3 = (F(1) + F(0))  + 1
F3 = (1 + 0) + 1
F3 = 2

This poses a problem. Computing a Fibonacci number greater than two will force overlapping sub problems.fibonacci of four

The figure above represents the structure of the Fibonacci sequence of four. The Fibonacci number of four will compute the Fibonacci of two twice. The number of overlapping sub problems will grow exponentially as the Fibonacci number is increased, and thus the running time to compute it. 

We can use Dynamic Programming (and memoisation) to mitigate these unneeded computations, and define the Fibonacci Number of n as follows; 

这是一个问题。 计算大于2的斐波纳契数将迫使重叠的子问题。斐波纳契四

上图表示四个斐波纳契序列的结构。 斐波那契数四人将计算斐波纳契两次两次。 随着斐波纳契数量的增加,重叠子问题的数量将呈指数增长,从而计算运行时间。

我们可以使用动态编程(和记忆)来减轻这些不必要的计算,并且如下定义n的斐波那契数:

Fibonacci (n, i = 0, a = 0, b = 1 )
{
      if ( i < n )
        Fibonacci( n, i+1, b, a+b )
      return b;
}

This approach first checks if we have reached the desired number, if not, it computes the sequence. It will never call a number that has already been calculated, and will instead use memoisation and pass the previous (to the current number we are at) two values to the function to calculate them.

这种方法首先检查我们是否已经达到所需的数字,否则,它将计算序列。 它将永远不会调用已经计算的数字,而是使用memoization并将前一个(到当前数字)传递给函数来计算它们的两个值。

Economic Optimisation: A Canonical example of dynamic programming.

Various economic problems, especially around stage based decision making can be solved through the use of dynamic programming. 

Michael Tick describes the following problem. A corporation owning a number of plants has funds to invest, with each of the plants providing a number of proposals of what returns it can provide depending on how many funds it is allocated. This problem can be solved quite nicely with dynamic programming [6]. 

The total number of solutions is the number of proposals in each plant multiplied, in Ticks case there are 24 possible solutions (3 x 4 x 2). Enumerating of all solutions is infeasible due to the fact that many solutions may not be possible with the funds available, and are not worth calculation, and for problems containing lots of proposals, it may not be computationally feasible. Enumerating over all solutions is also not efficient, we do not look at previous solutions to eliminate possible, inferior (less generated revenue) solutions. 

This is where dynamic programming can come into use. We can employ it to develop a solution that will take into consideration the available funds as well as previously tested solutions. 

经济优化:动态规划的典型例子

各种经济问题,特别是围绕基于阶段的决策可以通过使用动态规划来解决。

Michael Tick描述了以下问题。一家拥有多家工厂的公司有资金进行投资,每个工厂都提供了一些可以根据其分配的资金提供哪些回报的方案。这个问题可以很好地解决动态规划问题[6]。

解决方案的总数是每个工厂中的提案数量相乘,在Ticks案例中有24个可能的解决方案(3 x 4 x 2)。列举所有解决方案是不可行的,因为许多解决方案可能无法使用可用的资金,不值得计算,对于包含大量提案的问题,可能无法在计算上可行。列举所有解决方案也不是有效的,我们不考虑以前的解决方案,以消除可能的,较差的(较少的收入)解决方案。

这就是动态规划可以使用的地方。我们可以利用它来开发一个考虑到可用资金以及以前测试的解决方案的解决方案。

Solution Design

A solution was developed using dynamic programming to approach this problem, written in C++.

解决方案设计

使用动态编程开发了一种解决方案来解决这个问题,用C ++编写。

Proposals

Proposals contain three integers, an index (their position in the plant), as well as their cost, and revenue. These are only set in the constructor and can be accessed but not changed.

建议

提案包含三个整数,一个指数(它们在工厂中的位置)以及它们的成本和收入。 这些只能在构造函数中设置,可以访问但不能更改。

Plant

Each plant contains a vector of proposals, which it creates when the constructor is invoked. It can return a reference to the vector of it’s proposals, as well as the number of proposals it contains.

每个工厂都包含一个建议的向量,它在调用构造函数时创建。 它可以返回对其提案的向量的引用,以及它包含的提议数量。

Stage

This class contains the algorithm. Each state maintains a list of proposals (from the given plant), as well as a map, it’s key the cost, and value the revenue it generates. The stage takes a previous stage (or null if it is the first). 

If it is the first stage, then revenue is just the possible proposals that cost less than the funds available. If we are on any of the stages after the initial, we begin iterating through each of the proposals. If the cost of that proposal is less than or equal to the available funds, then we assign the revenue to a temporary variable, and if any funds remain, we access the revenue from the previous stage for the remaining funds, adding it to the temporary variable. It is then checked against the previous stage’s revenue for that cost. If it is greater, then it is the new revenue for that cost, if not, we get use the previous stage’s revenue for that cost.

阶段

该类包含算法。 每个州保留一份提案清单(来自给定的工厂)以及一张地图,它是成本的关键,并对其产生的收入进行估价。 舞台需要前一个阶段(如果是第一个舞台,则为null)。

如果是第一阶段,那么收入只是可能的成本低于可用资金的建议。 如果我们在初始阶段之后的任何一个阶段,我们开始迭代每个提案。 如果该提案的成本小于或等于可用资金,则我们将收入分配给临时变量,如果任何资金仍然存在,则我们从剩余资金的前一阶段获得收入,将其添加到临时 变量。 然后根据该成本检查前一阶段的收入。 如果是更大的话,那么这个成本是新的收入,如果不是,我们可以用这个代价来使用前一阶段的收入。

UML

This solution uses Tail Recursion. Each stage does its necessary calculations, then passes its information to the next stage which is the natural solution to the problem. Each stage will asses it’s plant’s proposals compared to the available funds, and then pass its information to the next stage. Whilst the approach finds the best solution, it does not return the proposals used, only the possible revenue and the cost it will require. The map of cost to revenue, implements the memoisation aspect of the algorithm. 

此解决方案使用尾递归。 每个阶段都进行必要的计算,然后将其信息传递给下一个阶段,这是解决问题的自然解决方案。 与现有资金相比,每个阶段都会对工厂的建议进行评估,然后将其信息传递到下一阶段。 虽然该方法找到最佳解决方案,但它不会返回所使用的提案,只能获得可能的收入和成本。 成本收益图,实现了算法的记忆方面。

if first stage
  foreach proposal in plant
    if proposal cost <= available funds
      if Revenue[proposal cost] < revenue of new proposal
        Revenue[proposal cost] = revenue of new proposal
else
  foreach proposal in plant
    if proposal cost <= available funds
      revenue = proposal revenue
      remaining funds = funds - proposal cost
      if remaining funds > 0
        (revenue-1,cost-1) = max Stage-1[remaining funds]
      revenue += revenue-1
      cost = propsal cost + cost-1
      if Revenue[cost] < revenue
        Revenue[cost] = revenue



Whilst solvable, Tick makes several assumptions. Firstly, his solution assumes that each plant will have a proposal enacted upon, and secondly, that we use all the funds with the remainder not included in the revenue created in the end. 

虽然可以解决,Tick做了几个假设。 首先,他的解决方案假设每个工厂都会有一个提出的建议,其次,我们使用所有的资金,剩下的资金不包括在最终创造的收入中。

Conclusion

Dynamic Programming is an essential tool for solving multistage problems. It helps find the optimal substructure of a problem where possible, and remove any overlapping sub problems. It is applicable for smaller problems, such as Fibonacci, and larger problems such as economic optimisations. This report is by no means an extensive or advanced approach to dynamic programming, and aims only to introduce the concept of dynamic programming, and explain how it works on a rudimentary example.

结论

动态编程是解决多级问题的重要工具。 它有助于在可能的情况下找到问题的最佳子结构,并删除任何重叠的子问题。 适用于斐波那契等较小问题,经济优化等较大问题。 本报告绝不是对动态规划的广泛或先进的方法,目的只是介绍动态规划的概念,并以简单的例子解释它的工作原理。

References

  • [1] R. Bellman, Eye of the hurricane. Singapore: World Scientific, 1984
  • [2] S. Dreyfus, “A Comparison of Linear Programming and Dynamic Programming”, RAND, 1956.
  • [3] D. Friedman and M. Want, Essentials of Programming Languages. MIT Press, 1999.
  • [4] “Fibonacci number”, En.wikipedia.org, 2017. [Online]. Available: https://en.wikipedia.org/wiki/Fibonacci_number. [Accessed: 22- Oct- 2017].
  • [5] L. Fibonacci and L. Sigler, Fibonacci’s Liber abaci. New York: Springer, 2003.
  • [6] M. Tick, “A Tutorial on Dynamic Programming”, Mat.gsia.cmu.edu, 2017. [Online]. Available: http://mat.gsia.cmu.edu/classes/dynamic/dynamic.html. [Accessed: 30- Oct- 2015]. 

The source code can be found here.

posted on 2017-10-25 11:34  0924  阅读(333)  评论(0编辑  收藏  举报

导航