A. Verma: Deadline-based workload management for MapReduce environments
Key Tech: allocate a tailored number of map/reduce slots to a job, when the job's profiling info is available
New idea:
1. Makespan theorem: two bounds on the makespan of greedy task assignment
lower bound = n.avg/k
upper bound = (n-1).avg/k + max
where
n: number of tasks; k: number of slots
avg: average duration of the n tasks
max: maximum duration of the n tasks
* these bounds are particularly useful (tight) when max << n.avg/k
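The two bounds can be checked against a small simulation of greedy assignment (a sketch; `makespan_bounds` and `greedy_makespan` are illustrative names, not from the paper):

```python
import random

def makespan_bounds(durations, k):
    """Lower/upper bounds on the makespan of greedily assigning
    n tasks to k slots: n*avg/k and (n-1)*avg/k + max."""
    n = len(durations)
    avg = sum(durations) / n
    lower = n * avg / k
    upper = (n - 1) * avg / k + max(durations)
    return lower, upper

def greedy_makespan(durations, k):
    """Greedy assignment: each task goes to the earliest-free slot;
    the makespan is the latest slot finish time."""
    slots = [0.0] * k
    for d in durations:
        i = min(range(k), key=lambda j: slots[j])
        slots[i] += d
    return max(slots)

random.seed(0)
tasks = [random.uniform(1, 10) for _ in range(50)]
lo, hi = makespan_bounds(tasks, 4)
ms = greedy_makespan(tasks, 4)
assert lo <= ms <= hi  # the greedy makespan always lies between the bounds
```

Note that the lower bound holds for any schedule (it is total work divided by k), while the upper bound is specific to greedy (list) scheduling.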
2. allocate the minimal resource quota required for meeting the deadline constraint, while leaving the remaining spare resources to future arriving jobs.
a) lower bound on job completion time: T_low = Nm.Mavg/Sm + Nr.Ravg/Sr
b) upper bound: T_up = (Nm-1).Mavg/Sm + Mmax + (Nr-1).Ravg/Sr + Rmax
c) deadline equation: given deadline T and a choice of Sm, solve T = Nm.Mavg/Sm + Nr.Ravg/Sr for Sr
where N - num of tasks; S - num of slots; M/R - map/reduce task durations (avg/max); subscripts m, r denote the map and reduce phases
--Minimal combination of map/reduce slots (Sm, Sr)
Sm = min(Nm, available map slots)
Sr = solve equation c) for the chosen Sm
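A sketch of the minimal-quota computation, assuming the lower-bound completion-time model T = Nm*Mavg/Sm + Nr*Ravg/Sr (the paper solves its deadline equation analogously; `minimal_slots` and its parameters are illustrative names, with durations from the job profile):

```python
import math

def minimal_slots(Nm, Mavg, Nr, Ravg, T, max_map_slots):
    """Enumerate feasible map-slot allocations Sm and, for each,
    solve T = Nm*Mavg/Sm + Nr*Ravg/Sr for the smallest integer Sr;
    return the (Sm, Sr) pair that uses the fewest total slots."""
    best = None
    for Sm in range(1, min(Nm, max_map_slots) + 1):
        map_time = Nm * Mavg / Sm
        if map_time >= T:
            continue  # map phase alone already misses the deadline
        Sr = math.ceil(Nr * Ravg / (T - map_time))
        if best is None or Sm + Sr < best[0] + best[1]:
            best = (Sm, Sr)
    return best

# 20 map tasks of avg 2s, 10 reduce tasks of avg 3s, deadline 20s
Sm, Sr = minimal_slots(20, 2, 10, 3, 20, 10)
```

Any slots beyond this minimal pair stay free for jobs that arrive later, which is exactly the point of allocating the minimum.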
3. decide whether a new job should wait: let it wait if it can still be completed in time; otherwise, calculate how many slots should cancel processing their current tasks.
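A simplified sketch of this admission decision (a hypothetical helper; the paper additionally reasons about when running tasks will free their slots before cancelling anything):

```python
def admit(job_min_slots, free_slots):
    """Compare a job's minimal (Sm, Sr) requirement against the
    currently free slots: run it if it fits, otherwise report the
    per-phase shortfall that must be reclaimed from running tasks."""
    need_m, need_r = job_min_slots
    free_m, free_r = free_slots
    short_m = max(0, need_m - free_m)
    short_r = max(0, need_r - free_r)
    if short_m == 0 and short_r == 0:
        return "run now", (0, 0)
    return "reclaim", (short_m, short_r)

decision, shortfall = admit((4, 3), (2, 3))  # needs 2 more map slots
```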
4. job profiling: use the job's past executions, or execute it on a smaller data set
Experiment:
1. Twitter: data - an edge list of Twitter user ids; computation - counts the number of asymmetric links in the dataset.
2. profiling info: using StatAssist (a tool), identify the statistical distribution that best fits the plot, e.g., LogNormal, Gamma, Exponential, ...
3. Deadline/completion time uniformly distributed in the interval [T, 2T], where T is the completion time given all the cluster resources
4. disable speculation
Related work
1. FLEX (J. Wolf) -
pros: a speedup function that gives the job execution time as a function of the allocated slots
cons: how it handles different input dataset sizes is not clear
2. Flow shop model (B. Moseley)-
pros: formalizes scheduling as a generalized version of the classical two-stage flexible flow shop problem with identical machines;
minimizes the makespan of jobs, both offline and online
3. ParaTimer (K. Morton)-
pros: estimates the progress of parallel queries expressed as Pig scripts that translate into DAGs of MapReduce jobs
cons: assumes map/reduce tasks of the same job have the same duration