A. Verma: Deadline-based workload management for MapReduce environments

Key Tech: allocate a tailored number of map/reduce slots to each job when its profiling info is available

 

New idea:

1. Makespan theorem: two bounds on the makespan of a greedy assignment of n tasks to k slots

      lower bound = n.avg/k

      upper bound = (n-1).avg/k + max

      where

      avg : average duration of n tasks

      max: maximum duration of n tasks

* these bounds are particularly useful when max << n.avg/k, because the gap between them is then small relative to the makespan
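The two bounds can be checked numerically. The following is a minimal sketch, not code from the paper: the function names `greedy_makespan` and `makespan_bounds` and the sample durations are made up for illustration, and "greedy" is taken here to mean that each task, in order, goes to the slot that becomes free earliest.

```python
import heapq


def greedy_makespan(durations, k):
    """Simulate greedy assignment: each task goes to the earliest-free slot."""
    slots = [0.0] * k          # finish time of each of the k slots
    heapq.heapify(slots)
    for d in durations:
        earliest = heapq.heappop(slots)      # slot that frees up first
        heapq.heappush(slots, earliest + d)  # it now runs this task
    return max(slots)


def makespan_bounds(durations, k):
    """Return (lower, upper) = (n.avg/k, (n-1).avg/k + max)."""
    n = len(durations)
    avg = sum(durations) / n
    lower = n * avg / k
    upper = (n - 1) * avg / k + max(durations)
    return lower, upper


# Hypothetical task durations, chosen only for illustration.
durations = [3, 1, 4, 1, 5, 9, 2, 6]
k = 3
lower, upper = makespan_bounds(durations, k)
makespan = greedy_makespan(durations, k)
print(lower, makespan, upper)
```

In this small example max is not much smaller than n.avg/k, so the two bounds are far apart; with many short tasks per slot they tighten around the real makespan.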

2. allocate the minimal resource quota required to meet the deadline constraint, while leaving the remaining, spare resources to future arriving jobs.

a)

b)

c)

where N: number of tasks; S: number of slots; M: task durations

--Minimal combinations of map/reduce slots (Sm, Sr)

  Sm = min(Nm,Sm)

  Sr - solve equation c)

 

3. when a new job arrives, let it wait for slots to free up if it can still complete in time. Otherwise, calculate how many slots should cancel the tasks they are processing.

4. job profiling: build the profile from the job's past executions, or by executing it on a smaller data set

 

Experiment:

1. Twitter: data - an edge list of Twitter user IDs; computation - counts the number of asymmetric links in the dataset.

2. profiling info: using StatAssist (tool), identify the statistical distribution that best fits the plot, e.g., LogNormal, Gamma, Exponential...

3. Deadline: uniformly distributed in the interval [T, 2T], where T is the job's completion time given all the cluster resources

4. disable speculation
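The deadline setup in item 3 can be sketched as follows. This is only an illustration under stated assumptions: `sample_deadline` is a hypothetical helper, and the value of T below is made up rather than taken from the experiments.

```python
import random


def sample_deadline(T, rng=random):
    """Draw a deadline uniformly from [T, 2T].

    T is the job's completion time when it is given all cluster resources,
    so every sampled deadline is feasible on an otherwise idle cluster.
    """
    return rng.uniform(T, 2 * T)


rng = random.Random(42)   # fixed seed so the run is repeatable
T = 120.0                 # hypothetical completion time in seconds
deadlines = [sample_deadline(T, rng) for _ in range(5)]
print(deadlines)
```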

 

Related work

1. FLEX (J. Wolf)-

pros: a speedup function that produces the job execution time as a function of the allocated slots

cons: not clear for different sizes of input datasets

2. Flow shop model (B. Moseley)-

pros: formalizes scheduling as a generalized version of the classical two-stage flexible flow shop problem with identical machines

    minimizes the makespan of jobs offline and online

3. ParaTimer (K. Morton)-

  pros: estimates the progress of parallel queries expressed as Pig scripts that translate into DAGs of MapReduce jobs

  cons: assumes that map/reduce tasks of the same job have the same duration

 

posted on 2012-06-29 22:50  xiaoshier