2010-12-10 113 views
16

Quartz retry on failure

Let's say I have a trigger configured like this:

<bean id="updateInsBBTrigger"   
    class="org.springframework.scheduling.quartz.CronTriggerBean"> 
    <property name="jobDetail" ref="updateInsBBJobDetail"/> 
    <!-- run every morning at 5 AM --> 
    <property name="cronExpression" value="0 0 5 * * ?"/> 
</bean> 

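The trigger has to connect to another application, and if there is any problem (such as a connection failure) it should retry the task every 10 minutes, up to five times or until it succeeds. Is there any way to configure the trigger to work like this?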

Answers

16

Source: Automatically Retry Failed Jobs in Quartz

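If you want a job that keeps trying over and over until it succeeds, all you have to do is throw a JobExecutionException with a flag telling the scheduler to fire it again when it fails. The following code shows how: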

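import org.quartz.Job;
import org.quartz.JobExecutionContext;
import org.quartz.JobExecutionException;

public class MyJob implements Job {

    public MyJob() {
    }

    public void execute(JobExecutionContext context) throws JobExecutionException {
        try {
            // connect to other application etc
        } catch (Exception e) {
            try {
                // sleep for 10 minutes; note that this blocks a Quartz worker thread
                Thread.sleep(600000);
            } catch (InterruptedException ie) {
                // restore the interrupt flag and fall through to the refire
                Thread.currentThread().interrupt();
            }

            JobExecutionException e2 = new JobExecutionException(e);
            // fire it again
            e2.setRefireImmediately(true);
            throw e2;
        }
    }
}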

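If you want to retry a certain number of times, it gets a bit more complicated. You have to use a StatefulJob and keep a retryCounter in its JobDataMap, which you increment whenever the job fails. Once the counter exceeds the maximum number of retries, you can disable the job if you wish.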

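import org.quartz.JobDataMap;
import org.quartz.JobExecutionContext;
import org.quartz.JobExecutionException;
import org.quartz.StatefulJob;

public class MyJob implements StatefulJob {

    public MyJob() {
    }

    public void execute(JobExecutionContext context) throws JobExecutionException {
        JobDataMap dataMap = context.getJobDetail().getJobDataMap();
        // the key is absent on the very first run, so default to 0
        int count = dataMap.containsKey("count") ? dataMap.getIntValue("count") : 0;

        // allow 5 retries
        if (count >= 5) {
            JobExecutionException e = new JobExecutionException("Retries exceeded");
            // make sure it doesn't run again
            e.setUnscheduleAllTriggers(true);
            throw e;
        }

        try {
            // connect to other application etc

            // reset counter back to 0
            dataMap.putAsString("count", 0);
        } catch (Exception e) {
            count++;
            dataMap.putAsString("count", count);
            JobExecutionException e2 = new JobExecutionException(e);

            try {
                // sleep for 10 minutes; note that this blocks a Quartz worker thread
                Thread.sleep(600000);
            } catch (InterruptedException ie) {
                // restore the interrupt flag and fall through to the refire
                Thread.currentThread().interrupt();
            }

            // fire it again
            e2.setRefireImmediately(true);
            throw e2;
        }
    }
}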
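The counter bookkeeping in the job above can be exercised without a scheduler. The following is a minimal sketch, assuming a plain `Map<String, Object>` as a stand-in for Quartz's JobDataMap; the `RetryCounter` class and its method names are hypothetical and not part of Quartz. It treats a missing key (first run) as zero and accepts a value stored either as a number or as a String.

```java
import java.util.Map;

// Hypothetical helper, not part of Quartz: a plain Map stands in for JobDataMap.
final class RetryCounter {
    static final String KEY = "count";

    private RetryCounter() {
    }

    // Reads the counter; a missing key (first run) counts as zero failures.
    static int read(Map<String, Object> data) {
        Object value = data.get(KEY);
        if (value == null) {
            return 0;
        }
        if (value instanceof Number) {
            return ((Number) value).intValue();
        }
        return Integer.parseInt(value.toString());
    }

    // True once the number of recorded failures has reached the limit.
    static boolean exhausted(Map<String, Object> data, int maxRetries) {
        return read(data) >= maxRetries;
    }

    // Records one more failure and returns the new count.
    static int recordFailure(Map<String, Object> data) {
        int next = read(data) + 1;
        data.put(KEY, next);
        return next;
    }

    // Clears the counter after a successful run.
    static void reset(Map<String, Object> data) {
        data.put(KEY, 0);
    }
}
```

In a job, `exhausted` would be checked before doing the work, `recordFailure` called in the catch block, and `reset` called after a successful run.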
+0

Thanks. This is what I was looking for. – Averroes 2010-12-13 15:42:14

+43

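-1, I don't recommend this approach - it blocks a Quartz worker thread for 10 minutes. The right way is to leverage existing Quartz functionality - tell it somehow to re-run the same job after 10 minutes - after all, that is what it's for. If we are just going to run some code and sleep, there is no point in using Quartz in the first place. – 2012-03-05 15:14:17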

+1

A supplementary note for Quartz 2.0 (at least for .NET): StatefulJob is replaced by 'PersistJobDataAfterExecutionAttribute' http://quartznet.sourceforge.net/apidoc/2.0/html/html/babe3560-218c-38de-031a-7fe1fdd569d2.htm – ossek 2014-03-24 19:08:45

7

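I would suggest, for more flexibility and configurability, storing two offsets in your DB: repeatOffset, which tells you how long to wait before the job is retried, and trialPeriodOffset, which holds the time window during which the job may be rescheduled. You can then retrieve these two parameters like this (I assume you are using Spring):

String repeatOffset = yourDBUtilsDao.getConfigParameter(..);
String trialPeriodOffset = yourDBUtilsDao.getConfigParameter(..);

Then, instead of a counter, the job needs to remember its initial attempt: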

Long initialAttempt = null; 
initialAttempt = (Long) existingJobDetail.getJobDataMap().get("firstAttempt"); 

and perform checks like the following:

long allowedThreshold = initialAttempt + Long.parseLong(trialPeriodOffset);
if (System.currentTimeMillis() > allowedThreshold) {
    // We've tried enough, time to give up
    log.warn("The job is not going to be rescheduled since it has reached its trial period threshold");
    sched.deleteJob(jobName, jobGroup);
    return YourResultEnumHere.HAS_REACHED_THE_RESCHEDULING_LIMIT;
}

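It would be a good idea to create an enum for the result of the attempt, like the one above, that is returned to the core workflow of your application.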

Then construct the rescheduling time:

Date startTime = null; 
startTime = new Date(System.currentTimeMillis() + Long.parseLong(repeatOffset)); 

String triggerName = "Trigger_" + jobName; 
String triggerGroup = "Trigger_" + jobGroup; 

Trigger retrievedTrigger = sched.getTrigger(triggerName, triggerGroup); 
if (!(retrievedTrigger instanceof SimpleTrigger)) {
    log.error("While rescheduling the Quartz Job retrieved was not of SimpleTrigger type as expected");
    return YourResultEnumHere.ERROR;
}

((SimpleTrigger) retrievedTrigger).setStartTime(startTime);
sched.rescheduleJob(triggerName, triggerGroup, retrievedTrigger);
return YourResultEnumHere.RESCHEDULED;
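
The two decisions above (give up once the trial window has passed, otherwise compute the next start time) can be sketched as one pure function that is easy to test without a scheduler. This is only an illustration under assumptions: the `RetryWindow` class, its `Decision` type, and the choice of milliseconds for both offsets are hypothetical; the answer itself only says that the two offsets come from configuration.

```java
import java.util.Date;

// Hypothetical helper illustrating the time-window check; not part of Quartz.
final class RetryWindow {

    enum Outcome { RESCHEDULED, HAS_REACHED_THE_RESCHEDULING_LIMIT }

    static final class Decision {
        final Outcome outcome;
        final Date nextStart; // null when the job should not run again

        Decision(Outcome outcome, Date nextStart) {
            this.outcome = outcome;
            this.nextStart = nextStart;
        }
    }

    private RetryWindow() {
    }

    // All arguments are in milliseconds (an assumption made for this sketch).
    static Decision decide(long firstAttemptMillis, long nowMillis,
                           long trialPeriodOffsetMillis, long repeatOffsetMillis) {
        long allowedThreshold = firstAttemptMillis + trialPeriodOffsetMillis;
        if (nowMillis > allowedThreshold) {
            // The trial window is over: stop rescheduling.
            return new Decision(Outcome.HAS_REACHED_THE_RESCHEDULING_LIMIT, null);
        }
        // Still inside the window: retry after the configured delay.
        return new Decision(Outcome.RESCHEDULED, new Date(nowMillis + repeatOffsetMillis));
    }
}
```

With this shape, the caller would pass `System.currentTimeMillis()` as `nowMillis` and hand `nextStart` to the trigger's `setStartTime` when the outcome is `RESCHEDULED`. A `repeatOffsetMillis` of 600000 would give the ten-minute spacing asked for in the question.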
2

I would suggest an implementation like this one to recover a job after a failure:

final JobDataMap jobDataMap = jobCtx.getJobDetail().getJobDataMap(); 
// the key doesn't exist before the first retry
final int retries = jobDataMap.containsKey(COUNT_MAP_KEY) ? jobDataMap.getIntValue(COUNT_MAP_KEY) : 0; 

// to stop after awhile 
if (retries < MAX_RETRIES) { 
    log.warn("Retry job " + jobCtx.getJobDetail()); 

    // increment the number of retries 
    jobDataMap.put(COUNT_MAP_KEY, retries + 1); 

    final JobDetail job = jobCtx 
     .getJobDetail() 
     .getJobBuilder() 
     // to track the number of retries 
     .withIdentity(jobCtx.getJobDetail().getKey().getName() + " - " + retries, "FailingJobsGroup") 
     .usingJobData(jobDataMap) 
     .build(); 

    final OperableTrigger trigger = (OperableTrigger) TriggerBuilder 
     .newTrigger() 
     .forJob(job) 
     // trying to reduce back pressure, you can use another algorithm 
     .startAt(new Date(jobCtx.getFireTime().getTime() + (retries*100))) 
     .build(); 

    try {
        // schedule another job to avoid blocking threads
        jobCtx.getScheduler().scheduleJob(job, trigger);
    } catch (SchedulerException e) {
        log.error("Error creating job");
        throw new JobExecutionException(e);
    }
} 

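Why?

  1. It will not block Quartz workers.
  2. It will avoid back pressure. With setRefireImmediately the job is fired again right away, which can cause back-pressure problems.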
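
The `retries*100` delay in the code above grows linearly, and the comment there notes that another algorithm can be used. One common alternative is capped exponential backoff. The sketch below is an assumption, not the method from this answer, and `Backoff`, `delayMillis`, `nextStart`, and their parameters are hypothetical names.

```java
import java.util.Date;

// Hypothetical helper: capped exponential backoff for computing a retry start time.
final class Backoff {

    private Backoff() {
    }

    // Returns baseMillis * 2^retries, capped at maxMillis.
    static long delayMillis(int retries, long baseMillis, long maxMillis) {
        if (retries < 0 || baseMillis <= 0 || maxMillis <= 0) {
            throw new IllegalArgumentException("retries must be >= 0 and delays must be > 0");
        }
        // Compare against the shifted cap first so the multiplication cannot overflow.
        if (retries >= 62 || baseMillis > (maxMillis >> retries)) {
            return maxMillis;
        }
        return baseMillis << retries;
    }

    // Start time for the next attempt, measured from the fire time of the failed run.
    static Date nextStart(Date fireTime, int retries, long baseMillis, long maxMillis) {
        return new Date(fireTime.getTime() + delayMillis(retries, baseMillis, maxMillis));
    }
}
```

In the trigger above, this could replace the linear expression, for example `.startAt(Backoff.nextStart(jobCtx.getFireTime(), retries, 100L, 600000L))`, where the base and cap values are illustrative only.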