Google Cloud Dataflow - Error during the pipeline execution: exceeds allowed maximum skew
I started the pipeline on Friday and it has been running over the weekend. On Sunday there were lots of occurrences of the error below:
14 Jun. 2015 14:40:51 (6f550257718f53da): Exception: java.lang.IllegalArgumentException: Timestamp 2015-06-14T12:40:48.731Z exceeds allowed maximum skew.
    at com.google.api.client.repackaged.com.google.common.base.Preconditions.checkArgument(Preconditions.java:119)
    at com.google.api.client.util.Preconditions.checkArgument(Preconditions.java:69)
    at com.google.cloud.dataflow.sdk.util.DoFnRunner$DoFnProcessContext.checkTimestamp(DoFnRunner.java:502)
    at com.google.cloud.dataflow.sdk.util.DoFnRunner$DoFnProcessContext.outputWithTimestamp(DoFnRunner.java:465)
    at com.xtg.hub.dataflow.stats.common.Util$ExtractTimestampFn.processElement(Util.java:62)
In the pipeline there is a 5-minute FixedWindow, and before applying the FixedWindow I'm assigning a timestamp to each element of the PCollection:
c.outputWithTimestamp(c.element(), Instant.now());
What am I doing wrong?
Thanks in advance.
It's not currently supported in Dataflow to call outputWithTimestamp with a timestamp less than the timestamp of the input element. It's possible that, due to clock skew between workers, setting the timestamp to Instant.now() is trying to move the timestamp backwards.
Edit: for example, you could do:
Instant now = Instant.now();
c.outputWithTimestamp(c.element(), now.isAfter(c.timestamp()) ? now : c.timestamp());
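The same clamping rule can be tried outside a pipeline. Below is a minimal sketch, not SDK code: TimestampClamp and clamp are hypothetical names, java.time.Instant stands in for the Joda-Time Instant that the Dataflow SDK uses, and the input timestamp in main is made up for illustration.

```java
import java.time.Instant;

// Hypothetical helper for illustration only; not part of the Dataflow SDK.
final class TimestampClamp {
    private TimestampClamp() {}

    // Returns the later of the two instants, so an output timestamp
    // derived from it never falls before the input element's timestamp.
    static Instant clamp(Instant candidate, Instant inputTimestamp) {
        return candidate.isAfter(inputTimestamp) ? candidate : inputTimestamp;
    }

    public static void main(String[] args) {
        // Assumed input element timestamp (made up for this example).
        Instant inputTimestamp = Instant.parse("2015-06-14T12:40:51Z");
        // A worker clock reading slightly behind the input timestamp,
        // using the value from the error message above.
        Instant skewedNow = Instant.parse("2015-06-14T12:40:48.731Z");

        // The clamped value is what would be passed to outputWithTimestamp.
        System.out.println(clamp(skewedNow, inputTimestamp));
    }
}
```

When the worker's clock is behind, the element simply keeps its input timestamp; when the clock is ahead, Instant.now() is used as before.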