Is there a limit on the number of side outputs in Google Cloud Dataflow? -


we have cloud dataflow job takes in bigquery table, transforms , writes each record out different table depending on month/year in timestamp record. when run our job on table 12 months of data there should 12 output tables. first month main output , other 11 months side outputs.

we have found job fail when run on 10 or more months(9 side outputs).

is limit on cloud dataflow or bug?

i noticed in execution graph when running more 8 side outputs of outputs said "running" didn't seem writing records.

here of our job ids:

2015-06-14_23_58_06-14457541029573485807 (8 side outputs - passed)

2015-06-14_23_48_43-15277609445992188388 (9 side outputs - failed)

2015-06-14_23_11_46-10500077558949649888 (7 side outputs - passed)

2015-06-14_22_38_48-1428211312699949403 (3 side outputs - passed)

2015-06-14_21_44_27-16273252623089185131 (11 side outputs - failed)

this code processes data. there no caching involved. (tressoutputmanager holds cache of tupletag<tablerow>)

public class tressdenormalizationdofn extends dofn<tablerow, tablerow> {     @inject     @named("tress.mappers")     private set<cptmapper> mappers;     @inject     private tressoutputmanager tuples;      @override     public void processelement(processcontext c) throws exception {         tablerow row = c.element().clone();         (cptmapper mapper : mappers) {             string mapped = mapper.map((string) row.get("event"));             if (mapped != null) {                 row.set(mapper.getid(), mapped);             }         }         // places record in correct month based on time stamp         string timestamp = (string) row.get("time_local");         if(timestamp != null){             timestamp = timestamp.substring(0, 7).replaceall("-", "_");              if (tuples.ismainoutput(timestamp)) {                 c.output(row);             } else {                 c.sideoutput(tuples.gettuple(timestamp), row);             }         }     } } 


Comments

Popular posts from this blog

javascript - Google App Script ContentService downloadAsFile not working -

javascript - Function overwritting -

php - Find a regex to take part of Email -