cassandra - Spark Master does not start with DSE 4.7 and OpsCenter 5.1.3
I upgraded DataStax Enterprise from 4.6.3 to 4.7 and am having trouble running Spark. The problem seems to be that the Spark Master is not configured properly. I use OpsCenter 5.1.3 and started a 3-node analytics cluster. Strangely, the nodes had the setting SPARK_ENABLED=0, and I had to set it to 1 manually. Even now, however, the Spark Master is not configured properly. In /var/log/cassandra/system.log, there is a long run of:
INFO [SPARK-WORKER-INIT-0] 2015-06-13 21:59:54,027 SparkWorkerRunner.java:49 - Spark Master not ready at (no configured master)
INFO [SPARK-WORKER-INIT-0] 2015-06-13 21:59:55,028 SparkWorkerRunner.java:49 - Spark Master not ready at (no configured master)
INFO [SPARK-WORKER-INIT-0] 2015-06-13 21:59:56,028 SparkWorkerRunner.java:49 - Spark Master not ready at (no configured master)
When I try to run dse spark, I get the following error:
java.io.IOException: Spark Master address cannot be retrieved. This should not be happening with DSE 4.7+ unless the cluster is over 50% down or booted in the last minute.
    at com.datastax.bdp.plugin.SparkPlugin.getMasterAddress(SparkPlugin.java:257)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
    at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
    at com.sun.jmx.mbeanserver.StandardMBe
My analytics DC has been up for a few days, and there are no nodes booting. This issue has been blocking development for the last few days, and I am considering downgrading to DSE 4.6.3 so that I can run Spark jobs again. Any help whatsoever is appreciated.
Update:
I have been looking into the condition that 50% of the analytics nodes are required for the Spark Master to start. After examining system.log on DSE startup, I'm noticing that gossip still seems to think some old nodes are part of the cluster and are down. For instance:
INFO [GossipStage:1] 2015-06-14 03:18:05,587 Gossiper.java:968 - InetAddress /172.31.23.17 is now DOWN
INFO [GossipStage:1] 2015-06-14 03:18:05,614 Gossiper.java:968 - InetAddress /172.31.16.58 is now DOWN
INFO [GossipStage:1] 2015-06-14 03:18:05,647 Gossiper.java:968 - InetAddress /172.31.24.25 is now DOWN
INFO [GossipStage:1] 2015-06-14 03:18:05,687 Gossiper.java:968 - InetAddress /172.31.24.147 is now DOWN
These are nodes that I took offline earlier. I have purged the system.peers table of these nodes, but gossip still seems to acknowledge them as part of the cluster. I suspect the phantom presence of these nodes pushes the cluster past 50% down. Purging the gossip tables requires a full cluster shutdown.
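As a rough sanity check on that theory, here is a minimal Python sketch that pulls the down addresses out of gossip lines like the ones above and works out what fraction of the known nodes would count as down. It rests on two assumptions of mine that are not confirmed anywhere: that the "50% down" check is a plain ratio of down nodes to all nodes gossip knows about, and that all four phantom nodes are counted together with my 3 live analytics nodes. The function names are hypothetical helpers, not part of DSE.

```python
import re

# Sample gossip lines copied from the system.log excerpt in this post.
GOSSIP_LOG = """\
INFO [GossipStage:1] 2015-06-14 03:18:05,587 Gossiper.java:968 - InetAddress /172.31.23.17 is now DOWN
INFO [GossipStage:1] 2015-06-14 03:18:05,614 Gossiper.java:968 - InetAddress /172.31.16.58 is now DOWN
INFO [GossipStage:1] 2015-06-14 03:18:05,647 Gossiper.java:968 - InetAddress /172.31.24.25 is now DOWN
INFO [GossipStage:1] 2015-06-14 03:18:05,687 Gossiper.java:968 - InetAddress /172.31.24.147 is now DOWN
"""

# Matches "InetAddress /<ip> is now DOWN"; case-insensitive, "is now" optional.
DOWN_RE = re.compile(r"InetAddress\s+/(\S+)\s+(?:is now\s+)?down", re.IGNORECASE)


def down_addresses(log_text):
    """Return the unique addresses gossip reports as down, in first-seen order."""
    seen = []
    for match in DOWN_RE.finditer(log_text):
        addr = match.group(1)
        if addr not in seen:
            seen.append(addr)
    return seen


def fraction_down(live_nodes, down_nodes):
    """Fraction of known nodes that are down (assumed simple ratio, see above)."""
    total = live_nodes + down_nodes
    if total == 0:
        raise ValueError("no nodes known")
    return down_nodes / total


LIVE_ANALYTICS_NODES = 3  # from this post: a 3-node analytics cluster

phantoms = down_addresses(GOSSIP_LOG)
ratio = fraction_down(LIVE_ANALYTICS_NODES, len(phantoms))
print(f"{len(phantoms)} phantom nodes, {ratio:.0%} of known nodes down")
```

With 3 live nodes and 4 phantom ones, 4 of 7 known nodes are down, which is over the 50% mark mentioned in the error message. If the check is scoped differently (for example, per datacenter with the phantom nodes in another DC), this arithmetic would not apply.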