A user reported that their Spark program occasionally fails when run on YARN with the error java.lang.IllegalArgumentException: Illegal pattern component: XXX. The log shows the exception is thrown while creating a FastDateFormat object, and only intermittently. Could it be related to the input data? The user confirmed the same data sometimes succeeds and sometimes fails. Does the failure always land on a fixed cluster node? That wasn't clear either, so the first step is to read the error log carefully.
View the Spark logs with vim, or fetch them with yarn logs -applicationId $appId:
23/02/08 10:15:06 ERROR Executor: Exception in task 5.3 in stage 0.0 (TID 4)
java.lang.IllegalArgumentException: Illegal pattern component: XXX
at org.apache.commons.lang3.time.FastDatePrinter.parsePattern(FastDatePrinter.java:282)
at org.apache.commons.lang3.time.FastDatePrinter.init(FastDatePrinter.java:149)
at org.apache.commons.lang3.time.FastDatePrinter.<init>(FastDatePrinter.java:142)
at org.apache.commons.lang3.time.FastDateFormat.<init>(FastDateFormat.java:384)
at org.apache.commons.lang3.time.FastDateFormat.<init>(FastDateFormat.java:369)
at org.apache.commons.lang3.time.FastDateFormat$1.createInstance(FastDateFormat.java:91)
at org.apache.commons.lang3.time.FastDateFormat$1.createInstance(FastDateFormat.java:88)
at org.apache.commons.lang3.time.FormatCache.getInstance(FormatCache.java:82)
at org.apache.commons.lang3.time.FastDateFormat.getInstance(FastDateFormat.java:165)
at org.apache.spark.sql.catalyst.json.JSONOptions.<init>(JSONOptions.scala:83)
at org.apache.spark.sql.catalyst.json.JSONOptions.<init>(JSONOptions.scala:43)
at org.apache.spark.sql.Dataset$anonfun$toJSON$1.apply(Dataset.scala:3146)
at org.apache.spark.sql.Dataset$anonfun$toJSON$1.apply(Dataset.scala:3142)
at org.apache.spark.sql.execution.MapPartitionsExec$anonfun$5.apply(objects.scala:188)
at org.apache.spark.sql.execution.MapPartitionsExec$anonfun$5.apply(objects.scala:185)
at org.apache.spark.rdd.RDD$anonfun$mapPartitionsInternal$1$anonfun$apply$25.apply(RDD.scala:836)
at org.apache.spark.rdd.RDD$anonfun$mapPartitionsInternal$1$anonfun$apply$25.apply(RDD.scala:836)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
According to the stack trace, the error originates in Spark Catalyst: line 83 of org.apache.spark.sql.catalyst.json.JSONOptions throws while creating a FastDateFormat instance. Looking at the Spark JSONOptions source, the argument at the failing line happens to contain the characters XXX. To check whether that is the cause, follow the FastDateFormat creation path into the org.apache.commons.lang3.time.FastDatePrinter class of commons-lang3.
// Spark source (JSONOptions.scala)
val timestampFormat: FastDateFormat =
  FastDateFormat.getInstance(
    parameters.getOrElse("timestampFormat", "yyyy-MM-dd'T'HH:mm:ss.SSSXXX"), timeZone, Locale.US)
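For context, XXX in a date pattern is the ISO-8601 zone offset (e.g. +08:00, or Z for UTC). java.text.SimpleDateFormat has supported it since Java 7, and the FastDateFormat in commons-lang3 3.8.1 accepts it too, while the copy Spark actually loaded evidently did not (hence the exception). A JDK-only sketch showing the exact pattern Spark uses is valid:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class XxxPatternDemo {
    public static void main(String[] args) {
        // Same pattern as Spark's JSONOptions default; "XXX" prints the
        // ISO-8601 zone offset, or "Z" when the offset is zero.
        SimpleDateFormat fmt =
            new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX", Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        System.out.println(fmt.format(new Date(0L)));
    }
}
```

Running this on any Java 7+ JVM formats the epoch without complaint; a library that predates 'X' support throws Illegal pattern component instead.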
Locate the commons-lang3-3.8.1.jar that the Spark build depends on, open the FastDatePrinter class, and search for the keyword Illegal pattern component. It appears exactly once, but the line number does not match the stack trace. So the FastDateFormat that Spark loaded is not from 3.8.1. Where did it come from?
protected List<Rule> parsePattern() {
    // ... omitted
        case 'M': // month in year (text and number)
            if (tokenLen >= 4) {
                rule = new TextField(Calendar.MONTH, months);
            } else if (tokenLen == 3) {
                rule = new TextField(Calendar.MONTH, shortMonths);
            } else if (tokenLen == 2) {
                rule = TwoDigitMonthField.INSTANCE;
            } else {
                rule = UnpaddedMonthField.INSTANCE;
            }
            break;
        case 'd': // day in month (number)
            rule = selectNumberRule(Calendar.DAY_OF_MONTH, tokenLen);
            break;
        // ... omitted
        default:
            throw new IllegalArgumentException("Illegal pattern component: " + token);
        }
        rules.add(rule);
    }
    return rules;
}
First, check whether the user code bundles its own commons-lang3. Unpacking the fat jar, org.apache.commons.lang3.time.FastDateFormat is indeed there, but it is also version 3.8.1, so it isn't the culprit. Next, search the Spark installation with find . -name "commons-lang3*": the only hit is commons-lang3-3.8.1.jar under ${SPARK_HOME}/jars/. Searching the Hadoop directory also turns up a commons-lang3, but after downloading and decompiling it, the line numbers still don't match, so that jar isn't the one being loaded either. Oddly, the source of the class just can't be found. I considered attaching arthas to inspect it live, but there is no telling which node the executor will land on, and the task fails within moments anyway.
Next, try adding a JVM flag so the Spark driver and executors print every class they load. The driver log shows the correct commons-lang3-3.8.1.jar being loaded, but the executor log has no matching entry; perhaps FastDateFormat had not been touched yet when the verbose output was flushed? Here is how to pass the JVM flag via spark-submit:
spark-submit \
  --master yarn \
  --driver-memory 4G \
  --name 'AppName' \
  --conf 'spark.driver.extraJavaOptions=-verbose:class' \
  --conf 'spark.executor.extraJavaOptions=-verbose:class' \
With the user pressing for a fix, there was no time left, so I patched the code to reveal where FastDateFormat comes from, and, as a safety net, to recreate the FastDateFormat object without the XXX component whenever instantiation throws, so an unsupported XXX no longer kills the job.
// Patch to org.apache.spark.sql.catalyst.json.JSONOptions: catch the
// exception thrown while instantiating timestampFormat, then log the
// message and where the FastDateFormat class was loaded from.
catch {
  case e: IllegalArgumentException =>
    logWarning("==============>>>" + e.getMessage)
    val clazz = FastDateFormat.getInstance().getClass
    val location = clazz.getResource('/' + clazz.getName.replace('.', '/') + ".class")
    logWarning("resource location: " + location.toString)
}
Recompile, replace the corresponding Spark jar, and rerun: the job succeeds and the exception is captured. FastDateFormat turns out to come from hive-exec-1.2.1.spark2.jar. Only then did I realize I had forgotten to search inside the jars: besides jars named commons-lang3*, you also have to check jar contents for the class itself, and hive-exec happens to be a fat jar.
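Searching jar contents rather than jar names is easy to script; a sketch using the JDK's java.util.zip (the fake jar it builds stands in for a fat jar like hive-exec, so the demo is self-contained):

```java
import java.io.File;
import java.io.FileOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class JarScan {
    // Return true if the jar contains the given class entry.
    static boolean containsEntry(File jar, String entry) throws Exception {
        try (ZipFile zf = new ZipFile(jar)) {
            return zf.getEntry(entry) != null;
        }
    }

    public static void main(String[] args) throws Exception {
        String target = "org/apache/commons/lang3/time/FastDateFormat.class";
        // Build a fake fat jar for the demo (stands in for hive-exec-*.jar).
        Path dir = Files.createTempDirectory("jars");
        File fatJar = dir.resolve("fake-hive-exec.jar").toFile();
        try (ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(fatJar))) {
            zos.putNextEntry(new ZipEntry(target));
            zos.closeEntry();
        }
        // Scan every jar in the directory for the shaded class.
        for (File jar : dir.toFile().listFiles((d, n) -> n.endsWith(".jar"))) {
            if (containsEntry(jar, target)) {
                System.out.println("found in: " + jar.getName());
            }
        }
    }
}
```

Pointing the scan at ${SPARK_HOME}/jars/ instead of the temp directory would have exposed the shaded copy in hive-exec immediately.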
23/02/08 17:12:39 WARN JSONOptions: ==============>>>Illegal pattern component: XXX
23/02/08 17:12:39 WARN JSONOptions: resource location: jar:file:/data/hadoop/yarn/local/usercache/…/__spark_libs__1238265929018908261.zip/hive-exec-1.2.1.spark2.jar!/org/apache/commons/lang3/time/FastDateFormat.class
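The getResource trick used in the patch works in any JVM program to pinpoint which jar (or JDK module) a class was actually loaded from; a standalone sketch:

```java
import java.net.URL;

public class LocateDemo {
    // Print the URL a class was loaded from; helps find shadowed copies.
    static String locate(Class<?> clazz) {
        String path = "/" + clazz.getName().replace('.', '/') + ".class";
        URL url = clazz.getResource(path);
        return url == null ? "(unknown: generated or inaccessible)" : url.toString();
    }

    public static void main(String[] args) {
        System.out.println(locate(java.util.ArrayList.class)); // JDK class
        System.out.println(locate(LocateDemo.class));          // our own class
    }
}
```

On Java 8 the first line points into rt.jar; on Java 9+ it is a jrt:/ module URL. Either way, a class shaded into an unexpected jar shows up instantly.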
Both commons-lang3-3.8.1.jar and hive-exec-1.2.1.spark2.jar live under ${SPARK_HOME}/jars/. Some searching suggests that in this situation the class-loading order depends on the order of jars on the CLASSPATH and even on file creation time (I could not find an authoritative reference). I tried spark-submit with --conf 'spark.executor.extraClassPath=commons-lang3-3.8.1.jar' to load that jar first, but the job still failed. Later, while reading the logs, I happened to notice that the CLASSPATH entry was wrong, as shown below; evidently spark.executor.extraClassPath simply prepends the jar to the classpath so it gets loaded first.
export SPARK_YARN_STAGING_DIR="hdfs://nnHA/user/p55_u34_tsp_caihong/.sparkStaging/application_1670726876109_157924"
export CLASSPATH="commons-lang3-3.8.1.jar:$PWD:$PWD/__spark_conf__:$PWD/__spark_libs__/*:/etc/hadoop/conf:/usr/lib/hadoop/lib/*:/usr/lib/hadoop/.//*:/usr/lib/hadoop-hdfs/./:/usr/lib/hadoop-hdfs/lib/*:/usr/lib/hadoop-hdfs/.//*:/usr/lib/hadoop-yarn/lib/*:/usr/lib/hadoop-yarn/.//*:/usr/lib/hadoop-mapreduce/lib/*:/usr/lib/hadoop-mapreduce/.//*:/etc/hadoop/conf:/usr/lib/hadoop/libs/*:/usr/lib/hadoop/.//*:/usr/lib/hadoop-hdfs/./:/usr/lib/hadoop-hdfs/lib/*:/usr/lib/hadoop-hdfs/.//*:/usr/lib/hadoop-yarn/lib/*:/usr/lib/hadoop-yarn/.//*:/usr/lib/hadoop-mapreduce/lib/*:/usr/lib/hadoop-mapreduce/.//*:$PWD/__spark_conf__/__hadoop_conf__"
spark.executor.extraClassPath must be an absolute path. After fixing the spark-submit command to spell out the full path, the job ran again with no errors. One question remains: why did it only fail intermittently? With both jars sitting in the same jars directory, you would expect a fixed loading order. If you know the answer, please share it in the comments. Both approaches, patching the source and setting spark.executor.extraClassPath, solve the problem.
spark-submit \
  --master yarn \
  --driver-memory 4G \
  --name 'AppName' \
  --conf 'spark.driver.extraJavaOptions=-verbose:class' \
  --conf 'spark.executor.extraJavaOptions=-verbose:class' \
  --conf 'spark.executor.extraClassPath=${SPARK_HOME}/jars/commons-lang3-3.8.1.jar' \
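To see why prepending the jar matters, here is a small sketch showing that a classloader resolves a resource from whichever classpath entry comes first (two temp directories stand in for the two competing jars):

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Scanner;

public class ClasspathOrderDemo {
    static String read(InputStream in) {
        try (Scanner s = new Scanner(in)) { return s.next(); }
    }

    public static void main(String[] args) throws IOException {
        // Two classpath entries, each providing a resource of the same name:
        // whichever entry comes first wins, just like duplicate classes in jars.
        Path dirA = Files.createTempDirectory("a");
        Path dirB = Files.createTempDirectory("b");
        Files.write(dirA.resolve("who.txt"), "from-A".getBytes());
        Files.write(dirB.resolve("who.txt"), "from-B".getBytes());

        URL[] aFirst = {dirA.toUri().toURL(), dirB.toUri().toURL()};
        URL[] bFirst = {dirB.toUri().toURL(), dirA.toUri().toURL()};

        try (URLClassLoader cl1 = new URLClassLoader(aFirst, null);
             URLClassLoader cl2 = new URLClassLoader(bFirst, null)) {
            System.out.println(read(cl1.getResourceAsStream("who.txt")));
            System.out.println(read(cl2.getResourceAsStream("who.txt")));
        }
    }
}
```

This is exactly what spark.executor.extraClassPath exploits: put commons-lang3-3.8.1.jar at the front, and its FastDateFormat shadows the copy inside hive-exec.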