[BUG] Spark Thrift Server fails to start with Hive 4.0.1 — multiple packaging issues
Environment
- Clemlab version: 1.3.1.0-292
- OS: Linux 5.14.0-427.13.1.el9_4.x86_64 (RHEL 9.4)
- Java: 1.8.0_391
- Spark: 3.5.6.1.3.1.0-292
- Hive: 4.0.1.1.3.1.0-292
- Hadoop: 3.4.1
Summary
The Spark 3 Thrift Server (HiveThriftServer2) fails to start out-of-the-box in Clemlab 1.3.1.0-292. Three distinct packaging bugs were identified:
- Missing standalone-metastore directory — the directory referenced by spark.sql.hive.metastore.jars does not exist
- Reflection incompatibility with Hive 4.0 — ReflectionUtils$.setSuperField assumes the Hive 2.x/3.x class hierarchy
- Missing repackaged Guava classes — org.sparkproject.guava.collect.MultimapBuilder not found
Bug 1: Missing standalone-metastore directory
Symptom
WARN HiveUtils: Hive jar path '/usr/odp/current/spark3-client/standalone-metastore/*' does not exist.
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hive/metastore/api/AlreadyExistsException
Root cause
The default configuration sets spark.sql.hive.metastore.jars = /usr/odp/current/spark3-client/standalone-metastore/* and spark.sql.hive.metastore.version = 3.0, but the directory /usr/odp/current/spark3-client/standalone-metastore/ is never created during installation.
Analysis
The IsolatedClientLoader uses a dedicated classloader with JARs from the standalone-metastore directory. Populating this directory with all Hive JARs from /usr/odp/1.3.1.0-292/hive/lib/ introduces a LinkageError:
java.lang.LinkageError: loader constraint violation: loader (instance of sun/misc/Launcher$AppClassLoader)
previously initiated loading for a different type with name "org/apache/hadoop/hive/common/io/SessionStream"
This occurs because SessionStream exists in both:
- hive-common-4.0.1.1.3.1.0-292.jar on the main Spark classpath (/usr/odp/current/spark3-thriftserver/jars/)
- hive-exec-4.0.1.1.3.1.0-292.jar (an uber-jar) in the standalone-metastore directory
The hive-exec JAR shipped with Hive is a fat/uber JAR that embeds classes from hive-common (including SessionStream), while the Spark Thrift Server classpath ships hive-exec-4.0.1.1.3.1.0-292-core.jar (a stripped variant without embedded dependencies). Because both classloaders hold their own copy of SessionStream, the JVM throws a LinkageError as soon as the two types meet across the classloader boundary during SparkSQLEnv$.init().
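Conflicts like this can be surfaced before startup by listing the .class entries two JARs have in common. The sketch below is a hypothetical diagnostic (not part of Spark or Hive); the JAR paths in the comment are the ones from this report.

```java
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.HashSet;
import java.util.Set;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper: report the .class entries present in both JARs.
// Any overlap between a JAR on the main Spark classpath and one handed to the
// isolated metastore classloader is a candidate for the LinkageError above, e.g.:
//   duplicates(Paths.get("/usr/odp/current/spark3-thriftserver/jars/hive-common-4.0.1.1.3.1.0-292.jar"),
//              Paths.get("/usr/odp/current/spark3-client/standalone-metastore/hive-exec-4.0.1.1.3.1.0-292.jar"))
public class DuplicateClassFinder {
    static Set<String> classEntries(Path jar) throws Exception {
        Set<String> names = new HashSet<>();
        try (JarFile jf = new JarFile(jar.toFile())) {
            Enumeration<JarEntry> e = jf.entries();
            while (e.hasMoreElements()) {
                String n = e.nextElement().getName();
                if (n.endsWith(".class")) names.add(n);
            }
        }
        return names;
    }

    static Set<String> duplicates(Path a, Path b) throws Exception {
        Set<String> dup = classEntries(a);
        dup.retainAll(classEntries(b));
        return dup;
    }
}
```

An empty result for every cross-classloader JAR pair is a reasonable acceptance check for a repackaged standalone-metastore directory.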
Workaround attempted
Using builtin mode (spark.sql.hive.metastore.jars = builtin, spark.sql.hive.metastore.version = 4.0.1.1.3.1.0-292) bypasses the standalone-metastore directory entirely and avoids the classloader isolation issues. However, this leads to Bug 2.
Suggested fix
Either:
- Ship a correctly populated standalone-metastore directory using the -core variant of hive-exec plus all necessary dependencies (thrift, fb303, hive-common, hive-metastore, hive-shims, hive-shims-0.23, etc.), without including classes that conflict with the main Spark classpath
- Or set the default configuration to spark.sql.hive.metastore.jars = builtin and spark.sql.hive.metastore.version = 4.0.1.1.3.1.0-292, which requires fixing Bug 2
Bug 2: ReflectionUtils$.setSuperField incompatible with Hive 4.0 class hierarchy
Symptom
With builtin mode:
ERROR HiveThriftServer2: Error starting HiveThriftServer2
java.lang.NoSuchFieldException: hiveConf
at java.lang.Class.getDeclaredField(Class.java:2070)
at org.apache.spark.sql.hive.thriftserver.ReflectionUtils$.setAncestorField(ReflectionUtils.scala:27)
at org.apache.spark.sql.hive.thriftserver.ReflectionUtils$.setSuperField(ReflectionUtils.scala:22)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIService.init(SparkSQLCLIService.scala:48)
After fixing hiveConf, a second reflection error appears:
java.lang.NoSuchFieldException: cliService
at org.apache.spark.sql.hive.thriftserver.ReflectionUtils$.setAncestorField(ReflectionUtils.scala:27)
at org.apache.spark.sql.hive.thriftserver.ReflectionUtils$.setSuperField(ReflectionUtils.scala:22)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.init(HiveThriftServer2.scala:133)
Root cause
ReflectionUtils$.setSuperField(obj, fieldName, value) calls setAncestorField(obj, 1, fieldName, value) with a hardcoded level of 1, meaning it always looks for the field in the immediate superclass only.
In Hive 4.0, the class hierarchy changed. The hiveConf field, which was previously in CLIService (1 level up from SparkSQLCLIService), was moved to AbstractService (3 levels up):
SparkSQLCLIService → CLIService → CompositeService → AbstractService (has hiveConf)
Meanwhile, cliService is in HiveServer2 (1 level up from HiveThriftServer2), so it needs level 1:
HiveThriftServer2 → HiveServer2 (has cliService) → CompositeService → AbstractService
The hardcoded drop(1) in setSuperField cannot accommodate both depths. getAncestorField, which is also called by ReflectedCompositeService, has the same issue.
Workaround applied
Recompiled ReflectionUtils.scala to search the entire class hierarchy iteratively instead of jumping to a fixed level:
package org.apache.spark.sql.hive.thriftserver

object ReflectionUtils {
  def setSuperField(obj: AnyRef, fieldName: String, fieldValue: AnyRef): Unit = {
    setFieldInHierarchy(obj, fieldName, fieldValue)
  }

  // `level` is intentionally ignored: the field is located by walking the
  // hierarchy rather than jumping to a fixed ancestor.
  def setAncestorField(obj: AnyRef, level: Int, fieldName: String, fieldValue: AnyRef): Unit = {
    setFieldInHierarchy(obj, fieldName, fieldValue)
  }

  def getSuperField[T](obj: AnyRef, fieldName: String): T = {
    getFieldInHierarchy[T](obj, fieldName)
  }

  def getAncestorField[T](obj: AnyRef, level: Int, fieldName: String): T = {
    getFieldInHierarchy[T](obj, fieldName)
  }

  private def setFieldInHierarchy(obj: AnyRef, fieldName: String, fieldValue: AnyRef): Unit = {
    var clazz: Class[_] = obj.getClass
    while (clazz != null) {
      try {
        val f = clazz.getDeclaredField(fieldName)
        f.setAccessible(true)
        f.set(obj, fieldValue)
        return
      } catch {
        case _: NoSuchFieldException => clazz = clazz.getSuperclass
      }
    }
    throw new NoSuchFieldException(fieldName)
  }

  private def getFieldInHierarchy[T](obj: AnyRef, fieldName: String): T = {
    var clazz: Class[_] = obj.getClass
    while (clazz != null) {
      try {
        val f = clazz.getDeclaredField(fieldName)
        f.setAccessible(true)
        return f.get(obj).asInstanceOf[T]
      } catch {
        case _: NoSuchFieldException => clazz = clazz.getSuperclass
      }
    }
    throw new NoSuchFieldException(fieldName)
  }
}
Compiled with:
java -cp /usr/odp/1.3.1.0-292/spark3/jars/scala-compiler-2.12.18.jar:\
/usr/odp/1.3.1.0-292/spark3/jars/scala-library-2.12.18.jar:\
/usr/odp/1.3.1.0-292/spark3/jars/scala-reflect-2.12.18.jar \
scala.tools.nsc.Main \
-classpath /usr/odp/1.3.1.0-292/spark3/jars/scala-library-2.12.18.jar \
-d /tmp/fix2 /tmp/fix2/ReflectionUtils.scala
cd /tmp/fix2
jar uf /usr/odp/current/spark3-thriftserver/jars/spark-hive-thriftserver_2.12-3.5.6.1.3.1.0-292.jar \
org/apache/spark/sql/hive/thriftserver/ReflectionUtils$.class \
org/apache/spark/sql/hive/thriftserver/ReflectionUtils.class
Suggested fix
Update ReflectionUtils in spark-hive-thriftserver to search the entire class hierarchy instead of using a hardcoded level. The iterative approach shown above is backward-compatible with Hive 2.x/3.x since it will still find the field regardless of which superclass contains it.
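The depth-agnostic behavior can be demonstrated with a minimal, self-contained sketch. The classes below are toy stand-ins for the Hive 4 hierarchy (the *Like names are illustrative only, not the real Hive types); the lookup is the same walk-the-superclasses idea as the recompiled ReflectionUtils, written here in Java.

```java
import java.lang.reflect.Field;

// Toy stand-ins mirroring the Hive 4 shape: the field lives three levels
// above the leaf, as hiveConf does in AbstractService.
class AbstractServiceLike { protected Object hiveConf; }
class CompositeServiceLike extends AbstractServiceLike {}
class CLIServiceLike extends CompositeServiceLike {}
class SparkSQLCLIServiceLike extends CLIServiceLike {}

public class HierarchyReflection {
    // Walk getSuperclass() until the field is found, instead of jumping
    // to a fixed ancestor level.
    static void setFieldInHierarchy(Object obj, String fieldName, Object value)
            throws NoSuchFieldException, IllegalAccessException {
        for (Class<?> c = obj.getClass(); c != null; c = c.getSuperclass()) {
            try {
                Field f = c.getDeclaredField(fieldName);
                f.setAccessible(true);
                f.set(obj, value);
                return;
            } catch (NoSuchFieldException ignored) {
                // keep climbing the hierarchy
            }
        }
        throw new NoSuchFieldException(fieldName);
    }
}
```

A fixed level-1 lookup (getSuperclass().getDeclaredField("hiveConf")) fails on this hierarchy with NoSuchFieldException, while the iterative walk succeeds at any depth, which is why it stays compatible with the shallower Hive 2.x/3.x layout.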
Bug 3: Missing org.sparkproject.guava.collect.MultimapBuilder
Symptom
After fixing Bug 2, the Thrift Server fails with:
java.lang.NoClassDefFoundError: org/sparkproject/guava/collect/MultimapBuilder
at org.apache.hive.service.cli.operation.OperationManager.<init>(OperationManager.java:83)
at org.apache.hive.service.cli.session.SessionManager.<init>(SessionManager.java:77)
at org.apache.spark.sql.hive.thriftserver.SparkSQLSessionManager.<init>(SparkSQLSessionManager.scala:36)
Root cause
The spark-hive-thriftserver_2.12-3.5.6.1.3.1.0-292.jar embeds Hive 4 service classes (OperationManager, SessionManager, etc.) that reference Guava under the repackaged namespace org.sparkproject.guava. However, the repackaged Guava classes shipped in spark-network-common_2.12-3.5.6.1.3.1.0-292.jar do not include MultimapBuilder.
The standard Guava JAR (guava-32.0.1-jre.jar) contains MultimapBuilder under com.google.common.collect, but the embedded Hive classes expect it under org.sparkproject.guava.collect.
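Which of the two namespaces actually resolves at runtime can be confirmed with a reflective presence check. This is a generic, hypothetical diagnostic (not a Spark or Hive API); run on the Thrift Server's classpath it distinguishes the missing shaded class from the present unshaded one.

```java
// Hypothetical diagnostic: check whether a class name is loadable from the
// current classpath. On the Thrift Server classpath described in this report,
// the expectation would be:
//   classPresent("org.sparkproject.guava.collect.MultimapBuilder")  -> false (the bug)
//   classPresent("com.google.common.collect.MultimapBuilder")       -> true if plain Guava ships
public class ClassPresence {
    static boolean classPresent(String name) {
        try {
            Class.forName(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }
}
```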
Suggested fix
Either:
- Include a complete repackaged Guava JAR (org.sparkproject.guava) that covers all classes needed by the embedded Hive 4 service classes
- Or recompile the embedded Hive service classes to use the standard Guava namespace (com.google.common), which is already present on the classpath
Impact
The Spark Thrift Server is completely non-functional in Clemlab 1.3.1.0-292. No workaround was found that resolves all three issues simultaneously.
Services NOT affected: Hive (standalone), Spark shell (spark3-shell), PySpark (pyspark3), Spark SQL (spark3-sql), spark3-submit, and all other cluster services work correctly. Only the Spark Thrift Server (HiveThriftServer2) is broken.
Steps to reproduce
- Install Clemlab 1.3.1.0-292 with Spark 3 and Hive
- Start the Spark Thrift Server from Ambari
- Check logs:
tail -f /var/log/spark3/spark-spark-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-*.out
The service will fail with one of the errors described above depending on configuration.