
Running a Spark SQL (v2.1.0_2.11) program in Java immediately fails with the following exception, as soon as the first action is called on a DataFrame:

java.lang.ClassNotFoundException: org.codehaus.commons.compiler.UncheckedCompileException

I ran it in Eclipse, outside of the spark-submit environment. I use the following Spark SQL Maven dependency:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.1.0</version>
    <scope>provided</scope>
</dependency>
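As a quick diagnostic (a sketch, not part of the original question), you can check at runtime whether the Janino classes from the exceptions in this thread are on the classpath at all, and which jar they are loaded from. The class names below are taken from the errors discussed here:

```java
import java.security.CodeSource;

public class JaninoCheck {
    public static void main(String[] args) {
        // The classes whose absence triggers the errors in this thread
        String[] names = {
            "org.codehaus.commons.compiler.UncheckedCompileException",
            "org.codehaus.janino.InternalCompilerException"
        };
        for (String name : names) {
            try {
                Class<?> c = Class.forName(name);
                // getCodeSource() tells us which jar the class came from;
                // it may be null for bootstrap classes
                CodeSource src = c.getProtectionDomain().getCodeSource();
                System.out.println(name + " -> "
                        + (src != null ? src.getLocation() : "bootstrap"));
            } catch (ClassNotFoundException e) {
                System.out.println(name + " -> NOT on classpath");
            }
        }
    }
}
```

If a class resolves but comes from an unexpected jar, that jar is the one winning Maven's dependency mediation.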



The culprit is the commons-compiler library. Two conflicting versions of it end up on the classpath:

(screenshot of the conflicting dependency-tree entries omitted)

To work around this, add the following to your pom.xml:

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.codehaus.janino</groupId>
            <artifactId>commons-compiler</artifactId>
            <version>2.7.8</version>
        </dependency>
    </dependencies>
</dependencyManagement>

  • Maybe somebody stumbles upon this... when I upgraded from Spark 2.1 to 2.3, the above workaround stopped working for me. So I fixed the version of org.codehaus.janino:janino to 3.0.8. That helped. Commented Apr 23, 2018 at 15:24
  • Hi, it is now 2019-12-17 and I still see the error after explicitly adding a dependency on org.codehaus.janino:commons-compiler:3.1.0. My Spark is now at 2.4.3, with Scala 2.12. I have the full output of gradle dependencies here: gist.github.com/leeyuiwah-sl/cc9e2f36ebccea0d875a995551abe3ad Can you help? Thanks!
    – leeyuiwah
    Commented Dec 17, 2019 at 10:18
  • Okay, I have found a solution -- I must use Janino 3.0.8, not newer (such as 3.1.0). Also, I must have explicit dependencies on both org.codehaus.janino:janino and org.codehaus.janino:commons-compiler. Thanks!
    – leeyuiwah
    Commented Dec 17, 2019 at 10:44
  • @leeyuiwah After a long struggle, your comment resolved the problem in my case. There was a dependency conflict after adding Spring Boot to my Spark project. Thanks!
    – Hizir
    Commented Oct 19, 2022 at 21:54

I had similar issues when updating from Spark 2.2.1 to Spark 2.3.0.

In my case, I had to pin both commons-compiler and janino.

Spark 2.3 solution:

<dependencyManagement>
    <dependencies>
        <!--Spark java.lang.NoClassDefFoundError: org/codehaus/janino/InternalCompilerException-->
        <dependency>
            <groupId>org.codehaus.janino</groupId>
            <artifactId>commons-compiler</artifactId>
            <version>3.0.8</version>
        </dependency>
        <dependency>
            <groupId>org.codehaus.janino</groupId>
            <artifactId>janino</artifactId>
            <version>3.0.8</version>
        </dependency>
    </dependencies>
</dependencyManagement>
<dependencies>
    <dependency>
        <groupId>org.codehaus.janino</groupId>
        <artifactId>commons-compiler</artifactId>
        <version>3.0.8</version>
    </dependency>
    <dependency>
        <groupId>org.codehaus.janino</groupId>
        <artifactId>janino</artifactId>
        <version>3.0.8</version>
    </dependency>
</dependencies>
  • Perfect, thank you! I actually got this after adding the Spring Boot plugin, which was conflicting with these Spark dependencies. But this solution fixed that. Commented Aug 20, 2018 at 13:24
  • The accepted solution does not work in Spark 2.3, but this one works perfectly.
    – ScalaBoy
    Commented Nov 29, 2018 at 22:50
  • Adding the above two dependencies with version 3.0.16 fixed it for me. The Spark version is 3.1.2.
    – Rik
    Commented Nov 29, 2022 at 8:05

If you are using Spark 3.0.1 or higher, select version 3.0.16 for the two Janino dependencies in @Maksym's solution, which works very well.


And since Spark 3.4.0, I had to switch from 3.0.16 to 3.1.19 (the latest I've found).


My setup is Spring Boot + Scala + Spark (2.4.5).

For this issue, the solution is to exclude the janino and commons-compiler artifacts that come in transitively with spark-sql_2.12 version 2.4.5. The reason is that a newer version (3.1.2) of both janino and commons-compiler otherwise ends up on the classpath and conflicts with what Spark expects.

After excluding them, add version 3.0.8 of both janino and commons-compiler as separate dependencies.

<dependencies>
     <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.12</artifactId>
        <version>2.4.5</version>
        <exclusions>
            <exclusion>
                <artifactId>janino</artifactId>
                <groupId>org.codehaus.janino</groupId>
            </exclusion>
            <exclusion>
                <artifactId>commons-compiler</artifactId>
                <groupId>org.codehaus.janino</groupId>
            </exclusion>
        </exclusions>
    </dependency>
    <dependency>
        <artifactId>janino</artifactId>
        <groupId>org.codehaus.janino</groupId>
        <version>3.0.8</version>
    </dependency>
    <dependency>
        <artifactId>commons-compiler</artifactId>
        <groupId>org.codehaus.janino</groupId>
        <version>3.0.8</version>
    </dependency>
    ...
</dependencies>
  • Works for me, but I could not understand why. Could you explain it in a bit more detail, please?
    – Anna Klein
    Commented Jul 19, 2020 at 12:47
  • @AnnaKlein It's all linked to the commons-compiler version, which should be aligned with the Scala version. Commented Jul 23, 2020 at 7:35
  • @abhijitcaps How do you know that? Where can you find the correlation between Scala version and commons-compiler version?
    – Shannon
    Commented Feb 18, 2021 at 18:54
  • @abhijitcaps I don't think it has anything to do with Scala version. See issues.apache.org/jira/browse/… for the explanation.
    – Shannon
    Commented Feb 18, 2021 at 20:11
  • @abhijitcaps - if we add the above dependency, will there be any Open Source Software (OSS) issues?
    – CodeRunner
    Commented Jul 25, 2023 at 11:24

This error still arises with org.apache.spark:spark-sql_2.12:2.4.6, but the Janino version to use is 3.0.16. With Gradle:

implementation 'org.codehaus.janino:commons-compiler:3.0.16'
implementation 'org.codehaus.janino:janino:3.0.16'
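If plain implementation lines are not enough (another dependency on the classpath may still win Gradle's conflict resolution), forcing the versions is a possible variant — a sketch using Gradle's resolutionStrategy, assuming the same 3.0.16 version as above:

```groovy
// build.gradle (sketch): force the Janino versions across all configurations
configurations.all {
    resolutionStrategy {
        force 'org.codehaus.janino:commons-compiler:3.0.16',
              'org.codehaus.janino:janino:3.0.16'
    }
}
```

You can confirm the forced versions with `gradle dependencies`.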

In our migration from CDH Parcel 2.2.0.cloudera1 to 2.3.0.cloudera4, we simply overrode the Maven property:

<janino.version>3.0.8</janino.version>
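In context, that override sits in the `<properties>` section of the pom; the property names here are assumed to match what the parent (CDH/Spark) pom defines:

```xml
<properties>
    <!-- Overrides the Janino version the parent pom would otherwise use -->
    <janino.version>3.0.8</janino.version>
    <hive.version>1.1.0-cdh5.13.3</hive.version>
</properties>
```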

In addition, we defined the proper version of the Hive dependency in the dependency-management section:

<hive.version>1.1.0-cdh5.13.3</hive.version>

    <dependency>
         <groupId>org.apache.hive</groupId>
         <artifactId>hive-jdbc</artifactId>
         <version>${hive.version}</version>
         <scope>runtime</scope>
         <exclusions>
             <exclusion>
                 <groupId>org.eclipse.jetty.aggregate</groupId>
                 <artifactId>*</artifactId>
             </exclusion>
             <exclusion>
                 <artifactId>slf4j-log4j12</artifactId>
                 <groupId>org.slf4j</groupId>
             </exclusion>
             <exclusion>
                 <artifactId>parquet-hadoop-bundle</artifactId>
                 <groupId>com.twitter</groupId>
             </exclusion>
         </exclusions>
     </dependency>

The exclusions were necessary for the previous version; they might not be necessary anymore.


Apache spark-sql brings the required versions of janino and commons-compiler. If you're encountering this error, something else in your pom (or parent pom) is overriding the version. While you can explicitly set the janino and commons-compiler versions in your pom to match what Spark brings, as suggested in other answers, this makes long-term maintenance more difficult: maintainers will need to remember to update these explicit versions each time you update Spark. Instead, I recommend what worked well for me:

Figure out what is bringing in the wrong version of janino by running:

mvn dependency:tree #-Dverbose may be helpful

Exclude janino and commons-compiler from the offending dependency. In my case it was an in-house hadoop testing framework:

        <dependency>
            <groupId>org.my.client.pkg</groupId>
            <artifactId>hadoop-testing-framework</artifactId>
            <version>${some.version}</version>
            <exclusions>
                <!-- We want only and exactly Spark's janino version -->
                <exclusion>
                    <groupId>org.codehaus.janino</groupId>
                    <artifactId>janino</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>org.codehaus.janino</groupId>
                    <artifactId>commons-compiler</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

Re-run mvn dependency:tree and repeat the above process for any other dependencies that override Spark's janino version, until janino and commons-compiler are coming from your spark-sql dependency, as in my (abbreviated) mvn dependency:tree output below:

[INFO] +- org.apache.spark:spark-sql_2.11:jar:2.4.0.cloudera1:provided
[INFO] |  +- org.apache.spark:spark-catalyst_2.11:jar:2.4.0.cloudera1:provided
[INFO] |  |  +- org.codehaus.janino:janino:jar:3.0.9:compile
[INFO] |  |  +- org.codehaus.janino:commons-compiler:jar:3.0.9:compile

Note, if you see something like:

[INFO] +- org.apache.spark:spark-sql_2.11:jar:2.4.0.cloudera1:provided
[INFO] |  +- org.apache.spark:spark-catalyst_2.11:jar:2.4.0.cloudera1:provided
[INFO] |  |  +- org.codehaus.janino:janino:jar:2.6.1:compile  <-- Note old version
[INFO] |  |  +- org.codehaus.janino:commons-compiler:jar:3.0.9:compile

then something else is still overriding Spark's janino version. In my case, the parent pom was explicitly bringing in v2.6.1; removing that dependency block from the parent pom solved my problem. This is where the -Dverbose flag may help.

One final note: at least my version of Spark could not tolerate any change in the janino or commons-compiler versions. They had to be exactly what Spark brought with it, down to the patch version (assuming codehaus follows semver).


I had to downgrade the janino version pulled in by Spring Boot 2.7.3 and manually include:

<dependency>
  <groupId>org.codehaus.janino</groupId>
  <artifactId>janino</artifactId>
  <version>3.0.16</version>
</dependency>
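An alternative worth trying: the spring-boot-dependencies BOM appears to expose a `janino.version` property, so overriding that property may achieve the same pin without an explicit dependency block (the property name is an assumption about your parent BOM — verify it in spring-boot-dependencies before relying on it):

```xml
<properties>
    <!-- Assumed property name from the spring-boot-dependencies BOM -->
    <janino.version>3.0.16</janino.version>
</properties>
```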

The selected answer didn't work for me; this did:

<dependency>
    <groupId>org.codehaus.janino</groupId>
    <artifactId>janino</artifactId>
    <version>3.0.8</version>
</dependency>

I am using Spark 3.2.1, which does not have this issue; upgrade if possible.

"org.apache.spark" %% "spark-core" % "3.2.1",
"org.apache.spark" %% "spark-sql" % "3.2.1",
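For completeness, those coordinates sit in a build.sbt roughly like this (a sketch; the Scala version is assumed here — it must be one of the 2.12/2.13 versions Spark 3.2.1 is published for):

```scala
// build.sbt (sketch; scalaVersion assumed)
ThisBuild / scalaVersion := "2.12.15"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "3.2.1",
  "org.apache.spark" %% "spark-sql"  % "3.2.1"
)
```

The `%%` operator appends the Scala binary-version suffix (e.g. `_2.12`) to the artifact name.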
