Spark is installed on WSL.
Java is installed on WSL.
Jupyter is running on WSL.
The following code was run in a browser on Windows:
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .master('spark://ZZHPC.localdomain:7077')
         .appName('Read Data from PostgreSQL')
         .getOrCreate())
25/11/22 20:06:34 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

TypeError: 'JavaPackage' object is not callable
Possible remaining root causes:
1️⃣ Spark master unreachable (most common)
You're using:
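.master('spark://ZZHPC.localdomain:7077')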
WSL hostname resolution is often wrong.
Run in WSL:
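For example (any check that shows what the hostname resolves to will do):

hostname -i                      # the IP your hostname resolves to
getent hosts ZZHPC.localdomain   # or query the resolver directly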
If it resolves to 127.0.1.1 → broken.
Then try:
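hostname -I   # prints the real WSL (eth0) IP address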
you must connect with:
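.master('spark://<IP-from-hostname-I>:7077')   # sketch: use the IP printed by hostname -I instead of the hostname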
Hostname mismatch → Java gateway fails → JavaPackage error.
2️⃣ No worker registered
Open the Spark master UI, by default at http://ZZHPC.localdomain:8080 (or use the IP version of the URL).
Check whether any worker is listed (Workers = 0 means no worker is registered).
Start worker:
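Something like this, assuming a standard Spark layout:

$SPARK_HOME/sbin/start-worker.sh spark://ZZHPC.localdomain:7077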
Or using IP:
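$SPARK_HOME/sbin/start-worker.sh spark://<WSL-IP>:7077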
3️⃣ Spark version mismatch (common in WSL setups)
Run:
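spark-submit --version   # installed Spark version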
and
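python3 -c "import pyspark; print(pyspark.__version__)"   # installed PySpark version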
If versions differ → classloader breaks → JavaPackage errors.
4️⃣ Python kernel in Jupyter is NOT your WSL Python
This happens if you selected a Windows Python kernel accidentally.
Inside a Jupyter cell, run:
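import platform, sys
platform.system(), sys.executable   # OS name and the Python interpreter path of this kernel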
You should see 'Linux' and a Python path inside WSL (something like /home/<user>/...).
If you see 'Windows' or a C:\... path,
→ you are using a Windows kernel (wrong!) even though Jupyter was started in WSL.
This will DEFINITELY produce the 'JavaPackage' error.
Fix:
Select the correct kernel:
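If the WSL Python does not show up in the kernel list, one way to register it (a sketch, assuming a venv named zvenv) is:

# inside WSL, with the venv activated
pip install ipykernel
python -m ipykernel install --user --name zvenv --display-name "Python (WSL zvenv)"

Then pick that kernel from the notebook's Kernel menu.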
🎯 Let’s run 3 quick diagnostics to identify the real cause
Please run these inside your notebook:
Test 1: Check OS of the Python kernel
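import platform, sys
platform.system(), sys.executable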
Test 2: Check if PySpark can find the Java gateway
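A minimal sketch, one way to exercise the py4j gateway directly (an error here points at Java/PySpark rather than the network):

from pyspark.java_gateway import launch_gateway

gw = launch_gateway()   # starts the JVM-side gateway; raises if Java or the PySpark jars are broken
print(gw.jvm)           # should print a JVMView object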
Test 3: Check Spark master connectivity
In WSL terminal:
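ping -c 2 ZZHPC.localdomain
ping -c 2 ZZHPC
hostname -I   # the WSL machine's real IP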
Then try:
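Connecting with the IP from the Spark WARN message instead of the hostname, e.g.:

.master('spark://10.255.255.254:7077')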
📌 Summary
Since the Jupyter server is in WSL, you are very close to a working setup.
The remaining likely causes are:
- Wrong Python kernel (Windows kernel selected)
- Wrong master URL (hostname resolves badly)
- Worker not running
- Spark version mismatch
Send me the results of the 3 tests,
and I’ll pinpoint the exact cause in one message.
You said:
'ping ZZHPC.localdomain' and 'ping ZZHPC' both succeeded. The output of 'hostname -I' is 172.26.76.164. Connecting with '.master("spark://10.255.255.254:7077")' also failed.
ChatGPT said:
Thanks — your results tell me something very important:
- Your WSL machine IP is 172.26.76.164
- The Spark WARN message earlier showed Spark was using 10.255.255.254
- Your hostname ZZHPC.localdomain resolves correctly (so DNS works)
- But the cluster IP (10.255.255.254) does not match your WSL IP (172.26.x.x)
This means:
🚨 Your Spark master is binding to the wrong network interface.
This is the real root cause.
And when PySpark tries to talk to that wrong address, SparkContext fails to initialize, leading to TypeError: 'JavaPackage' object is not callable, because the JVM gateway never finishes loading.
Why Spark is using the wrong IP (10.255.x.x)
Your earlier Spark logs showed Spark falling back to 10.255.255.254; this is Spark guessing a fallback interface.
But on WSL2, the correct interface is almost always eth0, which here has the IP 172.26.76.164.
So your Spark master should be spark://172.26.76.164:7077.
But Spark is currently binding to 10.255.255.254, which is an internal interface that PySpark cannot reach.
✅ Solution: Force Spark to bind to the correct IP
Edit (or create) this file:
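On a standard install this is $SPARK_HOME/conf/spark-env.sh (here /opt/spark-3.5.7-bin-hadoop3/conf/spark-env.sh); a minimal sketch that forces the eth0 IP:

# conf/spark-env.sh
export SPARK_MASTER_HOST=172.26.76.164   # bind the standalone master to the WSL eth0 IP
export SPARK_LOCAL_IP=172.26.76.164      # make Spark use this IP for its local bindings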
Save it.
Then restart Spark:
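For example, with the standard standalone scripts:

$SPARK_HOME/sbin/stop-worker.sh
$SPARK_HOME/sbin/stop-master.sh
$SPARK_HOME/sbin/start-master.sh
$SPARK_HOME/sbin/start-worker.sh spark://172.26.76.164:7077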
Verify:
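Open the master web UI, by default at http://172.26.76.164:8080.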
You should now see the master UI showing Workers = 1.
Then connect PySpark like this:
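from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .master('spark://172.26.76.164:7077')   # the WSL eth0 IP instead of the hostname
         .appName('Read Data from PostgreSQL')
         .getOrCreate())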
Now SparkSession should start normally.
🎯 Summary
Your Spark master was binding to a wrong/invalid interface (10.255.x.x).
Fix by forcing the bind address in conf/spark-env.sh (see the sketch above).
Then restart Spark and workers.
After this, the Java gateway will start correctly and the 'JavaPackage' error disappears.
If you want, show me the output of 'ifconfig',
and I can confirm the exact interface Spark should bind to.
You said:
I did all the steps, but still got the TypeError: 'JavaPackage' object is not callable
ifconfig output is as below:
frank@ZZHPC:/opt/spark-3.5.7-bin-hadoop3/conf$ ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 172.26.76.164 netmask 255.255.240.0 broadcast 172.26.79.255
inet6 fe80::215:5dff:fe90:982d prefixlen 64 scopeid 0x20<link>
ether 00:15:5d:90:98:2d txqueuelen 1000 (Ethernet)
RX packets 65969 bytes 58202948 (58.2 MB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 63637 bytes 4665856 (4.6 MB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 8876 bytes 7533071 (7.5 MB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 8876 bytes 7533071 (7.5 MB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
ChatGPT said:
Thanks — the ifconfig output shows that your correct network interface is eth0 with IP 172.26.76.164, and lo is just the loopback.
Since you already set the bind address in conf/spark-env.sh and restarted the master/worker, PySpark should now be able to connect.
The fact that you still get TypeError: 'JavaPackage' object is not callable tells me the problem is no longer the network; something else in your setup is broken, most likely:
1️⃣ Windows vs WSL Python kernel
Even though Jupyter is started in WSL, you must confirm the kernel is WSL Python, not a Windows Python kernel.
In a notebook cell, run:
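import platform, sys
platform.system(), sys.executable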
If it says 'Windows', PySpark will fail with 'JavaPackage' object is not callable.
If it is Windows, change the kernel to your WSL Python kernel (Kernel → Change Kernel in Jupyter).
2️⃣ PySpark cannot find Java
Check Java inside WSL:
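java -version
echo $JAVA_HOME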
If Java is missing or JAVA_HOME is wrong → JVM gateway fails → 'JavaPackage' errors.
3️⃣ PySpark version mismatch
Run in WSL Python:
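python3 -c "import pyspark; print(pyspark.__version__)"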
Compare with:
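spark-submit --version   # or check the install directory name, e.g. /opt/spark-3.5.7-bin-hadoop3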
They must match exactly. If they don’t, the JVM fails to load classes → same error.
4️⃣ Minimal test to isolate
Try a minimal PySpark session without specifying master (local mode):
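A minimal sketch:

from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .master('local[*]')       # local mode: no standalone master involved
         .appName('local-test')    # hypothetical app name, just for the test
         .getOrCreate())
print(spark.version)

If even this fails with the same error, the problem is in the Python/Java/PySpark installation, not the cluster.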
🔑 Most common cause at this stage
Based on everything:
- You are in WSL, network is fixed → master should be reachable
- Java exists in WSL → verify $JAVA_HOME
- Most likely culprit: PySpark in your notebook is using a Windows Python kernel, not WSL Python.
Fix that first — this alone almost always resolves 'JavaPackage' object is not callable.
If you want, I can give you exact steps to ensure Jupyter in WSL uses the correct Python kernel so PySpark connects to the WSL Spark master properly.
Do you want me to do that?
You said:
Check result of Step 1:
('Linux', '/home/frank/zvenv/bin/python3')
Check result of Step 2:
openjdk version "17.0.16" 2025-07-15
OpenJDK Runtime Environment (build 17.0.16+8-Ubuntu-0ubuntu124.04.1)
OpenJDK 64-Bit Server VM (build 17.0.16+8-Ubuntu-0ubuntu124.04.1, mixed mode, sharing)
/usr/lib/jvm/java-17-openjdk-amd64
Check result of Step 3:
(zvenv) frank@ZZHPC:~$ python3 -c "import pyspark; print(pyspark.__version__)"
4.0.1
So the cause is probably version mismatch. I'll install pyspark-3.5.7.
ChatGPT said:
Exactly, that explains everything.
Your setup:
| Component | Version |
| --- | --- |
| Spark | 3.5.7 |
| PySpark | 4.0.1 |
| Java | 17.0.16 |
PySpark 4.0.1 is meant for Spark 4.x.
Using it with Spark 3.5.7 causes the JVM class mismatch → 'JavaPackage' object is not callable even before you connect to any data.
✅ Correct fix
Install the PySpark version that matches your Spark installation (3.5.7):
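For example, inside the venv:

pip uninstall -y pyspark
pip install pyspark==3.5.7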
💡 Tip: Always make sure the PySpark version matches the installed Spark version.
Even a minor mismatch can break the JVM initialization.
You said:
The error is gone. Thank you very much.
ChatGPT said:
Perfect! 🎉
Glad it’s finally working.