parent df2a7dcef0
commit cda8967416
1 changed files with 10 additions and 6 deletions
@@ -4,12 +4,11 @@
### **OSError: Unable to load libhdfs: ./libhdfs.so: cannot open shared object file: No such file or directory**
### **OSError: Unable to load libhdfs: ./libhdfs.so: cannot open shared object file: No such file or directory**
This error occurs while running Orca with `yarn-client` mode on Cloudera, where PyArrow failed to locate `libhdfs.so` in default path of `$HADOOP_HOME/lib/native`. To solve this, we need to set the path of `libhdfs.so` in Cloudera to the environment variable of `ARROW_LIBHDFS_DIR` on spark executors.
This error occurs while running Orca TF2 Estimator with the Spark backend on YARN on Cloudera, where PyArrow fails to locate `libhdfs.so` in the default path `$HADOOP_HOME/lib/native`.
To solve this issue, set the `ARROW_LIBHDFS_DIR` environment variable on the Spark driver and executors to the Cloudera directory that contains `libhdfs.so`, using the following steps:
You could follow below steps:
1. Run `locate libhdfs.so` on the client node to find `libhdfs.so`
2. `export ARROW_LIBHDFS_DIR=/opt/cloudera/parcels/CDH-5.15.2-1.cdh5.15.2.p0.3/lib64` (replace with the result of `locate libhdfs.so` in your environment)
1. use `locate libhdfs.so` to find `libhdfs.so`
2. `export ARROW_LIBHDFS_DIR=/opt/cloudera/parcels/CDH-5.15.2-1.cdh5.15.2.p0.3/lib64` (replace with the result of locate libhdfs.so)
3. If you are using `init_orca_context(cluster_mode="yarn-client")`:
3. If you are using `init_orca_context(cluster_mode="yarn-client")`:
```
```
conf = {"spark.executorEnv.ARROW_LIBHDFS_DIR": "/opt/cloudera/parcels/CDH-5.15.2-1.cdh5.15.2.p0.3/lib64"}
conf = {"spark.executorEnv.ARROW_LIBHDFS_DIR": "/opt/cloudera/parcels/CDH-5.15.2-1.cdh5.15.2.p0.3/lib64"}
@@ -17,7 +16,12 @@ You could follow below steps:
```
```
If you are using `init_orca_context(cluster_mode="spark-submit")`:
If you are using `init_orca_context(cluster_mode="spark-submit")`:
```
```
spark-submit --conf "spark.executorEnv.ARROW_LIBHDFS_DIR=/opt/cloudera/parcels/CDH-5.15.2-1.cdh5.15.2.p0.3/lib64"
# For yarn-client mode
spark-submit --conf spark.executorEnv.ARROW_LIBHDFS_DIR=/opt/cloudera/parcels/CDH-5.15.2-1.cdh5.15.2.p0.3/lib64
# For yarn-cluster mode
spark-submit --conf spark.executorEnv.ARROW_LIBHDFS_DIR=/opt/cloudera/parcels/CDH-5.15.2-1.cdh5.15.2.p0.3/lib64 \
--conf spark.yarn.appMasterEnv.ARROW_LIBHDFS_DIR=/opt/cloudera/parcels/CDH-5.15.2-1.cdh5.15.2.p0.3/lib64
```
```
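The steps above can also be sketched in Python. This is only an illustrative sketch: `find_libhdfs_dir` and `arrow_libhdfs_conf` are hypothetical helper names, not part of Orca or PyArrow, and the search root is an assumption you would replace with your own parcel directory. It locates the directory holding `libhdfs.so` and builds the Spark properties shown in the snippets above.

```python
from pathlib import Path
from typing import Dict, Optional


def find_libhdfs_dir(search_root: str) -> Optional[str]:
    """Return the directory of the first libhdfs.so found under search_root, or None.

    Hypothetical helper: a Python stand-in for running `locate libhdfs.so`.
    """
    for path in sorted(Path(search_root).rglob("libhdfs.so")):
        if path.is_file():
            return str(path.parent)
    return None


def arrow_libhdfs_conf(libhdfs_dir: str, yarn_cluster: bool = False) -> Dict[str, str]:
    """Build the Spark properties that expose ARROW_LIBHDFS_DIR.

    Executors always receive the variable. In yarn-cluster mode the driver runs
    inside the YARN application master, so it is set there as well.
    """
    conf = {"spark.executorEnv.ARROW_LIBHDFS_DIR": libhdfs_dir}
    if yarn_cluster:
        conf["spark.yarn.appMasterEnv.ARROW_LIBHDFS_DIR"] = libhdfs_dir
    return conf


if __name__ == "__main__":
    # Assumed search root; adjust to the parcel location in your environment.
    libhdfs_dir = find_libhdfs_dir("/opt/cloudera/parcels")
    if libhdfs_dir is None:
        print("libhdfs.so not found under the assumed search root")
    else:
        # Pass each entry to spark-submit, or hand the dict to
        # init_orca_context as in the yarn-client snippet above.
        for key, value in arrow_libhdfs_conf(libhdfs_dir, yarn_cluster=True).items():
            print(f"--conf {key}={value}")
```

Walking a large directory tree can be slow, so on a real cluster `locate` or a known parcel path is likely the quicker route. The helper is mainly useful for keeping the two properties consistent.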
## **Orca Context Issues**
## **Orca Context Issues**