parent df2a7dcef0
commit cda8967416
1 changed file with 10 additions and 6 deletions
@@ -4,12 +4,11 @@

### **OSError: Unable to load libhdfs: ./libhdfs.so: cannot open shared object file: No such file or directory**

-This error occurs while running Orca with `yarn-client` mode on Cloudera, where PyArrow failed to locate `libhdfs.so` in default path of `$HADOOP_HOME/lib/native`. To solve this, we need to set the path of `libhdfs.so` in Cloudera to the environment variable of `ARROW_LIBHDFS_DIR` on spark executors.
+This error occurs while running Orca TF2 Estimator with spark backend for YARN on Cloudera, where PyArrow fails to locate `libhdfs.so` in default path of `$HADOOP_HOME/lib/native`.
+To solve this issue, you need to set the path of `libhdfs.so` in Cloudera to the environment variable of `ARROW_LIBHDFS_DIR` on Spark driver and executors with the following steps:

-You could follow below steps:
-
-1. use `locate libhdfs.so` to find `libhdfs.so`
-2. `export ARROW_LIBHDFS_DIR=/opt/cloudera/parcels/CDH-5.15.2-1.cdh5.15.2.p0.3/lib64` (replace with the result of locate libhdfs.so)
+1. Run `locate libhdfs.so` on the client node to find `libhdfs.so`
+2. `export ARROW_LIBHDFS_DIR=/opt/cloudera/parcels/CDH-5.15.2-1.cdh5.15.2.p0.3/lib64` (replace with the result of `locate libhdfs.so` in your environment)
3. If you are using `init_orca_context(cluster_mode="yarn-client")`: 
   ```
   conf = {"spark.executorEnv.ARROW_LIBHDFS_DIR": "/opt/cloudera/parcels/CDH-5.15.2-1.cdh5.15.2.p0.3/lib64"}
@@ -17,7 +16,12 @@ You could follow below steps:
   ```
   If you are using `init_orca_context(cluster_mode="spark-submit")`:
   ```
-   spark-submit --conf "spark.executorEnv.ARROW_LIBHDFS_DIR=/opt/cloudera/parcels/CDH-5.15.2-1.cdh5.15.2.p0.3/lib64"
+   # For yarn-client mode
+   spark-submit --conf spark.executorEnv.ARROW_LIBHDFS_DIR=/opt/cloudera/parcels/CDH-5.15.2-1.cdh5.15.2.p0.3/lib64
+
+   # For yarn-cluster mode
+   spark-submit --conf spark.executorEnv.ARROW_LIBHDFS_DIR=/opt/cloudera/parcels/CDH-5.15.2-1.cdh5.15.2.p0.3/lib64 \
+                --conf spark.yarn.appMasterEnv.ARROW_LIBHDFS_DIR=/opt/cloudera/parcels/CDH-5.15.2-1.cdh5.15.2.p0.3/lib64
   ```
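
The steps above can also be sketched in Python. This is only an illustrative sketch, not part of the commit: the helper names `find_libhdfs_dir` and `build_arrow_conf` are hypothetical, the search root `/opt/cloudera/parcels` is an assumption taken from the example path, and passing the resulting dict as `conf` to `init_orca_context` is assumed from the `conf = {...}` snippet in step 3.

```python
import os


def find_libhdfs_dir(search_roots):
    """Return the first directory under search_roots containing libhdfs.so, or None.

    Hypothetical helper: a pure-Python stand-in for `locate libhdfs.so`.
    """
    for root in search_roots:
        # os.walk yields nothing for a root that does not exist.
        for dirpath, _dirnames, filenames in os.walk(root):
            if "libhdfs.so" in filenames:
                return dirpath
    return None


def build_arrow_conf(libhdfs_dir, yarn_cluster=False):
    """Build the Spark conf entries that expose ARROW_LIBHDFS_DIR.

    Executors always get the variable; in yarn-cluster mode the driver runs
    in the YARN application master, so it is set there as well.
    """
    conf = {"spark.executorEnv.ARROW_LIBHDFS_DIR": libhdfs_dir}
    if yarn_cluster:
        conf["spark.yarn.appMasterEnv.ARROW_LIBHDFS_DIR"] = libhdfs_dir
    return conf


if __name__ == "__main__":
    # Assumed search root; replace with wherever Cloudera is installed.
    libhdfs_dir = find_libhdfs_dir(["/opt/cloudera/parcels"])
    if libhdfs_dir is None:
        print("libhdfs.so not found; try `locate libhdfs.so` on the client node")
    else:
        # Equivalent of `export ARROW_LIBHDFS_DIR=...` for the local driver process.
        os.environ["ARROW_LIBHDFS_DIR"] = libhdfs_dir
        conf = build_arrow_conf(libhdfs_dir)
        print(conf)
        # Assumed usage, following step 3:
        # init_orca_context(cluster_mode="yarn-client", conf=conf)
```

Walking the filesystem can be slow on large installations, so running `locate libhdfs.so` once and hard-coding the result, as the steps describe, remains the simpler option.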
## **Orca Context Issues**