[LLM] Multi-process and distributed QLoRA on CPU platform

Squashed commits:
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* enable llm-init and bind to socket
* refine
* Update Dockerfile
* add all files of qlora cpu example to /bigdl
* fix
* fix k8s
* Update bigdl-qlora-finetuing-entrypoint.sh
* Update bigdl-qlora-finetuing-entrypoint.sh
* Update bigdl-qlora-finetuning-job.yaml
* fix train sync and performance issues
* add node affinity
* disable user to tune cpu per pod
* Update bigdl-qlora-finetuning-job.yaml
imageName: intelanalytics/bigdl-llm-finetune-qlora-cpu:2.5.0-SNAPSHOT
trainerNum: 2
microBatchSize: 8
nfsServerIp: your_nfs_server_ip
nfsPath: a_nfs_shared_folder_path_on_the_server
dataSubPath: alpaca_data_cleaned_archive.json # subpath of the data file under the NFS directory
modelSubPath: Llama-2-7b-chat-hf # subpath of the model directory under the NFS directory
httpProxy: "your_http_proxy_like_http://xxx:xxxx_if_needed_else_empty"
httpsProxy: "your_https_proxy_like_http://xxx:xxxx_if_needed_else_empty"
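These keys (imageName, trainerNum, nfsServerIp, ...) read like Helm chart values that get templated into the example's Kubernetes job manifest. As a hedged sketch only, deployment could look like the commands below; the chart directory, release name, and label selector are assumptions, not taken from this example.

```shell
# Hedged sketch: assumes the example ships a Helm chart in ./bigdl-qlora-finetuning
# (chart path and release name are illustrative, not from the source).
# Edit values.yaml first, or override individual values on the command line:
helm install bigdl-qlora-finetuning ./bigdl-qlora-finetuning \
  --set nfsServerIp=your_nfs_server_ip \
  --set nfsPath=a_nfs_shared_folder_path_on_the_server

# Watch the launcher and trainer pods come up (label name is an assumption):
kubectl get pods -l app=bigdl-qlora-finetuning
```

With trainerNum set to 2 and microBatchSize set to 8, two worker pods each process 8 samples per step; assuming no gradient accumulation, that gives an effective global batch of 16.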