* add langchain rag on gpu
* add rag example in readme
* add trust_remote_code in TransformersEmbeddings.from_model_id
* add trust_remote_code in TransformersEmbeddings.from_model_id in cpu
* first draft
* fix
* add readme for transformer_int4_gpu
* fix doc
* check device_map
* add arc ut test
* fix ut test
* fix langchain ut
* Refine README
* fix gpu mem too high
* fix ut test
---------
Co-authored-by: Ariadne <wyn2000330@126.com>