Text Embeddings Inference
Text Embeddings Inference (TEI) is a toolkit for deploying and serving open-source text embedding and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5.
To use it within LangChain, first install huggingface-hub.
!pip install huggingface-hub -q
Then expose an embedding model using TEI. For instance, using Docker, you can serve BAAI/bge-large-en-v1.5 as follows:
model=BAAI/bge-large-en-v1.5
revision=refs/pr/5
volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run
docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:0.6 --model-id $model --revision $revision
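Before wiring the endpoint into LangChain, you can sanity-check that the server is up: TEI exposes a POST /embed route that accepts a JSON body with an `inputs` field and returns one embedding vector per input. A minimal standard-library sketch (the localhost URL assumes the Docker command above is running):

```python
import json
from urllib.request import Request, urlopen

TEI_URL = "http://localhost:8080"  # assumes the container started above

def build_payload(texts):
    """TEI's /embed route expects a JSON body with an `inputs` field."""
    return json.dumps({"inputs": texts}).encode("utf-8")

def embed(texts):
    """POST texts to /embed; returns one embedding vector per input text."""
    req = Request(
        f"{TEI_URL}/embed",
        data=build_payload(texts),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(req) as resp:
        return json.loads(resp.read())

# e.g. embed(["What is deep learning?"]) returns a list containing one
# vector of the model's hidden dimension (requires the server to be running)
```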
Finally, instantiate the client and embed your texts.
from langchain.embeddings import HuggingFaceHubEmbeddings
from getpass import getpass
huggingfacehub_api_token = getpass("Enter your HF API Key:\n\n")
Enter your HF API Key:
········
embeddings = HuggingFaceHubEmbeddings(
model="http://localhost:8080", huggingfacehub_api_token=huggingfacehub_api_token
)
text = "What is deep learning?"
query_result = embeddings.embed_query(text)
query_result[:3]
[0.018113142, 0.00302585, -0.049911194]
doc_result = embeddings.embed_documents([text])
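Since embed_documents returns one vector per input text, query and document embeddings can be compared directly, for example with cosine similarity. A plain-Python sketch, independent of any vector store:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# The query and the single document above are the same text, so their
# embeddings should be (near-)identical:
# cosine_similarity(query_result, doc_result[0]) is approximately 1.0
```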