WatsonxLLM
WatsonxLLM is a wrapper for IBM watsonx.ai foundation models. This example shows how to communicate with watsonx.ai models using LangChain.
Install the package ibm_watson_machine_learning.
%pip install ibm_watson_machine_learning
This cell defines the WML credentials required to work with watsonx Foundation Model inferencing.
Action: Provide the IBM Cloud user API key. For details, see documentation.
import os
from getpass import getpass
watsonx_api_key = getpass()
os.environ["WATSONX_APIKEY"] = watsonx_api_key
Load the model
You might need to adjust model parameters for different models or tasks. For details, refer to the documentation.
from ibm_watson_machine_learning.metanames import GenTextParamsMetaNames as GenParams
parameters = {
    GenParams.DECODING_METHOD: "sample",
    GenParams.MAX_NEW_TOKENS: 100,
    GenParams.MIN_NEW_TOKENS: 1,
    GenParams.TEMPERATURE: 0.5,
    GenParams.TOP_K: 50,
    GenParams.TOP_P: 1,
}
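For instance, extraction or classification tasks often work better with deterministic output. Below is a minimal sketch of a greedy-decoding parameter set; the values are illustrative, not tuned.
# Illustrative alternative: greedy decoding for more deterministic output
greedy_parameters = {
    GenParams.DECODING_METHOD: "greedy",
    GenParams.MAX_NEW_TOKENS: 50,
    GenParams.REPETITION_PENALTY: 1.1,
}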
Initialize the WatsonxLLM class with the previously set parameters.
from langchain.llms import WatsonxLLM
watsonx_llm = WatsonxLLM(
    model_id="google/flan-ul2",
    url="https://us-south.ml.cloud.ibm.com",
    project_id="***",
    params=parameters,
)
Alternatively, you can use Cloud Pak for Data credentials. For details, see documentation.
watsonx_llm = WatsonxLLM(
    model_id="google/flan-ul2",
    url="***",
    username="***",
    password="***",
    instance_id="openshift",
    version="4.8",
    project_id="***",
    params=parameters,
)
Create Chain
Create a PromptTemplate object that will be responsible for creating a
random question.
from langchain.prompts import PromptTemplate
template = "Generate a random question about {topic}: Question: "
prompt = PromptTemplate.from_template(template)
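To inspect the exact string the chain will send to the model, you can render the template yourself; the topic value here is just an example.
# Render the template with a sample topic to preview the final prompt
prompt.format(topic="dog")
# 'Generate a random question about dog: Question: '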
Provide a topic and run the LLMChain.
from langchain.chains import LLMChain
llm_chain = LLMChain(prompt=prompt, llm=watsonx_llm)
llm_chain.run("dog")
'How many breeds of dog are there?'
Calling the Model Directly
To obtain completions, you can call the model directly using a string prompt.
# Calling a single prompt
watsonx_llm("Who is man's best friend?")
'dog'
# Calling multiple prompts
watsonx_llm.generate(
    [
        "The fastest dog in the world?",
        "Describe your chosen dog breed",
    ]
)
LLMResult(generations=[[Generation(text='greyhounds', generation_info={'generated_token_count': 4, 'input_token_count': 8, 'finish_reason': 'eos_token'})], [Generation(text='The Basenji is a dog breed from South Africa.', generation_info={'generated_token_count': 13, 'input_token_count': 7, 'finish_reason': 'eos_token'})]], llm_output={'model_id': 'google/flan-ul2'}, run=[RunInfo(run_id=UUID('03c73a42-db68-428e-ab8d-8ae10abc84fc')), RunInfo(run_id=UUID('c289f67a-87d6-4c8b-a8b7-0b5012c94ca8'))])
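The returned LLMResult holds one list of Generation objects per input prompt. A short sketch of extracting the plain completion texts (the result variable name is illustrative):
result = watsonx_llm.generate(
    [
        "The fastest dog in the world?",
        "Describe your chosen dog breed",
    ]
)
# result.generations is a list of lists: one inner list per prompt.
# Take the first generation's text for each prompt.
completions = [generations[0].text for generations in result.generations]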
Streaming the Model output
You can stream the model output.
for chunk in watsonx_llm.stream(
    "Describe your favorite breed of dog and why it is your favorite."
):
    print(chunk, end="")
The golden retriever is my favorite dog because it is very friendly and good with children.
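If you also need the complete text after streaming finishes, a simple pattern is to accumulate the chunks as they arrive (a sketch; for LLMs each chunk is a plain string):
# Collect streamed chunks while printing them, then join into the full text
chunks = []
for chunk in watsonx_llm.stream(
    "Describe your favorite breed of dog and why it is your favorite."
):
    chunks.append(chunk)
    print(chunk, end="")
full_response = "".join(chunks)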