import os
# Modern import paths; the OpenAI wrapper requires the langchain-openai package
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain.text_splitter import CharacterTextSplitter
from langchain_openai import OpenAI

# Read the OpenAI API key from the environment
# (the OpenAI client also picks up OPENAI_API_KEY automatically)
openai_api_key = os.environ["OPENAI_API_KEY"]

# Sample text to be split and summarized
sample_text = (
    "Chakalaka. "
    "Marinated chicken thighs with fenugreek and ginger. "
    "Roast chicken with pepper and olives, and marinated strawberries."
)

# Define the text splitter
text_splitter = CharacterTextSplitter(
    separator=' ',        # split on spaces
    chunk_size=50,        # maximum characters per chunk
    chunk_overlap=10,     # characters shared between consecutive chunks
    length_function=len   # measure chunk length in characters
)

# Split the text into chunks
text_chunks = text_splitter.split_text(sample_text)

# Define a simple prompt template for summarization
prompt_template = PromptTemplate(
    input_variables=["text_chunk"],
    template="Summarize the following text: {text_chunk}"
)

# Initialize the OpenAI language model with its default completion model
llm = OpenAI()

# Create an LLM chain that uses the language model and the prompt template
llm_chain = LLMChain(llm=llm, prompt=prompt_template)

# Process each text chunk with the LLM chain to get summaries
summaries = [llm_chain.run({"text_chunk": chunk}) for chunk in text_chunks]

# Print the summaries
for i, summary in enumerate(summaries):
    print(f"Summary of Chunk {i+1}:")
    print(summary)
    print('-' * 40)
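It may help to see what each request actually contains: the chain fills the template with a chunk before calling the model. Plain Python string formatting shows the same substitution (the template text is copied from above; the chunk is illustrative):

```python
# The template used by the chain, with one placeholder variable
template = "Summarize the following text: {text_chunk}"

# One example chunk substituted into the template
chunk = "Chakalaka. Marinated chicken thighs with fenugreek and ginger."
prompt = template.format(text_chunk=chunk)
print(prompt)
# → Summarize the following text: Chakalaka. Marinated chicken thighs with fenugreek and ginger.
```

This is all the prompt template contributes per chunk; the chain then sends the filled-in string to the model.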

Explanation

  • Import Modules: We import the classes we need from LangChain: the text splitter, the prompt template, the OpenAI LLM wrapper, and the chain.
  • Set API Key: The OpenAI API key is read from the OPENAI_API_KEY environment variable.
  • Sample Text: A short sample text is provided for processing.
  • Text Splitter: CharacterTextSplitter splits the text on spaces into chunks of at most 50 characters, with 10 characters of overlap between consecutive chunks; len measures chunk length in characters.
  • Prompt Template: A simple prompt template instructs the language model to summarize each chunk.
  • OpenAI LLM: The OpenAI language model is initialized with its default settings.
  • LLM Chain: An LLMChain combines the language model with the prompt template.
  • Process Text Chunks: Each chunk is run through the chain to produce a summary.
  • Print Summaries: The summary for each chunk is printed, separated by a divider line.
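The chunking step above can be sketched in plain Python. The following greedy word-packing splitter is a simplified stand-in for CharacterTextSplitter, not its actual implementation, and split_with_overlap is a hypothetical helper name:

```python
def split_with_overlap(text, chunk_size=50, chunk_overlap=10):
    """Pack space-separated words into chunks of at most chunk_size
    characters, carrying roughly chunk_overlap characters of trailing
    words into the next chunk so context spans chunk boundaries."""
    words = text.split()
    chunks, current = [], []
    for word in words:
        candidate = " ".join(current + [word])
        if current and len(candidate) > chunk_size:
            # Current chunk is full: emit it, then seed the next chunk
            # with trailing words totalling at most chunk_overlap chars.
            chunks.append(" ".join(current))
            overlap, length = [], 0
            for w in reversed(current):
                if length + len(w) > chunk_overlap:
                    break
                overlap.insert(0, w)
                length += len(w) + 1  # +1 for the joining space
            current = overlap
        current.append(word)
    if current:
        chunks.append(" ".join(current))
    return chunks

text = ("Chakalaka. Marinated chicken thighs with fenugreek and ginger. "
        "Roast chicken with pepper and olives, and marinated strawberries.")
for c in split_with_overlap(text):
    print(repr(c))
```

The overlap means the tail of one chunk reappears at the head of the next, which helps the summarizer keep continuity across chunk boundaries at the cost of some duplicated tokens.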