import os
from langchain import OpenAI, PromptTemplate, LLMChain
from langchain.text_splitter import CharacterTextSplitter
# Read the OpenAI API key from the environment (OpenAI() picks it up automatically)
openai_api_key = os.environ["OPENAI_API_KEY"]
# Sample text to be split and summarized
sample_text = (
    "Chakalaka. "
    "Marinated chicken thighs with fenugreek and ginger. "
    "Roast chicken with pepper and olives, and marinated strawberries."
)
# Define the text splitter
text_splitter = CharacterTextSplitter(
    separator=' ',
    chunk_size=50,
    chunk_overlap=10,
    length_function=len,
)
# Split the text into chunks
text_chunks = text_splitter.split_text(sample_text)
# Define a simple prompt template for summarization
prompt_template = PromptTemplate(
    input_variables=["text_chunk"],
    template="Summarize the following text: {text_chunk}",
)
# Initialize the OpenAI language model
llm = OpenAI()  # pass model_name=... to select a specific completion model
# Create an LLM chain that uses the language model and the prompt template
llm_chain = LLMChain(llm=llm, prompt=prompt_template)
# Process each text chunk with the LLM chain to get summaries
summaries = [llm_chain.run({"text_chunk": chunk}) for chunk in text_chunks]
# Print the summaries
for i, summary in enumerate(summaries):
    print(f"Summary of Chunk {i+1}:")
    print(summary)
    print('-' * 40)
Explanation
- Import Modules: We import the necessary modules from LangChain and OpenAI.
- Set API Key: The OpenAI API key is set in the environment.
- Sample Text: A sample text is provided for processing.
- Text Splitter: We use CharacterTextSplitter to split the text into chunks, specifying parameters like separator, chunk size, chunk overlap, and length function (a standalone sketch of how these parameters interact follows this list).
- Prompt Template: A simple prompt template is defined to instruct the language model to summarize the text (a quick way to preview the rendered prompt is sketched after this list).
- OpenAI LLM: The OpenAI language model is initialized.
- LLM Chain: An LLM chain is created that combines the language model and the prompt template.
- Process Text Chunks: Each text chunk is processed through the LLM chain to generate summaries (an equivalent pipeline using the newer runnable API is sketched after this list).
- Print Summaries: The summaries are printed for each chunk.
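To see how chunk_size and chunk_overlap interact, the standalone sketch below splits the same sample text and prints each chunk with its length. No API key is needed, and the exact boundaries depend on where the space separators fall.

from langchain.text_splitter import CharacterTextSplitter

# Same splitter settings as above: chunks of at most 50 characters,
# with up to 10 characters shared between consecutive chunks
splitter = CharacterTextSplitter(
    separator=' ',
    chunk_size=50,
    chunk_overlap=10,
    length_function=len,
)

chunks = splitter.split_text(
    "Chakalaka. Marinated chicken thighs with fenugreek and ginger. "
    "Roast chicken with pepper and olives, and marinated strawberries."
)

# Inspect chunk boundaries and sizes
for i, chunk in enumerate(chunks, start=1):
    print(f"Chunk {i} ({len(chunk)} chars): {chunk!r}")

Because the overlap repeats the tail of one chunk at the head of the next, each summary prompt carries a little shared context across chunk boundaries.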
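To confirm exactly what string the model receives, the template can be rendered locally with PromptTemplate.format; this makes no API call:

from langchain import PromptTemplate

prompt_template = PromptTemplate(
    input_variables=["text_chunk"],
    template="Summarize the following text: {text_chunk}",
)

# Fill in the placeholder without calling the model
print(prompt_template.format(text_chunk="Chakalaka."))
# Prints: Summarize the following text: Chakalaka.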
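Finally, note that LLMChain and its run method are deprecated in recent LangChain releases. If you are on LangChain 0.1 or later, a minimal equivalent sketch using runnable (LCEL) composition looks like the following; it assumes the langchain-openai package is installed:

from langchain_core.prompts import PromptTemplate
from langchain_openai import OpenAI

prompt = PromptTemplate.from_template("Summarize the following text: {text_chunk}")
llm = OpenAI()  # reads OPENAI_API_KEY from the environment

# The | operator composes prompt and model into one runnable chain
chain = prompt | llm

# invoke() plays the role of LLMChain.run in the newer API
print(chain.invoke({"text_chunk": "Chakalaka."}))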