Skip to main content
HomeBlogArtificial Intelligence (AI)

5 Projects You Can Build with Generative AI Models (with examples)

Learn how to use Generative AI models to create an image editor, ChatGPT-like chatbot on low resources, loan approval classifier app, automate PDF Interactions, and GPT-powered voice assistant.
Updated Apr 2023  · 10 min read

An AI juggles tasks

In this post, we will be learning about different types of generative models that we can use to create interesting projects. Moreover, we will also learn how we can use ChatGPT to create an end-to-end data science project. 

Five generative AI projects will improve your machine learning and data science portfolio and make you an attractive candidate for the job application. They will also help you understand the latest tools like Stable Diffusion Inpainting, Segment Anything, Stanford Alpaca, LoRA, LangChain, OpenAI API, and Whisper.

You can learn about the GPT series, including GPT-1, GPT-2, GPT-3, and GPT-4, by reviewing the article 'What is GPT-4 and Why Does it Matter?' You can also view some other AI projects in a separate post. 

1. StableSAM: Stable Diffusion Inpainting with Segment Anything

In this project, you will use Meta’s segment-anything, Hugging Face diffusers, and Gradio to create an app that can change the background, face, clothes or anything you select. It just wants the image, selected area, and prompt. 

Create Stable Diffusion Inpainting Pipeline

We will create a Stable Diffusion Inpainting pipeline using diffuser and model weights available on stabilityai/stable-diffusion-2-inpainting · Hugging Face. After that, we will add it to “cuda'' for GPU acceleration. 

Defining Image Mask and Inpainting Function

The image mask function is created using SAM Predictor. It takes an image, selected image section, and is_backgroud boolean value to create masked image and segmentation. 

After that, the inpainting function uses the Stable Diffusion Inpainting pipeline to change selected parts of the images. The pipelines require input image, masked image, segmented image, prompt text, and negative prompt text.  

Creating Gradio UI

You will create a row and add three image blocks. For a minimal viable product, you have to add another row with a submit button. After that, you have to modify the input image object to select pixels, generate mask and segmentation, and add action to the submit button to run the inpainting function.

Gradio is quite easy to learn. You can learn everything by reading the Gradio Docs

Improved Version of StableSAM

The improved version of StableSAM is available at hugging face, which includes a customized inpainting pipeline that uses ControlNet. It uses Runway ML Stable Diffusion Inpainting instead of Stability AI. 

StableSAM package

As you can see, the final version app looks clean with a segmentation block, clean button, background option, and negative prompt. 

StableSAM demo

Resources: 

2. Alpaca-LoRA: Build ChatGPT-like with Minimal Resource 

The Alpaca-LoRA provides all of the necessary components for you to create your own specialized ChatGPT-like chatbot using a single GPU. 

In this section, we will look at the initial setup, training, inference script, and native client for running inference on the CPU.

Local Setup

  1. Clone repository: tloen/alpaca-lora
  2. Install dependencies using pip install -r requirements.txt
  3. If bitsandbytes doesn't work, install it from the source

Training

In this part, we will look at the fine-tuning script that you can run on the LLaMA model using the cleaned Stanford Alpaca model. 

You can look at the repository to tweak the hyperparameters for better performance.

python finetune.py \
    --base_model 'decapoda-research/llama-7b-hf' \
    --data_path 'yahma/alpaca-cleaned' \
    --output_dir './lora-alpaca'

Inference

The inference script reads the foundation LLaMA model from Hugging Face and loads LoRA weights to run a Gradio interface. 

python generate.py \
    --load_8bit \
    --base_model 'decapoda-research/llama-7b-hf' \
    --lora_weights 'tloen/alpaca-lora-7b'

You can also use alpaca.cpp for running alpaca models on CPU or 4GB RAM Raspberry Pi 4. Furthermore, you can use Alpaca-LoRA-Serve to create a ChatGPT-style interface, as shown below. 

Alpaca-LoRA-Serve Screenshot

Resources:

3. Automating PDF Interaction with LangChain and ChatGPT

Create your own ChatPDF clone using LangChain’s PDF loader, OpenAI Embeddings, and GPT-3.5. You will create a chatbot that can communicate with your book, legal documentation, and other important PDF documents. 

Loading the document

We will use the LangChain document loader to load PDFs and read the contents.

from langchain.document_loaders import PyPDFLoader

pdf_path = "./paper.pdf"
loader = PyPDFLoader(pdf_path)
pages = loader.load_and_split()
print(pages[0].page_content)

Creating embeddings and Vectorization

We will create embeddings using OpenAIEmbeddings class from LangChain API. After that, pass these embeddings to the Chroma class to create a vector database for the PDF document. 

from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma

embeddings = OpenAIEmbeddings()
vectordb = Chroma.from_documents(pages, embedding=embeddings, persist_directory=".")

vectordb.persist()

Querying the PDF

We will use the ChatVectorDBChain class to interact with ChatGPT using a generated vector database. 

from langchain.chains import ChatVectorDBChain
from langchain.llms import OpenAI

pdf_qa = ChatVectorDBChain.from_llm(
    OpenAI(temperature=0.9, model_name="gpt-3.5-turbo"),
    vectordb,
    return_source_documents=True,
)

query = "What is the VideoTaskformer?"
result = pdf_qa({"question": query, "chat_history": ""})
print("Answer:")
print(result["answer"])

The next step is not mentioned in the video tutorial. You can use the Gradio framework to create a web app and share it with your colleagues and friends. 

Screenshot from youtube tutorial

Resources:

4. Bing-GPT Voice Assistant

Build your own AI-powered personal assistant just like J.A.R.V.I.S. For that, you will need OpenAI API, text-to-speech library, speech recognition library, and generative AI. 

After loading the required libraries, you have to provide the OpenAI API key:

openai.api_key = "[paste your OpenAI API key here]"

Wake word

Create a wake word function to activate the AI. In this case, the developer is using “‘bing” or “gpt”.

Speech Synthesis

Speech synthesis function provides text-to-speech inference. You can access the polly (text-to-speech) from boto3 and play the audio using pydub.

Transcription with Whisper

The speech recognition is done by openai/whisper. You just have to figure out the API to add it to your application. 

ChatBot with EdgeGPT

In the end, you will use acheong08/EdgeGPT and OpenAI API to create a chatbot. If the user uses the wake word “bing'' it will use the EdgeGPT model and else the ChatGPT model. 

If you are looking for a much simpler implementation of Voice Assistant, check out OpenAI Whisper, ChatGPT, TTS, and Gradio Web UI

Image from youtube tutorial

Resources:

5. An End-to-End Data Science Project

In this project, we will use ChatGPT to work on end-to-end loan approval classifier applications. All we need is access to the ChatGPT interface and a personal machine to run the code. 

Project Planning

In the planning phase, we will describe the dataset and what we want from it. Sometimes the answers are not perfect, but you can tweak the response by providing follow-up prompts. 

After that, we will start following the brief plan. 

Exploratory Data Analysis (EDA) 

We will ask ChatGPT to generate Python code that will load the dataset and perform exploratory data analysis with various visualization techniques. You can even ask it to interpret the results.

Feature Engineering

We have asked ChatGPT to write a feature engineering code, and amazingly it has created two features from existing features. It means that AI now fully understands the dataset. 

Preprocessing and Balancing the Data

We have an imbalanced dataset, and in this part, we will use ChatGPT to generate preprocessing and class balancing code. 

Model Selection

We have just asked ChatGPT to write the model selection code by specifying the machine-learning models. After running the code, we will select the best-performing model. 

Hyperparameter Tuning and Model Evaluation

To improve the performance, we will ask ChatGPT to write Python code for hyperparameter tuning and model evaluation, and save the best-performing model.

Creating a Web App using Gradio

We will ask ChatGPT to write Gradio app code using the saved model and preprocessing. The AI understood the input features and output results. As a result, we got a fully functioning web app. 

Deploying the Web App on Spaces

At the final step, we have asked ChatGPT to deploy a web app on space. It has provided a few steps that we can follow and deploy in a few minutes. 

Loan Approval Classifier App

This project will also teach tips on writing effective prompts, which is becoming essential for every field. 

Resources: 

Conclusion

This is just a start of what’s to come with generative AI models. The open-source community is working hard to develop tools that will help you build any type of AI. You can use these tools to even create AGI (Artificial general AI); check out Auto-GPT (experimental open-source for creating GPT-4 fully autonomous) and babyagi (AI-powered task management system). 

In this post, we have covered the projects that can be understood by all levels and require fewer resources to get started. They all use open-source tools, models, datasets, and packages that are available for anyone to use. 

If you're new to ChatGPT, consider taking an Introduction to ChatGPT course. Alternatively, if you're already familiar with generative AI, you can improve your prompting skills by reviewing the comprehensive ChatGPT Cheat Sheet for Data Science, or by checking out the following resources:

Topics
Related

You’re invited! Join us for Radar: AI Edition

Join us for two days of events sharing best practices from thought leaders in the AI space
DataCamp Team's photo

DataCamp Team

2 min

The Art of Prompt Engineering with Alex Banks, Founder and Educator, Sunday Signal

Alex and Adel cover Alex’s journey into AI and what led him to create Sunday Signal, the potential of AI, prompt engineering at its most basic level, chain of thought prompting, the future of LLMs and much more.
Adel Nehme's photo

Adel Nehme

44 min

The Future of Programming with Kyle Daigle, COO at GitHub

Adel and Kyle explore Kyle’s journey into development and AI, how he became the COO at GitHub, GitHub’s approach to AI, the impact of CoPilot on software development and much more.
Adel Nehme's photo

Adel Nehme

48 min

A Comprehensive Guide to Working with the Mistral Large Model

A detailed tutorial on the functionalities, comparisons, and practical applications of the Mistral Large Model.
Josep Ferrer's photo

Josep Ferrer

12 min

Serving an LLM Application as an API Endpoint using FastAPI in Python

Unlock the power of Large Language Models (LLMs) in your applications with our latest blog on "Serving LLM Application as an API Endpoint Using FastAPI in Python." LLMs like GPT, Claude, and LLaMA are revolutionizing chatbots, content creation, and many more use-cases. Discover how APIs act as crucial bridges, enabling seamless integration of sophisticated language understanding and generation features into your projects.
Moez Ali's photo

Moez Ali

How to Improve RAG Performance: 5 Key Techniques with Examples

Explore different approaches to enhance RAG systems: Chunking, Reranking, and Query Transformations.
Eugenia Anello's photo

Eugenia Anello

See MoreSee More