Software Development With Devin: Setup And First Pull Request (Part 1)

Discover how Devin can assist in your coding tasks. In this first tutorial, we’ll get started on an existing repo and explore the features available in the Devin environment.

Jun 26, 2025 · 12 min read

I think we’ve all heard about Devin, the “AI junior software engineer” that supposedly clones your repo, adds new features, runs the tests, and opens a pull request before you’ve finished your morning coffee.

Like all AI things these days, it’s quite hard to know how good it is without trying it yourself, so I decided to do just that. I ran Devin through every stage of the software development life cycle and wrote a series of four tutorials in the hope that they will help fellow developers get the most out of it.

You can access all the tutorials in the Devin series here:

Setup and First Pull Request (Part 1)
Shipping a Vertical Slice with Devin (Part 2)
Integrations, Testing, and CI/CD (Part 3)
Security, Deployment, Maintenance (Part 4)

Now, if your GitHub profile is a “graveyard” of forgotten repositories, you’re in good company. Mine is packed with half-finished experiments. Over the next four tutorials, we’ll put Devin to work on my abandoned fp-ts exercises repo, turning last year’s clone-and-run exercises into a modern, browser-based learning platform.

Allow me to very quickly explain my plan for the four tutorials, and then we’ll get to the hands-on part for the rest of the series.

What We’ll Cover In This Tutorial Series

We’re going to resurrect a side-project from my own GitHub graveyard, because Devin performs best when it has real code, real tests, and a bit of history to reason about, not an empty repo.

The project is fp-ts-exercises, a collection of small challenges meant to teach functional-programming concepts (like Option, Either, etc.) in TypeScript. Right now, it’s a clone-and-run repo: learners pull it down, edit files locally, and run the tests in their own terminal.
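To give a flavour of what such an exercise might look like, here is a minimal sketch. It hand-rolls a simplified Option type instead of importing fp-ts, so `Option`, `some`, `none`, `map`, and `getOrElse` are illustrative stand-ins that only loosely mirror the library, and the `safeDivide` exercise is hypothetical rather than taken from the repo.

```typescript
// Simplified stand-in for an Option type (NOT the fp-ts implementation).
type Option<A> =
  | { readonly tag: "None" }
  | { readonly tag: "Some"; readonly value: A };

const none: Option<never> = { tag: "None" };
const some = <A>(value: A): Option<A> => ({ tag: "Some", value });

// Apply f to the wrapped value, if there is one.
const map = <A, B>(fa: Option<A>, f: (a: A) => B): Option<B> =>
  fa.tag === "Some" ? some(f(fa.value)) : none;

// Unwrap the value, falling back to a default when it is missing.
const getOrElse = <A>(fa: Option<A>, fallback: A): A =>
  fa.tag === "Some" ? fa.value : fallback;

// Hypothetical exercise: divide without ever producing Infinity or NaN.
const safeDivide = (a: number, b: number): Option<number> =>
  b === 0 ? none : some(a / b);

// Turn the optional result into a message for the learner.
const render = (result: Option<number>): string =>
  getOrElse(map(result, (n) => `Result: ${n}`), "Cannot divide by zero");

console.log(render(safeDivide(10, 2))); // Result: 5
console.log(render(safeDivide(1, 0)));  // Cannot divide by zero
```

A learner's job in an exercise like this would be to fill in `safeDivide` so that the accompanying tests pass.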

My goal is to turn those exercises into a browser-based, interactive playground where people can read an explanation, tweak code in an in-page editor, and see the tests pass (or fail) instantly. Learners will also have the opportunity to log in to save their progress.

Note: I updated the repo with Devin as I wrote these tutorials. The link above shows the state of the code in April 2023, before I dropped the project. The up-to-date code can be found here.

Here’s the plan for our four parts:

Part 1 - Setup & First Pull Request

Sign up for Devin, connect GitHub.
Let Devin clean up the repo: bump dependencies, create new lint rules and scripts, and open its first PR.

Part 2 - Shipping Features

Use Devin to plan the new browser UI and API.
Build pages, set up a database, secrets, and have Devin open feature PRs.

Part 3 - Integrations, Testing, and CI/CD

Introduce Vitest/Playwright tests and let Devin keep them green.
Generate a GitHub Actions pipeline, hook tickets to Jira, and connect Devin to Slack.
Explore the Wiki feature.

Part 4 - Security, Deployment, & Maintenance

Add authentication, deploy our app, and wire Sentry for error monitoring.

By the end, the old CLI repo should be a polished web app!

Before we start the hands-on part, let me give a general introduction to Devin. Feel free to jump to the setup section if you already know what Devin is and how it works.

What Is Devin?

Devin is a cloud-hosted, fully autonomous coding agent developed by Cognition. If you drop it into a repo, it opens a sandboxed shell, editor, and browser, then plans and executes tasks end-to-end without touching your local machine.

It launched in March 2024, with Cognition’s blog post stating that Devin resolved 13.86% of real-world GitHub issues on the SWE-bench benchmark, far ahead of the previous state of the art of 1.96%. In practice, that means that on a randomly sampled subset (570 issues), it fixed 79 bugs from popular projects like Django and scikit-learn without human help.
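As a quick sanity check, the figures quoted above are consistent with each other. The sketch below only redoes the arithmetic on the numbers given in this article (79 fixes, 570 issues, 1.96% previous best); the variable names are my own.

```typescript
// Figures as quoted in this article from Cognition's launch post.
const resolvedIssues = 79;
const sampledIssues = 570;
const previousBestPercent = 1.96;

// Share of sampled issues that were resolved, as a percentage.
const resolvedPercent = (resolvedIssues / sampledIssues) * 100;

// How many times higher that is than the previous best score.
const improvementFactor = resolvedPercent / previousBestPercent;

console.log(`${resolvedPercent.toFixed(2)}% resolved`);                 // 13.86% resolved
console.log(`about ${improvementFactor.toFixed(1)}x the previous best`); // about 7.1x the previous best
```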

That’s impressive, but it’s benchmark code (which consists of tight, well-defined bugs with test suites), so the real-world mileage varies. In any case, some say that it performs as well as a junior dev.

Devin Use Cases

Devin is a great partner for:

Incremental repo upgrades: It can run the tests, bump dependencies, and iteratively fix breakages, which is great for “bring this Node 14 project up to Node 20” type of tasks.
Well-scoped feature tickets: Given a GitHub issue or Jira ticket with clear acceptance criteria, Devin’s plan-edit-test loop is pretty good.
Non-coding tasks: Regenerating docs, wiring CI templates, or bulk-formatting files might be boring for humans, but they’re perfect for an agent.

However, Devin is not very well-suited for:

Green-field architecture: It struggles if you just say, “build me a SaaS platform from scratch.” It needs scaffolds and does better when given more context.
Ambiguous product decisions: Ask it to choose between Stripe vs. Paddle for payments, and it might just pick randomly.
Massive monorepos: Context limits mean it must page files in and out, and performance tumbles on 100k-line workspaces.

How Does Devin Work?

All modern coding agents, Devin included, run a loop that mirrors a robotics control cycle:

Stage	What Devin “sees”	Key tech
Perceive	Reads code files, test logs, terminal output, browser DOM	Code indexer, log parsers
Cognize / Plan	Breaks the user prompt into a task list, reasons about order and tooling	LLM-based planner with retrieval-augmented memory
Act	Executes shell commands, edits files, clicks web UIs	Secure sandbox (Docker/VM) exposing Shell / Editor / Browser tools
Reflect / Learn	Re-runs tests, inspects diffs, updates plan (or asks you)	Self-critique prompt + vector store memory
Persist	Saves timeline & artefacts for replay or hand-off	Cloud object store + timeline UI

Devin is built as a layered stack of cooperating modules:

Everything starts with the chat interface or Slack bot/Jira ticket, where you state the goal in the form of a prompt.
The prompt is handed to a planner LLM that expands the goal into a step-by-step plan and self-critiques each step before execution.
A lightweight executor then selects the right tool for each step (shell, code editor, or headless browser), all running inside a tightly sandboxed workspace.
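The planner-to-executor hand-off described above can be sketched roughly as follows. Devin’s internals are not public, so every name here (`Tool`, `PlanStep`, `plan`, `execute`) is hypothetical, the planner is a hard-coded stub rather than an LLM, and the tools just return strings instead of touching a real sandbox.

```typescript
// Hypothetical shapes: these are illustrative, not Devin's actual types.
type Tool = "shell" | "editor" | "browser";

interface PlanStep {
  readonly description: string;
  readonly tool: Tool;
  readonly input: string;
}

// Each tool is stubbed out; a real agent would run these inside the sandbox.
const tools: Record<Tool, (input: string) => string> = {
  shell: (cmd) => `ran "${cmd}"`,
  editor: (file) => `edited ${file}`,
  browser: (url) => `opened ${url}`,
};

// Stand-in for the planner LLM: returns a fixed plan for any goal.
const plan = (_goal: string): PlanStep[] => [
  { description: "Install dependencies", tool: "shell", input: "npm install" },
  { description: "Update the manifest", tool: "editor", input: "package.json" },
  { description: "Check the docs", tool: "browser", input: "https://example.com/docs" },
];

// The executor picks the tool named by each step and records the outcome.
const execute = (steps: PlanStep[]): string[] =>
  steps.map((step) => `${step.description}: ${tools[step.tool](step.input)}`);

const timeline = execute(plan("Modernize the repo"));
timeline.forEach((entry) => console.log(entry));
```

Keeping the tool set small and typed means the executor stays trivial: all the interesting decisions live in the plan.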

That sandbox is effectively a cloud laptop: it isolates credentials and gives Devin a Bash prompt, a VS Code-style editor, and a Chrome instance it can click through.

Beneath the workspace sits a memory layer that stores vectorised snapshots of the code base plus a full replay timeline of every command, file diff, and browser tab Devin touches.

Why is this architecture so good? First, the tight feedback loop: as soon as tests fail or lints complain, Devin can iterate autonomously until the build turns green.
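That feedback loop is simple to picture in code. The sketch below is my own illustration, not Devin’s implementation: `iterateUntilGreen`, `CheckResult`, and the attempt cap are all assumptions, and the "project" is simulated with an array of outstanding failures.

```typescript
// Hypothetical result of running the project's checks (tests, lint, type-check).
interface CheckResult {
  readonly passed: boolean;
  readonly failures: string[];
}

// Run the checks, attempt a fix, and repeat until green or out of attempts.
// `attempts` counts how many times the checks were run.
function iterateUntilGreen(
  runChecks: () => CheckResult,
  applyFix: (failures: string[]) => void,
  maxAttempts: number,
): { green: boolean; attempts: number } {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = runChecks();
    if (result.passed) return { green: true, attempts: attempt };
    applyFix(result.failures);
  }
  return { green: false, attempts: maxAttempts };
}

// Simulated project with two known failures; each fix resolves one of them.
const outstanding = ["lint: unused import", "tsc: missing type"];
const outcome = iterateUntilGreen(
  () => ({ passed: outstanding.length === 0, failures: [...outstanding] }),
  () => { outstanding.shift(); },
  5,
);

console.log(outcome); // { green: true, attempts: 3 }
```

The cap on attempts matters in practice: without some budget, a loop like this could keep spending compute on a failure it cannot fix.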

Second, the sandbox design makes parallelising small chores trivial, because multiple sandboxes can run side-by-side without stepping on each other.

Third, the persistent memory allows Devin to tackle long-running migrations: the agent can keep a running to-do list of subtasks and chip away at them over hours or days, which can make bulk refactors considerably faster than doing them by hand.

Getting Started With Devin

First things first, let’s set up an account.

Let’s head to devin.ai and click “Get Started.”

The first few steps of the wizard are pretty easy to follow. I was asked to answer a few questions about myself, the company, and my team’s name. If you’re a solo dev like me, that is no problem: the wizard caters for that.

Then it is time to choose a plan. There are two options:

Core (pay-as-you-go): For most solo devs, the Core plan is the cheapest: you pay $20 up front, which buys about 9 Agent Compute Units (ACUs).
Teams: The Teams plan gives you 250 ACUs/month. This official documentation page provides detailed information on ACUs and Devin’s billing model.

The Core plan is a cheap-ish way to trial Devin, so I decided to pay $20 and see how far that got me.

I entered my payment details and got the option to set an auto-reload threshold so Devin can top up ACUs automatically instead of stopping mid-task. I didn't enable this feature and probably wouldn’t recommend doing so at that point. The option is available later in the settings, though, so you can try one task first and see what your usage is like. You can also set ACU limits per session, which is useful to avoid going over budget.

Quick maths: Core ACUs cost $2.25 each, so $20 buys just under 9 ACUs ($20 / $2.25 ≈ 8.9). The Teams subscription ($500/mo) drops the rate to $2.00 and pre-loads 250 ACUs, but you’ll only save money if you expect Devin to run for 30+ hours each month.
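To see where Teams starts to pay off in ACU terms, here is a small break-even sketch. It assumes only the prices quoted in this article ($2.25 per Core ACU, $500 per month for 250 Teams ACUs), which may have changed since, and it ignores any usage beyond the 250 included ACUs. Amounts are kept in cents to stay in whole numbers.

```typescript
// Prices in cents, as quoted in this article (assumed, may be out of date).
const CORE_CENTS_PER_ACU = 225;
const TEAMS_MONTHLY_CENTS = 50000;
const TEAMS_INCLUDED_ACUS = 250;

// Monthly cost on Core for a given ACU usage (pure pay-as-you-go).
const coreCostCents = (acus: number): number => acus * CORE_CENTS_PER_ACU;

// Effective Teams rate if all included ACUs are actually used.
const teamsCentsPerAcu = TEAMS_MONTHLY_CENTS / TEAMS_INCLUDED_ACUS;

// Smallest whole number of ACUs at which Core costs more than the Teams fee.
const breakEvenAcus = Math.floor(TEAMS_MONTHLY_CENTS / CORE_CENTS_PER_ACU) + 1;

console.log(teamsCentsPerAcu / 100);              // 2
console.log(breakEvenAcus);                       // 223
console.log(coreCostCents(breakEvenAcus) / 100);  // 501.75
```

Under these assumptions, Core remains the cheaper option for any month in which you use 222 ACUs or fewer.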

Connecting to GitHub (and optionally to Jira and Slack)

Now, the next thing we are asked to do is connect a GitHub account and give Devin access to repositories.

I chose to give Devin access only to our fp-ts exercises repository, since that's the only one it needs to know about, and you should probably do the same to avoid any unintended code modification.

Once that was done, I had the option to connect a project management system. Devin seems to support integration with Jira, GitHub Issues, and Linear.

Clicking the Jira option opens this modal, and the lines in yellow specify that it is best for Devin to use a specific service account and only have access to one workspace, even if it means you have to create a dedicated Jira account for this.

Note that it is possible to add the Jira integration later from the settings, so you don’t have to do it at this stage.

You also get the option to integrate with Slack, but I skipped it for now.

First Task: Updating an Outdated Repository

Okay, now that we’re set up, let’s jump into the fun part. Time to see how good Devin is!

Prompting

Once the initial setup is complete, this is the page we land on:

Unsurprisingly, it all starts with a prompt. I hovered over all the little icons to understand the features available, then wrote my prompt.

Prompt: A couple of years ago, I started a repo called fp-ts-exercises, aimed at helping developers learn functional programming in TypeScript using the fp-ts library. It contains a series of local exercises covering concepts like Option, Either, etc. Learners would clone the repo, solve the exercises, and run tests to check their solutions.

I’d like to modernize and relaunch this project with the following goals:

Update & Refactor the Existing Codebase

Update all dependencies and ensure compatibility with the latest version of fp-ts
Improve structure, readability, and developer experience
Replace any outdated tooling or patterns (e.g. migrate from Mocha to Vitest, or similar if appropriate)

Now, that wasn’t a great prompt, but that was the point. The output of an AI agent can only be as good as the input it receives, and a vague target can be interpreted in a variety of ways. In the case of Devin, the cost of a miscommunication is significant. If the boundaries of execution and the expected output are not clearly defined, and you set it on a semi-complex task, you’ll waste a lot of ACUs.

What I wanted to test out here was the little “magic pen” icon next to the “Send” button. This is an “Analyze prompt” feature that helps you refine your prompt before sending it, and therefore make the most of Devin’s capabilities.

When I clicked it, it rephrased parts of my prompt, but also let me know that there were opportunities for improvement.

That was great feedback, so I modified my prompt slightly to provide extra requirements. I had no linting or unit tests set up at that point, so I specified that it should run the learners’ tests on the solution of the first problem to make sure the app behaved as expected.

Prompt: I want to modernize and relaunch my fp-ts-exercises repository, which helps developers learn functional programming in TypeScript using the fp-ts library. The repository contains local exercises covering concepts like Option and Either. Learners clone the repo, solve the exercises, and run tests to check their solutions. The goal is to update the project and improve the developer experience. Please do the following:

Update all dependencies to the latest versions and ensure compatibility with the latest version of fp-ts.
Refactor the existing codebase to improve structure, readability, and the overall developer experience.
Replace any outdated tooling or patterns. For example, migrate from Mocha to Vitest, or suggest other appropriate tooling updates. Please provide a rationale for any tooling changes you propose.
Ensure that all existing exercises and tests continue to function correctly after the updates and refactoring.

You can run npm run solution -- option 01 to make sure that the tests pass and everything works as expected.

Specify what steps you will take to perform the updates and refactoring. Before making significant changes, please propose your approach and get feedback.

You can submit your changes for review in a Pull Request.

Sure enough, analyzing my prompt this time yielded a “Prompt looks good!” toast.

The interface

I sent my prompt, and Devin immediately started thinking.

The UI is a little overwhelming at first glance, but it is fairly easy to navigate once you know what everything does. Let’s break it down.

The panel with active sessions on the left is collapsible, if you prefer to dive into one session at a time.

There is a chat-like interface, where you can follow Devin’s reasoning and interact with it.

The right side of the screen is taken up by a panel with several tabs: Progress, Shell, Browser, and Editor. These tabs show the exact reasoning and steps that Devin executes across the different environments.

Clicking on each shows you the steps it took in the shell (for example, cloning the repo), in the browser (navigating to the GitHub repo’s page), and in the editor, a VS Code-like editor with all the repo’s files (which is where it will later modify our code). The Progress tab highlights the steps as it executes them.

At the top right of the screen, there is another set of tabs: Timelapse, IDE, and Browser. These are the views that you can navigate to and use. The currently selected Timelapse tab shows Devin’s progress and actions, as we explained above. The interactive timelapse feature (you can see the “Live” label and progress bar at the bottom) allows you to replay a session. The IDE tab opens an online VS Code-like IDE where you can modify your code and even commit it to GitHub.

Planning and Devin’s reasoning

Devin thought for a few minutes, analyzed the whole repo, and explained its actions at every step. Note the highlighted “Devin cloned the repo, consider setting up a Devin workspace” message. We’ll come back to that later.

It then proposed a plan, told me its confidence was high, and asked me to confirm:

I was pretty happy with that plan, so I clicked Confirm. Devin set out to do its work straight away.

Can you see the little “Agent” dropdown at the bottom of the chat interface? The other option in the menu is “Ask.” This functionality won’t interrupt Devin’s work but lets you interact with it. I decided to test it by asking it to add linting and TypeScript compilation check scripts.

The work had already started, but it told me it would make sure to add these scripts (and it did!).

The pull request

I didn’t time it exactly, but after a few minutes, it had a pull request ready, and the description was thorough and accurate.

It was a pleasant surprise that it used its initiative to do more than make sure that the first solution’s tests passed. It tested the third solution, but also the first exercise, and checked that the tests failed (as expected, since no learner completed the exercise).

I reviewed the PR and had a suspicion that my linting and TypeScript compile scripts would fail. There was, for instance, a "type":"module" missing in my package.json.

I am not sure why Devin did not run those scripts after writing them to make sure they worked. Because I didn’t ask it to do so explicitly? Or because I added these requirements after it started its work?

I decided to use the online code editor to fix Devin’s oversight, which took only a few minutes. I ran every command in the terminal, and all seemed to behave as expected following my fix.

I could have delegated the fix to Devin, but it was quicker to handle it myself. In about five minutes, I ran the scripts, confirmed the lint and compile failures, and pinpointed the cause. Another two minutes later, the patch was in place and all checks passed. Drafting a detailed prompt, waiting for Devin to run, and then reviewing its output would have taken longer.

Finding the sweet spot between human and AI effort is a subtle art: when a tweak is obviously low-hanging fruit, I’d rather make it than type “please fix this” and wait. Everyone has personal preferences, though, and your own cutoff will vary with your expertise, the task, and—let’s be honest here—your mood on the day.

Ending the session

At that point, I was done with the modernization of the repo, and everything worked as expected. But what now? How do I end a session, and am I still charged for this idle time?

The interface got a little confusing at this point. On the right-hand side, in the session menu, I had two options: put Devin to sleep or terminate the session.

I consulted the documentation, and here is what I found: Devin doesn’t use ACUs while sleeping, and it will sleep automatically after ~0.1 ACUs of inactivity.

I also asked the docs AI bot what the difference between sleep and termination was:

The answer was pretty clear. Our work here was done, so I terminated the session.

Setting Up Our Repo In Devin’s Workspace

Do you remember the highlighted message we got in the chat while Devin was planning its work? It asked us to set up the repo on Devin’s machine, so we wouldn’t have to clone it each time we wanted to work on it. But why would we want to do that?

Each new session starts with a fresh virtual machine, so without setup, Devin has to:

Figure out our codebase from scratch
Install all dependencies and tools
Learn our project structure and conventions
Waste valuable time on environment setup instead of actual work

So setting up our repo is a more cost- and time-efficient approach. It is also more consistent, since every session starts with the same configuration, and Devin knows exactly how to run, test, and lint our code.

This feature can be found in your team’s navigation panel, and the steps are, once again, fairly straightforward to follow.

We start by specifying the commands to pull the latest version of the repo:

You can enter custom commands, and clicking the “Verify Command” button will run them automatically in the terminal in front of you. You can see if they succeed or fail in seconds.

I skipped the secrets setup step, as we don’t have any for now. Installing and maintaining dependencies work the same way, and once your command has succeeded, the step is marked as complete.

The rest of the steps work similarly, and once all the steps are completed (or skipped, if not applicable), your configuration is saved.

Session Usage And Review

So far, we have completed a simple task with Devin, created a pull request, merged it, and set up our repo for future sessions. Let’s have a look at some of the data for our session. Clicking on the little icon at the top of the session opens a modal with session insights:

Our session size is XS, which, according to Devin’s size guide, is pretty good.

If you click on the “Generate Insights” button and wait a few seconds, you’ll be able to see some general information about your session, the specific steps that Devin took, and where it struggled.

I was also interested in seeing how many ACUs we had used. One strange thing I noticed was that the plans page showed I had 8.73 ACUs left, despite having already used 1.7 ACUs out of the total 9 (see the blue highlight inside the Core section).

The Usage & Limits page, however, correctly showed that we used 1.7 ACUs and had 7.3 ACUs left.

I checked back 18 hours later, and the Plans page then showed 7.3 ACUs remaining, so it does have a slight delay in updating this information. Make sure you use the Usage & Limits page as your source of truth to avoid surprises!

Was it worth it?

Let’s run the numbers.

So, 1.7 ACUs at $2.25/ACU ≈ $3.83 for this session.
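The same calculation can be written as a short script if you want to track several sessions. This is only a sketch using the figures from this session; the variable names are my own, and the values are held as integers (tenths of an ACU, cents) so that rounding to the nearest cent behaves predictably.

```typescript
// Integer units avoid floating-point surprises when rounding money.
const CENTS_PER_ACU = 225;   // $2.25 per ACU, the rate used in this article
const purchasedTenths = 90;  // 9 ACUs bought up front
const usedTenths = 17;       // 1.7 ACUs used in this session

// Cost of the session, rounded to the nearest cent.
const sessionCostCents = Math.round((usedTenths * CENTS_PER_ACU) / 10);

// Balance left after the session, still in tenths of an ACU.
const remainingTenths = purchasedTenths - usedTenths;

console.log(`Session cost: $${(sessionCostCents / 100).toFixed(2)}`); // Session cost: $3.83
console.log(`ACUs left: ${(remainingTenths / 10).toFixed(1)}`);       // ACUs left: 7.3
```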

I spent maybe 20 minutes prompting and reviewing. Doing the whole thing manually would have stretched well past that, and I’d have procrastinated every step because this is the type of task that I find particularly boring. For the price of a cappuccino, Devin took the work off my plate. I’ll make that trade any day.

What’s Next?

Okay, so we have:

Created our account
Connected our GitHub repository
Set up the repo for future sessions on Devin’s machine
Ran our first dependency upgrade task
Reviewed our session metrics and usage

Pretty good start! In the next tutorial, we’ll dive deeper and push Devin’s capabilities by asking it to plan and implement full features, complete with API and database integrations.

If you’re ready to continue, click on the second list item below to go to the second tutorial:

Setup and First Pull Request
Shipping a Vertical Slice with Devin (Part 2)
Integrations, Testing, and CI/CD (Part 3)
Security, Deployment, Maintenance (Part 4)

Author

Marie Fayard

Topics

AI Agents

Artificial Intelligence
