Software Development With Devin: Integrations, Testing, and CI/CD (Part 3)

Learn how Devin integrates with teams by managing Jira tickets, updating Slack, and running CI/CD checks with GitHub Actions before merging.

Jun 26, 2025 · 12 min read

Welcome back! By the end of the second tutorial, we had a pastel-themed fp-ts playground, a NestJS + PostgreSQL backend, and anonymous UUID progress tracking.

You can access all the tutorials in the Devin series here:

Setup and First Pull Request (Part 1)
Shipping a Vertical Slice with Devin (Part 2)
Integrations, Testing, and CI/CD (Part 3)
Security, Deployment, Maintenance (Part 4)

What we’ve done so far is great for solo hacking, but it is time to see how well Devin integrates with team workflows. In this third tutorial, we’ll look at:

Integrations: Devin will open Jira tickets, work on them, and broadcast every PR status straight to Slack.
Quality gates: We’ll add Jest unit tests for the API, Playwright end-to-end flows for the UI, and enforce 90% coverage.
CI/CD: GitHub Actions will lint, type-check, run all tests, and attach Playwright reports to pull requests before anything can be merged.

No auth and no prod deploys just yet; those will happen in Part 4!

Set Up a Slack Integration on Devin

Hooking Devin into your chat and ticketing tools is a manual process, done entirely through the Devin interface.

You can connect to Slack from the integration tab of the Devin settings, or install the “Devin AI” app from Slack’s App Directory.

The app still shows “Not approved by Slack” in the OAuth dialog. Cognition says their security review is pending, and functionality is unaffected.

Then choose a channel:

You can chat with Devin with just a mention:

And it starts a session that you can access in the UI:

By default, you get notified of the PR updates in the channel of your choice, but there are a few different notification settings that you can tweak in each session’s parameters.

Set Up a Jira Integration on Devin

To integrate with Jira, you need to create a dedicated bot user account (e.g., devin-bot@…) and link its credentials under Devin → Team ▸ Integrations ▸ Jira.

From your personal account, you can then create a new ticket and add the devin label.

Devin posts an analysis comment with a plan outline and a “Start session?” prompt. Type “yes” to let it code or remove the label to keep the ticket human-only.

Note: Devin won’t auto-move cards across your board. You or your PM must still drag them to In Progress or Done. That keeps workflow control in human hands.

Letting Devin Work From Jira Tickets

Once Slack and Jira were wired, I tried a true “agent-as-teammate” experiment and threw real tickets at Devin to see whether it could implement them without hand-holding.

The workflow I used

Here’s my workflow:

Create a ticket in my newly created Jira project and write clear acceptance criteria.
Add the devin label, which is Devin’s cue to analyse.
Devin comments with a step-by-step plan, a confidence estimate, and asks, “Start session?”

Once I reply “Yes,” the ticket shows “Session started” and links to the web IDE. When the PR lands, Devin posts “Merged ✅” back to Slack, and I move the card on the board. None of this costs ACUs until I answer “Yes.”
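To make the handshake easier to reason about, here is a minimal sketch of the flow as a small state machine. All state and event names are hypothetical and do not come from Devin's or Jira's APIs; the sketch only encodes what the steps above describe, including the point that no ACUs are spent before someone answers “Yes.”

```typescript
// Hypothetical model of the Jira -> Devin handshake described above.
// None of these names come from Devin's API; they only illustrate the flow.
type TicketState =
  | "created"
  | "analysed" // Devin has posted its plan and the "Start session?" prompt
  | "session_started" // a human replied "yes"
  | "pr_merged"
  | "human_only"; // the devin label was removed

type TicketEvent =
  | "add_devin_label"
  | "reply_yes"
  | "remove_devin_label"
  | "merge_pr";

// Allowed transitions; any event not listed leaves the state unchanged.
const transitions: Record<TicketState, Partial<Record<TicketEvent, TicketState>>> = {
  created: { add_devin_label: "analysed" },
  analysed: { reply_yes: "session_started", remove_devin_label: "human_only" },
  session_started: { merge_pr: "pr_merged" },
  pr_merged: {},
  human_only: {},
};

function next(state: TicketState, event: TicketEvent): TicketState {
  return transitions[state][event] ?? state;
}

// Replays a list of events starting from a freshly created ticket.
function replay(events: TicketEvent[]): TicketState {
  return events.reduce<TicketState>(next, "created");
}

// ACUs can only have been spent once a session exists.
function hasSpentAcus(state: TicketState): boolean {
  return state === "session_started" || state === "pr_merged";
}
```

For example, `replay(["add_devin_label"])` ends in `"analysed"`, where `hasSpentAcus` is still `false`. Moving the card on the board is deliberately absent from the model because that step stays with a human.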

Five real tickets, five wildly different outcomes

Here’s a summary of what happened for five real tickets:

Ticket	Planned work	ACUs	How Devin Actually Performed
Migrate SQLite→Postgres	Swap DB engine, run migrations	0.6	Flawless. One commit, tests stayed green.
Improve Sandpack UI and fix failing tests	UI tweaks + test reliability	5.0 (two sessions)	Insisted on switching back to SQLite, missed migration scripts, burned ACUs. I finally split UI vs tests into two prompts to finish under cap.
Show completion ticks in lists	Add ✓ badges in exercise list	0.8	One-shot success; even added optimistic UI.
Validate discovery system	End-to-end check that every file is parsed	2.6	Backend checks passed, but frontend error remained. Needed two nudges.
Remove abandoned achievement code	Delete feature flag + stale components	0.4	Devin warned “Low confidence,” then surgically removed 30 files and updated imports without a glitch.

Things that felt random

These are the things that still need improvement:

PR titles: I specified a naming pattern in every prompt, but Devin invented a new format each time.
Database loyalty: On one ticket, it migrated to Postgres, and on another, it silently re-introduced SQLite.
ACU estimates: The analysis comment claimed the Sandpack ticket would take 1.5 ACUs; in reality, it took two sessions and 5 ACUs.
Confidence vs. execution: A ticket flagged as low confidence was executed perfectly in 3 minutes. One with high confidence took 45 minutes of fiddling.

Devin on Jira is promising: two tickets closed perfectly, one with light nudging, and even the worst case only cost time, not a rollback. But consistency isn’t there yet, so tight scoping and explicit constraints are your friends.
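One way to turn an explicit constraint into something enforceable is to check it mechanically rather than trusting the prompt. Here is a minimal sketch for the PR title problem. The article does not show the naming pattern that was used, so the convention below (`<JIRA-KEY>: <type>: <summary>`) is purely a hypothetical stand-in.

```typescript
// Hypothetical convention: "<JIRA-KEY>: <type>: <summary>",
// e.g. "FP-12: feat: add completion ticks". Adapt the regex to your own rule.
const PR_TITLE_PATTERN =
  /^[A-Z][A-Z0-9]+-\d+: (feat|fix|chore|test|docs): \S.{0,70}$/;

// Returns true when the title follows the (assumed) convention.
function isValidPrTitle(title: string): boolean {
  return PR_TITLE_PATTERN.test(title);
}

// Returns only the titles that break the convention.
function findInvalidTitles(titles: string[]): string[] {
  return titles.filter((title) => !isValidPrTitle(title));
}
```

A check like this could run as a CI step or a pre-merge script, so a freshly invented title format fails fast instead of relying on review.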

Add Automated Tests With Jest And Playwright

With chat and tickets flowing, the next step was to make sure broken code can’t sneak through. I asked Devin for two things: backend unit tests and end-to-end Playwright tests that mimic a learner editing an exercise in the browser.

Backend unit tests: surprisingly painless

I asked Devin for Jest test suites covering the GraphQL resolver, service layer, and Prisma models. When I asked for an ACU estimate, it replied: 20 ACUs!

I figured that must have been an error and launched the task anyway. It cost 1.1 ACUs, and it was easily the best-executed task so far.
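The intro sets a 90% coverage target, although the pipeline built later in this article deliberately skips coverage gates. If you want Jest itself to enforce that target, a minimal sketch could look like the following. `coverageThreshold` is a standard Jest option, while the `collectCoverageFrom` glob is an assumption about the repo layout, not something taken from the project.

```typescript
// Sketch of a coverage gate for Jest. The 90% figures mirror the target
// mentioned in the intro; the glob below is an assumed source layout.
const jestConfig = {
  collectCoverage: true,
  collectCoverageFrom: ["src/**/*.ts"],
  coverageThreshold: {
    // Jest exits with a failure if any global metric drops below these values.
    global: { branches: 90, functions: 90, lines: 90, statements: 90 },
  },
};

// In a real jest.config.ts this object would be the default export:
// export default jestConfig;
```

With a threshold in place, a test run that passes every assertion still fails when coverage falls short, which is what makes it usable as a quality gate.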

Playwright e2e: red wall, green wall

This one was slightly more expensive and cost 2.3 ACUs.

The recorded flow: open /learn/option-01 → edit code → wait for ✓ → refresh page → ✓ persists.

In the first run, about 70% of assertions failed. There were resize glitches and stale dashboard counts, and even the happy path flaked.

Despite the “Ignore failing tests, we’ll fix later” command in my prompt, Devin kept patching code until the suite turned mostly green (useful, but not what I asked).

We still have some failing tests because we have quite a few bugs in the system. But that’s okay, we’ll sort things out later to make sure all these tests are green.

Add a One-Click GitHub Actions Pipeline

With unit and end-to-end tests in place, the last step was to make sure every pull request runs those checks automatically. I asked Devin for a bare-bones workflow, with no artefacts, no coverage gates, just lint → type-check → tests.

Devin delivered a surprisingly polished pipeline in one shot, with no follow-up nudges required:

Zero config drift: Devin reused existing npm scripts, so no new tooling to learn.
Parallel everything: Lint, type-check, and the two test suites run side-by-side, so the entire workflow finishes in ~4 min on GitHub’s free runners.
Clear triage: If ESLint fails but tests pass, the summary job still reports the lint error; you never merge “partially red” code.
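The article does not show the generated YAML, so the following is only a sketch of the “summary job” idea from the last point, written as a plain function. The job names are assumptions; the result values match the ones GitHub Actions reports for a job (`success`, `failure`, `cancelled`, `skipped`).

```typescript
// Assumed job names for the four parallel checks.
const JOBS = ["lint", "typecheck", "unit", "e2e"] as const;
type JobName = (typeof JOBS)[number];

// Result values GitHub Actions reports for a finished job.
type JobResult = "success" | "failure" | "cancelled" | "skipped";

interface Summary {
  ok: boolean; // true only when every job succeeded
  failed: JobName[]; // jobs that did not succeed, in pipeline order
}

// Treats anything other than "success" as blocking, so a PR can never
// merge in a "partially red" state.
function summarise(results: Record<JobName, JobResult>): Summary {
  const failed = JOBS.filter((job) => results[job] !== "success");
  return { ok: failed.length === 0, failed };
}
```

If lint fails while both test suites pass, `summarise` returns `ok: false` with `failed` containing only `"lint"`, which mirrors the triage behaviour described above.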

Devin pushed the workflow, waited for the check to complete in GitHub, and only then decided it was done. I must say, 0.4 ACU for a fully working pipeline is hard to beat. YAML is clearly Devin’s happy place.

With this workflow merged, every PR must pass lint, compile, and both test suites before anyone presses the green button!

Devin’s In-Product Wiki

Devin ships with a built-in “Wiki” that can live next to your code. It is a lightweight, auto-generated knowledge base that the agent can both read from and write to while it works. After connecting Slack, Jira, and CI, this Wiki is a good spot for architectural notes. It is worth a look!

To my knowledge, this isn’t manually editable, and you must rely on Devin to keep the Wiki updated.

Cost & Time Snapshot And Reflections

Once all the integrations, tests, and pipeline were live, I tallied the bill and the clock:

Work chunk	ACUs	Hands-on time	Notes
Slack & Jira hookup	0.0	10 min	Manual OAuth clicks; no agent time.
5 Jira tickets	9.4	2h nudge-and-review	Two tickets were okay, one needed nudges and retries and tests, one stalled on SQLite swap.
Jest unit suite (API)	1.1	5 min review	Devin’s “20 ACU” scare turned into a 1-ACU gem.
Playwright e2e suite (web)	2.3	10 min review	Devin ignored “don’t fix code,” patched until 3 tests left red.
GitHub Actions pipeline	0.4	3 min tweak	One-page YAML; green first try.
Total	13.2 ACUs	≈ 2h 30 min	≈ $30 at the Core price tier.
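To double-check the totals, here is the same table as a small calculation. The per-ACU price of $2.25 is an assumption about the Core tier at the time of writing, so check current pricing before reusing it. The “2h” for the Jira tickets is entered as 120 minutes.

```typescript
interface WorkChunk {
  name: string;
  acus: number;
  minutes: number; // hands-on human time
}

const chunks: WorkChunk[] = [
  { name: "Slack & Jira hookup", acus: 0.0, minutes: 10 },
  { name: "5 Jira tickets", acus: 9.4, minutes: 120 },
  { name: "Jest unit suite (API)", acus: 1.1, minutes: 5 },
  { name: "Playwright e2e suite (web)", acus: 2.3, minutes: 10 },
  { name: "GitHub Actions pipeline", acus: 0.4, minutes: 3 },
];

// Assumption: $2.25 per ACU on the Core tier.
const USD_PER_ACU = 2.25;

// Rounds away floating-point noise from summing decimals.
function round(value: number, digits: number): number {
  const factor = Math.pow(10, digits);
  return Math.round(value * factor) / factor;
}

const totalAcus = round(chunks.reduce((sum, c) => sum + c.acus, 0), 1); // 13.2
const totalMinutes = chunks.reduce((sum, c) => sum + c.minutes, 0); // 148
const estimatedUsd = round(totalAcus * USD_PER_ACU, 2); // 29.7
```

That gives 13.2 ACUs, 148 minutes (just under two and a half hours), and roughly $29.70, in line with the rounded figures in the table.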

So, about two and a half hours of human effort to get tickets, tests, and CI pushed. That is certainly faster than I would have managed on my own.

At this point, though, I am conflicted. There are still a lot of bugs, and if I’d written the whole codebase myself, I could probably fix issues faster than Devin burns credits.

But Devin wrote most of the app, so the agent actually “knows” the structure better than I do. Still, it struggles to replace hardcoded values with dynamic ones, it leaves loose ends in the code in every file, and it does need me to sit in front of my laptop to babysit all of its actions.

I also found the process a little frustrating, and not in the way I would be when chasing a bug I can’t grasp. There is a world of difference (to me, at least) between being frustrated that the code doesn’t work and being frustrated that an AI agent can’t follow some basic instructions. The latter is downright infuriating.

I think Devin can be very helpful when used well, but as with every AI agent out there, it can’t be a replacement for a software engineer. It is okay to use it for some tasks, but I don’t think it is very suitable or sustainable to use it for every ticket.

What’s Next?

We’re hooked up to Slack and Jira, the test suites are in place (if not yet fully green), and the CI gate blocks every sloppy PR. But for a real production launch, we are still missing three pillars:

Authentication: wire up NextAuth credentials → JWT cookies → GqlAuthGuard in NestJS so progress is tied to real users.
Security hardening: Sentry error-tracking in both web and API runtimes.
Continuous deployment: A Vercel (web) pipeline that spins up preview URLs on every branch and promotes to prod when the main branch turns green.

That’s the agenda for Part 4, the final stretch where we find out whether Devin can secure, deploy, and babysit the app with almost no human intervention. Pin your database to Postgres (again!), cap those ACUs, and I’ll see you in the last chapter.

If you’re ready to continue, click on the last list item below to go to the fourth tutorial:

Setup and First Pull Request (Part 1)
Shipping a Vertical Slice with Devin (Part 2)
Integrations, Testing, and CI/CD (Part 3)
Security, Deployment, Maintenance (Part 4)

Author

Marie Fayard

