Welcome back! By the end of the second tutorial, we had a pastel-themed fp-ts playground, a NestJS + PostgreSQL backend, and anonymous UUID progress tracking.
You can access all the tutorials in the Devin series here:
- Setup and First Pull Request (Part 1)
- Shipping a Vertical Slice with Devin (Part 2)
- Integrations, Testing, and CI/CD (Part 3)
- Security, Deployment, Maintenance (Part 4)
What we’ve done so far is great for solo hacking, but it is time to see how well Devin integrates with team workflows. In this third tutorial, we’ll look at:
- Integrations: Devin will open Jira tickets, work on them, and broadcast every PR status straight to Slack.
- Quality gates: We’ll add Jest unit tests for the API, Playwright end-to-end flows for the UI, and enforce 90% coverage.
- CI/CD: GitHub Actions will lint, type-check, run all tests, and attach Playwright reports to pull requests before anything can be merged.
No auth and no prod deploys just yet; those will happen in Part 4!
Set Up a Slack Integration on Devin
Hooking Devin into your comms and ticket flow is entirely manual and done via the Devin interface.
You can connect to Slack from the integration tab of the Devin settings, or install the “Devin AI” app from Slack’s App Directory.
The app still shows “Not approved by Slack” in the OAuth dialog. Cognition says their security review is pending, and functionality is unaffected.
Then choose a channel:
You can chat with Devin with just a mention:
And it starts a session that you can access in the UI:
By default, you get notified of the PR updates in the channel of your choice, but there are a few different notification settings that you can tweak in each session’s parameters.
Set Up a Jira Integration on Devin
To integrate with Jira, you need to create a dedicated bot user account (e.g., devin-bot@…) and link those credentials under Devin → Team ▸ Integrations ▸ Jira.
From your personal account, you can then create a new ticket and add the devin label.
Devin posts an analysis comment with a plan outline and a “Start session?” prompt. Type “yes” to let it code or remove the label to keep the ticket human-only.
Note: Devin won’t auto-move cards across your board. You or your PM must still drag them to In Progress or Done. That keeps workflow control in human hands.
Letting Devin Work From Jira Tickets
Once Slack and Jira were wired, I tried a true “agent-as-teammate” experiment and threw real tickets at Devin to see whether it could implement them without hand-holding.
The workflow I used
Here’s my workflow:
- Create a ticket in my newly created Jira project and write clear acceptance criteria.
- Add the devin label, which is Devin’s cue to analyse.
- Devin comments with a step-by-step plan and a confidence estimate, and asks, “Start session?”
- I reply “Yes”. The ticket shows “Session started” and gives the web IDE link. When the PR lands, Devin posts “Merged ✅” back to Slack, and I move the card on the board. None of this costs ACUs until I answer “Yes.”
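The first two steps can also be scripted. The sketch below builds the JSON body for Jira's REST API v2 issue-creation endpoint (POST /rest/api/2/issue) and attaches the devin label. The project key FP, the Task issue type, and the sample acceptance criterion are placeholders for illustration, not values taken from this project.

```typescript
// Shape of the request body for Jira's "create issue" endpoint (REST API v2).
interface JiraIssuePayload {
  fields: {
    project: { key: string };
    summary: string;
    description: string;
    issuetype: { name: string };
    labels: string[];
  };
}

// Build a ticket that Devin should analyse. `projectKey` is a hypothetical
// placeholder; use the key of your own Jira project.
function buildDevinTicket(
  projectKey: string,
  summary: string,
  acceptanceCriteria: string[],
): JiraIssuePayload {
  const description = [
    "Acceptance criteria:",
    ...acceptanceCriteria.map((criterion) => `- ${criterion}`),
  ].join("\n");

  return {
    fields: {
      project: { key: projectKey },
      summary,
      description,
      issuetype: { name: "Task" }, // assumed issue type
      labels: ["devin"], // the label is Devin's cue to post its analysis comment
    },
  };
}

const ticket = buildDevinTicket("FP", "Show completion ticks in lists", [
  "Completed exercises show a ✓ badge in the exercise list",
]);
console.log(JSON.stringify(ticket, null, 2));
```

Sending this body with the bot user's credentials would create the ticket. The "Start session?" confirmation still happens in the ticket comments, so nothing is spent until a human answers.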
Five real tickets, five wildly different outcomes
Here’s a summary of what happened for five real tickets:
| Ticket | Planned work | ACUs | How Devin actually performed |
| --- | --- | --- | --- |
| Migrate SQLite→Postgres | Swap DB engine, run migrations | 0.6 | Flawless. One commit, tests stayed green. |
| Improve Sandpack UI and fix failing tests | UI tweaks + test reliability | 5.0 (two sessions) | Insisted on switching back to SQLite, missed migration scripts, burned ACUs. I finally split UI vs tests into two prompts to finish under cap. |
| Show completion ticks in lists | Add ✓ badges in exercise list | 0.8 | One-shot success; even added optimistic UI. |
| Validate discovery system | End-to-end check that every file is parsed | 2.6 | Backend checks passed, but frontend error remained. Needed two nudges. |
| Remove abandoned achievement code | Delete feature flag + stale components | 0.4 | Devin warned “Low confidence,” then surgically removed 30 files and updated imports without a glitch. |
Things that felt random
These are the things that still need improvement:
- PR titles: I specified a naming pattern in every prompt, but Devin invented a new format each time.
- Database loyalty: On one ticket, it migrated to Postgres, and on another, it silently re-introduced SQLite.
- ACU estimates: The analysis comment claimed the Sandpack ticket would take 1.5 ACUs; in reality, it took two sessions and 5 ACUs.
- Confidence vs. execution: A ticket with low confidence was executed perfectly in 3 minutes. One with high confidence took 45 minutes of fiddling.
Devin on Jira is promising: two tickets closed perfectly, one with light nudging, and even the worst case only cost time, not a rollback. But consistency isn’t there yet, so tight scoping and explicit constraints are your friends.
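One way to make a constraint such as the PR naming pattern enforceable rather than advisory is to check it in code. The sketch below is a minimal example that assumes a hypothetical convention of a Jira key, a colon, and a summary. This article does not state the actual pattern I used, so treat the regular expression as a stand-in.

```typescript
// Hypothetical naming pattern: Jira key, colon, space, then a non-empty summary,
// e.g. "FP-12: Show completion ticks in lists". Adjust to your own convention.
const PR_TITLE_PATTERN = /^[A-Z][A-Z0-9]+-\d+: \S.*$/;

function isValidPrTitle(title: string): boolean {
  return PR_TITLE_PATTERN.test(title.trim());
}

console.log(isValidPrTitle("FP-12: Show completion ticks in lists")); // true
console.log(isValidPrTitle("feat: show completion ticks")); // false
```

A check like this could run as one more step in CI, so an invented title format fails fast instead of relying on the agent to remember the prompt.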
Add Automated Tests With Jest And Playwright
With chat and tickets flowing, the next step was to make sure broken code can’t sneak through. I asked Devin for two things: backend unit tests and end-to-end Playwright tests that mimic a learner editing an exercise in the browser.
Backend unit tests: surprisingly painless
I asked Devin for Jest test suites covering the GraphQL resolver, service layer, and Prisma models. When I asked for an ACU estimate, it replied 20 ACUs!!
I figured that must have been an error and launched the task anyway. It cost 1.1 ACUs, and it was easily the best-executed task so far.
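Service-layer tests tend to stay cheap when the service depends on a narrow interface that a test can swap for an in-memory fake. The sketch below illustrates that seam. ProgressRepository, ProgressService, and their methods are hypothetical names chosen for illustration; this is not the code Devin generated.

```typescript
// Hypothetical persistence interface; in a real app, a Prisma-backed class
// would implement it.
interface ProgressRepository {
  findCompleted(userId: string): Promise<string[]>;
  markCompleted(userId: string, exerciseId: string): Promise<void>;
}

class ProgressService {
  private readonly repo: ProgressRepository;

  constructor(repo: ProgressRepository) {
    this.repo = repo;
  }

  // Marks an exercise as done; returns false when it was already completed.
  async complete(userId: string, exerciseId: string): Promise<boolean> {
    const done = await this.repo.findCompleted(userId);
    if (done.indexOf(exerciseId) !== -1) return false;
    await this.repo.markCompleted(userId, exerciseId);
    return true;
  }
}

// In-memory fake used in place of the database during unit tests.
class InMemoryProgressRepository implements ProgressRepository {
  private readonly store = new Map<string, string[]>();

  async findCompleted(userId: string): Promise<string[]> {
    return this.store.get(userId) ?? [];
  }

  async markCompleted(userId: string, exerciseId: string): Promise<void> {
    this.store.set(userId, [...(this.store.get(userId) ?? []), exerciseId]);
  }
}

// The scenario a unit test would assert on: completing twice is idempotent.
async function runScenario(): Promise<boolean[]> {
  const service = new ProgressService(new InMemoryProgressRepository());
  const first = await service.complete("anon-uuid", "option-01");
  const second = await service.complete("anon-uuid", "option-01");
  return [first, second];
}
```

In a Jest suite, the body of runScenario would sit inside an it(...) block, with expect(first).toBe(true) and expect(second).toBe(false) as the assertions.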
Playwright e2e: red wall, green wall
This one was slightly more expensive and cost 2.3 ACUs.
The flow recorded: open /learn/option-01 → edit code → wait for ✓ → refresh page → ✓ persists.
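The recorded flow boils down to a short sequence of steps and two checks. The sketch below expresses it against a hypothetical ExerciseDriver interface rather than Playwright's own API, so the step order is visible without any browser setup. In a real suite, an adapter around Playwright's page object would implement these four methods.

```typescript
// Hypothetical driver abstraction (not part of Playwright).
interface ExerciseDriver {
  open(path: string): Promise<void>;
  editCode(source: string): Promise<void>;
  waitForTick(): Promise<boolean>; // resolves to true if the ✓ is shown
  refresh(): Promise<void>;
}

// open → edit code → wait for ✓ → refresh page → ✓ persists
async function completionPersists(
  driver: ExerciseDriver,
  solution: string,
): Promise<boolean> {
  await driver.open("/learn/option-01");
  await driver.editCode(solution);
  const tickAfterEdit = await driver.waitForTick();
  await driver.refresh();
  const tickAfterRefresh = await driver.waitForTick();
  return tickAfterEdit && tickAfterRefresh;
}

// Fake driver whose "backend" either saves progress or forgets it on refresh.
function fakeDriver(persistsProgress: boolean): ExerciseDriver {
  let saved = false; // what the backend remembers
  let visible = false; // what the page currently shows
  return {
    async open() {
      visible = saved;
    },
    async editCode(source) {
      visible = source.length > 0;
      saved = persistsProgress && visible;
    },
    async waitForTick() {
      return visible;
    },
    async refresh() {
      visible = saved;
    },
  };
}
```

Running completionPersists against fakeDriver(false) models the class of bug this flow is meant to catch: the tick appears after the edit but is gone after the refresh.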
In the first run, about 70% of assertions failed. There were many resize glitches, stale dashboard counts, and even the happy path flaked.
Despite the “Ignore failing tests, we’ll fix later” command in my prompt, Devin kept patching code until the suite turned mostly green (useful, but not what I asked).
We still have some failing tests because we have quite a few bugs in the system. But that’s okay, we’ll sort things out later to make sure all these tests are green.
Add a One-Click GitHub Actions Pipeline
With unit and end-to-end tests in place, the last step was to make sure every pull request runs those checks automatically. I asked Devin for a bare-bones workflow, with no artefacts, no coverage gates, just lint → type-check → tests.
Devin delivered a surprisingly polished pipeline in one shot, with no follow-up nudges required:
- Zero config drift: Devin reused existing npm scripts, so no new tooling to learn.
- Parallel everything: Lint, type-check, and the two test suites run side-by-side, so the entire workflow finishes in ~4 min on GitHub’s free runners.
- Clear triage: If ESLint fails but tests pass, the summary job still reports the lint error; you never merge “partially red” code.
Devin pushed the workflow, waited for the check to complete in GitHub, and only then decided it was done. I must say, 0.4 ACU for a fully working pipeline is hard to beat. YAML is clearly Devin’s happy place.
With this workflow merged, every PR must pass lint, compile, and both test suites before anyone presses the green button!
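The "parallel checks, one summary verdict" shape of the pipeline can be modelled in a few lines. This is a sketch of the logic only, not the workflow file Devin wrote. The check names mirror the ones above, and the check functions are stand-ins for the real npm scripts.

```typescript
type Check = { name: string; run: () => Promise<void> };
type Summary = { passed: string[]; failed: string[]; mergeable: boolean };

// Start every check at once, wait for all of them, then report one verdict.
async function runPipeline(checks: Check[]): Promise<Summary> {
  const outcomes = await Promise.all(
    checks.map(async (check) => {
      try {
        await check.run();
        return { name: check.name, ok: true };
      } catch (error) {
        return { name: check.name, ok: false };
      }
    }),
  );
  const passed = outcomes.filter((o) => o.ok).map((o) => o.name);
  const failed = outcomes.filter((o) => !o.ok).map((o) => o.name);
  // A single failed check blocks the merge, even if everything else is green.
  return { passed, failed, mergeable: failed.length === 0 };
}

// Stand-in checks: lint fails, everything else passes.
const demoChecks: Check[] = [
  { name: "lint", run: async () => { throw new Error("ESLint: 1 error"); } },
  { name: "type-check", run: async () => {} },
  { name: "unit tests", run: async () => {} },
  { name: "e2e tests", run: async () => {} },
];

runPipeline(demoChecks).then((summary) => console.log(summary));
```

Because each check's failure is caught individually, one red check never hides the results of the others, which is the triage behaviour described above.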
Devin’s In-Product Wiki
Devin ships with a built-in “Wiki” that can live next to your code. It is a lightweight, auto-generated knowledge base that the agent can both read from and write to while it works. After connecting Slack, Jira, and CI, this Wiki is a good spot for architectural notes. It is worth a look!
To my knowledge, this isn’t manually editable, and you must rely on Devin to keep the Wiki updated.
Cost & Time Snapshot And Reflections
Once all the integrations, tests, and pipeline were live, I tallied the bill and the clock:
| Work chunk | ACUs | Hands-on time | Notes |
| --- | --- | --- | --- |
| Slack & Jira hookup | 0.0 | 10 min | Manual OAuth clicks; no agent time. |
| 5 Jira tickets | 9.4 | 2h nudge-and-review | Two tickets were okay, one needed nudges and retries and tests, one stalled on SQLite swap. |
| Jest unit suite (API) | 1.1 | 5 min review | Devin’s “20 ACU” scare turned into a 1-ACU gem. |
| Playwright e2e suite (web) | 2.3 | 10 min review | Devin ignored “don’t fix code,” patched until 3 tests left red. |
| GitHub Actions pipeline | 0.4 | 3 min tweak | One-page YAML; green first try. |
| Total | 13.2 ACUs | ≈ 2h 30 min | ≈ $30 at the Core price tier. |
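The totals can be reproduced with a little arithmetic. The per-ACU price of $2.25 used below is an assumption about the Core tier, not a figure stated in this tutorial; substitute your plan's actual rate.

```typescript
const rows = [
  { chunk: "Slack & Jira hookup", acus: 0.0, minutes: 10 },
  { chunk: "5 Jira tickets", acus: 9.4, minutes: 120 },
  { chunk: "Jest unit suite (API)", acus: 1.1, minutes: 5 },
  { chunk: "Playwright e2e suite (web)", acus: 2.3, minutes: 10 },
  { chunk: "GitHub Actions pipeline", acus: 0.4, minutes: 3 },
];

// Assumed price per ACU in USD (not taken from the article).
const ASSUMED_USD_PER_ACU = 2.25;

// Round to one decimal place to hide floating-point noise.
const round1 = (value: number): number => Math.round(value * 10) / 10;

const totalAcus = round1(rows.reduce((sum, row) => sum + row.acus, 0));
const totalMinutes = rows.reduce((sum, row) => sum + row.minutes, 0);
const estimatedUsd = round1(totalAcus * ASSUMED_USD_PER_ACU);

console.log(`${totalAcus} ACUs, ${totalMinutes} min, about $${estimatedUsd}`);
```

The rows add up to 13.2 ACUs and 148 minutes, just under the 2h 30 min shown in the table. Under the assumed rate, the cost lands close to the $30 quoted.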
So, about two and a half hours of human effort to get tickets, tests, and CI pushed. That is certainly faster than I would have done it myself.
At this point, though, I am conflicted. There are still a lot of bugs, and if I’d written the whole codebase myself, I could probably fix issues faster than Devin burns credits.
But Devin wrote most of the app, so the agent actually “knows” the structure better than I do. Still, it struggles to replace hardcoded values with dynamic ones, it leaves loose ends in the code in every file, and it does need me to sit in front of my laptop to babysit all of its actions.
I also found the process a little frustrating, and not in the same way as chasing a bug I can’t grasp. There is a world of difference (to me, at least) between being frustrated that the code doesn’t work and being frustrated that an AI agent can’t follow some basic instructions. The latter is downright infuriating.
I think Devin can be very helpful when used well, but as with every AI agent out there, it can’t be a replacement for a software engineer. It is okay to use it for some tasks, but I don’t think it is very suitable or sustainable to use it for every ticket.
What’s Next?
We’re hooked up to Slack and Jira, tests are mostly green, and the CI gate blocks every sloppy PR. But for a real production launch, we are still missing a few pillars:
- Authentication: wire up NextAuth credentials → JWT cookies → GqlAuthGuard in NestJS so progress is tied to real users.
- Security hardening: Sentry error-tracking in both web and API runtimes.
- Continuous deployment: A Vercel (web) pipeline that spins up preview URLs on every branch and promotes to prod when the main branch turns green.
That’s the agenda for Part 4, the final stretch where we find out whether Devin can secure, deploy, and babysit the app with almost no human intervention. Pin your database to Postgres (again!), cap those ACUs, and I’ll see you in the last chapter.
If you’re ready to continue, head to the fourth tutorial, Security, Deployment, Maintenance (Part 4), listed in the series overview at the top of this page.
