Can a Rabbit help catching Bugs?

AI-assisted code reviews: are they actually helpful, or just more noise in an already crowded inbox?

Today I want to talk about my experience with AI-assisted code review tools like CodeRabbit and how they are helping me with my day-to-day work.

As a preface: I’m not involved with any of the tools I’ll mention; these are just the ones I’ve explored, and your experience will likely vary.

This piece is going to be extremely subjective. Code reviews are a crucial part of developer life, but they highly depend on the personalities involved and the context of the project. What works for my flow might not work for yours.

What is a code review?

When working in teams, code reviews are a necessary step to avoid biased decisions. A colleague with a fresh eye reads your changes, pokes holes in the logic, and ensures everything aligns with standards. It’s an effective way to catch bugs early, standardize code, and share knowledge.

But I have a confession: I both love and hate code reviews.

I’m not naturally great at reviewing. I lack that “pentesting” mindset, the ability to find hidden holes or think outside the box. Often, I just skim; if it looks clean, it looks fine. I also struggle with the fear of bothering a peer by asking too many questions or pointing out nitpicks.

On the flip side, having someone review my code is my ultimate safety net. It guides me to explain things better and ensures I don’t introduce stupid errors. The issue is that in 2026, we have more code and less time. Sneaky bugs pass through even when a human checks the code. Logic bugs, which require knowing the whole repository, are difficult to address. With the rise of AI-generated code, this becomes even scarier: we tend to trust AI results that might contain incredibly subtle imperfections.

But AI can also help us in amazing ways…

The Lvl 0 Filter

I’m mentioning CodeRabbit because it has become my favorite tool in our CI/CD pipeline. There are several competitors like Qodo, CodeAnt, and CodeFactor and they mostly share the same idea: an AI agent that acts as a reviewer, leaving comments and suggestions every time you open a Pull Request.

The flow is straightforward: push changes, and the bot analyzes them. This DOES NOT replace humans. These tools still miss the “real” architectural bugs. But they do an excellent job on the “first line” review: catching bad practices immediately and telling you what is wrong without fear of social repercussions, every day, at every hour.

The Experience: Signal vs. Noise

After a trial period, we rolled out CodeRabbit for our Python team, and the feedback has been very positive. As I said, this is subjective. Some teammates didn’t like the verbosity of the bot or the recurring hallucinations. For example, it once left 25 comments on a PR insisting that Pydantic v2 wasn’t enabled when it actually was.

In my opinion, I’m fine with the occasional mistake. Even if it’s wrong a lot, that 10% where it catches a terribly sneaky bug is a massive win. Currently, our ratio of useful comments is about 10:1. It has caught me doing stupid things or warned me about edge cases I didn’t know about, saving me hundreds of hours of future headaches.

If a comment is wrong, you clarify it to the bot and move on. It still challenges you to think, “Am I right, or is it?” which forces a manual review of that specific step. It also has a form of “long-term memory.” I discovered that telling it, “Put in your long-term memory that we use Pydantic v2,” actually works! It remembers preferences over time, reducing the “noise” as the bot learns our stack.

For most nitpicks, it provides an “AI edit” prompt to be copied in your AI IDE of choice, something I find myself using often when I’m feeling lazy. You can also run CodeRabbit as a VSCode extension or in CLI, but I prefer having it publicly available in GitLab. It helps the “human reviewer” see how I addressed the bot’s concerns, providing more context for the final human approval.

Configuration and “Long-term Memory”

CodeRabbit is highly configurable. You can tweak verbosity, connect it to Jira or Slack for extra context, and set specific rules for its tone of voice.

One feature I love is its ability to run small snippets of commands (like grep or sed) to poke holes in the codebase. I once mentioned in a comment that a function was used twice, and the bot actually ran a script to count the frequency, correcting me on the exact number. It also analyzes huge files that a human would usually skip, multiple times it found hidden bugs in large test-result files that I would never have caught manually.

It also has a form of “long-term memory.” While not explicitly configurable via a dashboard, I discovered that telling it, “Put in your long-term memory that we use Pydantic v2,” actually works! It remembers preferences and specific instructions over time, which reduced the number of wrong statements as the bot “learned” our specific stack.

Final Thoughts

Is it perfect? No. The hallucinations can be distracting, the tool is closed-source, and the internal memory is a bit of a black box. But it feels like having a new, incredibly patient developer on the team.

It’s especially helpful for juniors because it has infinite patience when showing them what to fix. And for seniors: when I’m using libraries I don’t know well, it helps me learn best practices on the fly. Before the human reviewer even steps in, I can address the minutiae with the AI, making the whole process leaner and minimizing wasted time.

If you’re curious, most of these tools offer free trials. Swap some licenses around and see if it adds value to your team. To me, it’s a net positive: I hope it will provide value in your workflow too!

all image credits: Tom Smykowski and CodeRabbit

Can a Rabbit help catching Bugs?

What is a code review?

The Lvl 0 Filter

The Experience: Signal vs. Noise

Configuration and “Long-term Memory”

Final Thoughts

Can a Rabbit help catching Bugs?

How Evals Help Build Reliable AI

On AI Slop