Contributing¶
Prerequisites¶
- VS Code with the Dev Containers extension
- Docker Desktop (Windows/macOS) or Docker Engine (Linux)
- SSH access to your test server
- uv — Python package manager
Dev container¶
The project ships a dev container that gives you a proper Linux environment regardless of your host OS. This is the recommended way to develop and run tests — it matches the target deployment platform and avoids Windows/macOS path and socket quirks.
First-time setup¶
- Open the project in VS Code
- When prompted "Reopen in Container", click it — or open the Command Palette and run Dev Containers: Reopen in Container
- VS Code rebuilds the container and installs all dependencies via
uv sync
That's it. The integrated terminal is now running inside a Linux container with Python, uv, ruff, and pyright all available.
What the container includes¶
| Tool | Purpose |
|---|---|
| Python 3.11 (Debian Bookworm) | Matches the minimum supported version |
| uv | Dependency management and running tools |
| Ruff | Linting and formatting |
| Pylance | Type checking in the editor |
| docker-outside-of-docker | Allows docker commands inside the container to reach the host Docker daemon |
Running tests¶
All test commands should be run from inside the dev container (VS Code integrated terminal).
Unit tests¶
Or via Make:
The suite currently has ~330 tests. Expected output: all pass, 16 skipped (E2E tests that require a live runtime, and one Linux-path test that is skipped on macOS).
Linting and formatting¶
Or individually:
Pre-PR checklist¶
Before opening a pull request, run all three in order:
make fmt # 1. Format and lint — fix all errors
make typecheck # 2. Type check — fix all errors
make test # 3. Full test suite — fix all failures
All three must pass clean. Then run /review in Claude Code to check the implementation against the spec.
Manual end-to-end testing¶
Unit tests cover logic in isolation. Manual E2E testing verifies that tools work correctly against a real container runtime with real services.
Never test against the production instance. Use the branch test deployment workflow below.
How it works¶
scripts/test-deploy.sh SSHes to your server, checks out your branch to a separate directory (~/arr-mcp-test/), starts a minimal throwaway stack, and runs arr-mcp on port 8082. Your production instance on port 8081 is never touched.
your machine server (192.168.2.15)
──────────────────── ──────────────────────────────────
make test-deploy ──SSH──▶ ~/arr-mcp-test/ (your branch)
BRANCH=feat/foo arr-mcp running on :8082
└── test-stack/ (throwaway containers)
.mcp.json.test ◀── swap in ── http://192.168.2.15:8082/mcp
Step-by-step¶
1. Deploy your branch to the test instance
The script will print the test instance URL and API key when it's ready:
arr-mcp test instance is running.
Dashboard : http://192.168.2.15:8082/
MCP URL : http://192.168.2.15:8082/mcp
API key : test-key-local
Logs : ssh 192.168.2.15 tail -f /tmp/arr-mcp-test.log
2. Point Claude at the test instance
Swap .mcp.json.test in as your active .mcp.json:
# Back up production config
cp .mcp.json .mcp.json.prod
# Switch to test instance
cp .mcp.json.test .mcp.json
Reload the MCP connection in your Claude client (FleetView: disconnect and reconnect the arr-mcp server).
3. Run your manual tests
The test instance has four throwaway containers running — test-sabnzbd, test-sonarr, test-radarr, and test-plex. Use these to exercise the MCP tools.
Container and filesystem tools
container_list/container_stats/container_logs— verify all four containers appear and report datacontainer_start/container_stop/container_restart— verify lifecycle actions against a throwaway container (never run these against production containers)service_scan/service_diagnose/service_health_report— verify diagnostic tools run against test service configsservice_api_reachability/inter_service_reachability— verify API and cross-service connectivity checks against the test stack's Sonarr/Radarr/Plex/SABnzbdfile_read/file_write/file_delete/directory_list/disk_usage— verify filesystem access is scoped to the test-stack data directorylog_read/log_search— verify log tools against test service log dirscompose_to_quadlets/quadlets_to_compose— verify conversion againsttest-stack/compose.yaml- Dashboard at
http://192.168.2.15:8082/— verify it loads and shows the test containers
Credential and service-client tools
credential_set/credential_list/credential_delete— store and retrieve API keys forsonarr,radarr, andplexagainst the test instances (never store production credentials here)- Verify
CredentialStore's three-tier resolution (env var → stored credential → XML config auto-discovery) by testing with and without an explicitly stored credential
Media intelligence tools (Sonarr/Radarr/Plex)
These need realistic data to exercise meaningfully — register a series in test-Sonarr and a movie in test-Radarr (via their /api/v3/series and /api/v3/movie endpoints), seed placeholder media files, add matching libraries in test-Plex, and mark items watched via Plex's /:/scrobble endpoint. Then:
watched_cleanup_preview— verify it identifies non-current-season episodes that have files on disk and that all household Plex users have watched (season 0 is always excluded)watched_cleanup_delete— verify it deletes only the previewed candidates and requiresconfirm=true
Seeding realistic data
The cross-reference logic needs Plex titles to match Sonarr/Radarr titles, files registered as hasFile in Sonarr/Radarr, and a season beyond the one under test to be monitored (so Sonarr doesn't treat the seeded season as "current"). Building this seed data is fiddly enough that it's worth scripting if you do it more than once.
4. Check test instance logs if anything looks wrong
5. Restore production config and tear down the test instance
Reload the MCP connection in your Claude client to reconnect to production.
Then choose how much to clean up:
# Stop arr-mcp and bring containers down — leaves ~/arr-mcp-test intact for next run
bash scripts/test-deploy.sh --stop
# Full teardown — removes containers, volumes, logs, and ~/arr-mcp-test entirely
bash scripts/test-deploy.sh --clean
Use --stop when you plan to test the same branch again shortly. Use --clean when you're done and want nothing left on the server.
Test environment reference¶
| Command | What it does |
|---|---|
bash scripts/test-deploy.sh BRANCH=foo |
Deploy branch, start test stack + arr-mcp on :8082 |
bash scripts/test-deploy.sh --stop |
Kill arr-mcp, bring containers down |
bash scripts/test-deploy.sh --clean |
Full teardown — removes everything from the server |
No make on Windows
If you're running from a Windows terminal, use bash scripts/test-deploy.sh directly. make targets work from inside the dev container or a Linux/macOS terminal.
Test stack¶
The test stack is defined in test-stack/compose.yaml. It runs four lightweight services:
| Container | Image | Port |
|---|---|---|
test-sabnzbd |
linuxserver/sabnzbd |
18080 |
test-sonarr |
linuxserver/sonarr |
18989 |
test-radarr |
linuxserver/radarr |
17878 |
test-plex |
linuxserver/plex |
33400 |
These use non-standard ports so they never conflict with production. Config and data land in test-stack/data/ on the server, which is excluded from git.
Environment variables (test instance)¶
The deploy script generates .env.test on the server automatically. The values it uses:
| Variable | Value | Notes |
|---|---|---|
ARR_MCP_PORT |
8082 |
Separate from production on 8081 |
ARR_MCP_API_KEY |
test-key-local |
Matches .mcp.json.test |
ARR_MCP_SERVICES_DIR |
~/arr-mcp-test/test-stack/data |
Points at test service configs |
ARR_MCP_COMPOSE_DIR |
~/arr-mcp-test/test-stack |
Points at test stack |
ARR_MCP_DASHBOARD_PUBLIC |
true |
No key required for test dashboard |
To override any of these, set the corresponding env var before running the deploy script:
Commit guidelines¶
- Use Conventional Commits:
feat:,fix:,chore:,docs:,test: - Reference the GitHub issue:
git commit --trailer "Github-Issue:#<number>" - One branch per issue — name it
feat/issue-N-short-descriptionorfix/issue-N-short-description - Never commit directly to
main— all changes go through a PR - Run the pre-PR checklist before opening a PR