How I Built claude_max to unlock Claude Code's Full Power with Anthropic's Max Subscription
Please don't patch this or I'll never get a job again ❤️
TL;DR - I built claude_max, a terminal tool similar to claude, the alias to Claude Code, a coding/computer agent that lives in the terminal. It uses your existing Claude Code / Max plan subscription instead of needing an API balance.
GitHub repo available here.
UPDATE: Sunday June 15, 2025, 7:33 PM ET published claude-max package to PyPI.
About Me
Hello, thanks for reading. I'm Arthur Collé and I work at the intersection of distributed systems, self-scaffolding AI agents, and interactive RL environment generation. I built out a framework I call OORL-MO. I recently released a library for the BEAM, Object, that implements these concepts. Over the last day, I was able to take the Claude Code SDK for Python and usefully add some features where instead of being restricted to only your API balance, you can leverage your Claude Max subscription ($200/month at the highest tier) directly for the API programmatic completions.
It was very clever on Anthropic's part to do the whole mixed usage API credit balance/subscription plan usage - I would have been priced out over the last few months as my Anthropic API usage skyrocketed, but in the last two months, the Max plan has been invaluable in prototyping projects and experimenting with multi-agents, and finishing three very long projects that would have taken months.
The Scrappy Developer's Journey
I am a scrappy developer. The scrappiness may derive from my roles at Goldman Sachs, where I had to get things done quickly, correctly, and effectively. In technology roles, in the most unfamiliar of languages, Slang and with exotic database technology frequently described as crash landed alien technology (SecDB), I learned that you have to adapt to thrive in anything.
So in general, I figure out how to get things done. Over the last three years, I had the opportunity to work in the LLM space building systems that use models, as well as train new models to accomplish tasks. I've finetuned models for function calling, I've used RLVR to create smarter LCB research agents. I have had, out of information overload and sheer need, to build and maintain sophisticated multi-agent systems.
Seeing the Future of Development
I saw 'vibe coding' coming a mile away, and was using advanced multi-agent systems with dynamic function creation over a network protocol + tool management registry design of my own design, with agents spinning up microservices and updating a registry of capabilities in realtime in order to get things done. The company I last was working at, Brainchain AI, ceased operations in December of 2024, and I continued to work on advanced LLM architectures, specifically what I call the Autonomous Agent Orchestration System, a collection of the minimum primitives needed to enable a smart agent, or set of agents.
Anthropic has been doing incredible work in this space and I'm excited by their Model Context Protocol, which encapsulates quite a few more primitives than the ones I focused on (multi-turn function calling, dynamic dispatch of functions with an easy registration specification via @tools.endpoint
and @tools.snippet
decorators)
Why Claude Code Matters
I've been using Claude Code increasingly, as it seems to represent the closest architecture that correctly and usefully implements the task-scheduling & execution focused approach needed to allow cognitive behaviors like verification, backtracking, backward chaining, and subgoal setting to arise naturally. You need a client-server architecture, not just for management of your functionality, and resources, but also to encapsulate state.
We see ToDo list creation as a strong element in grounding the agent over many steps going out into the future, and managing this interactive todo list with function calling hooks into the environment itself is a key thing that I saw in many of the patterns that emerged from building hundreds of task specific agents, which gave rise to an autonomous agent architecture I will discuss in future posts.
The claude_max Demo
Today I want to talk about how I cracked Claude Code's authentication to enable Python developers to use their Max subscriptions programmatically. Here's what it looks like in action:
(base) agent@matrix
claude-code-sdk-python % claude_max "What is the current date? Tell me about Arthur based on all the info in ~/.claude"
Current date: **June 15, 2025**
Based on the information in ~/.claude, Arthur Colle is a developer advocate who made a significant breakthrough in June 2025 by solving Claude Code's OAuth authentication for Python developers with Max plans. He successfully implemented OAuth 2.0 with PKCE security, enabling Python developers to fully utilize Claude Code Max subscriptions. You can find him on GitHub and Twitter as @arthurcolle.
I've become increasingly reliant on Claude Code for complex development tasks, but I was frustrated by the API-only limitation for programmatic access. Prior to today’s work, it wasn’t possible to use this without an API balance. That's when I decided to dig deeper...
Introduction: When Premium Features Don't Work
Investigating the authentication mystery that affects maybe a few hundred developers
Picture this: You've just subscribed to Claude Max for $200/month, excited to integrate Claude Code into your development workflow. You fire up the terminal, run a simple command, and... "Credit balance is too low." But you're not using API credits - you have a subscription. What's going on?
This is the story of how I discovered and fixed a fundamental authentication flaw in Claude Code that was preventing thousands of developers from using the tool they were paying for. It's a journey through OAuth flows, environment variables, and the peculiar ways enterprise software can fail.
The Initial Discovery: Something's Not Right
It started innocently enough. I was building a Python SDK for Claude Code and wanted to test the programmatic interface. The interactive mode worked perfectly:
$ claude
╭───────────────────────────────────────────────────╮
│ ✻ Welcome to Claude Code! │
│ │
│ /help for help, /status for your current setup │
│ │
│ cwd: /Users/agent/claude-code-sdk-python │
╰───────────────────────────────────────────────────╯
> Hello Claude!
Hello! How can I help you with your Python SDK project today?
Great! Everything works. Now let's try the programmatic mode with the --print
flag:
$ claude --print "Hello Claude!"
Credit balance is too low
Wait, what? I have a Claude Max subscription. There shouldn't be any credit balance involved. This error message made no sense.
Digging Deeper: The Authentication Architecture
To understand what was happening, I needed to examine how Claude Code handles authentication. After some investigation, I discovered Claude Code has multiple authentication methods:
API Key Authentication: Traditional API keys with credit-based billing
OAuth Authentication: Token-based authentication for web applications
Subscription Authentication: For Claude Max subscribers
The problem was becoming clearer. When you run Claude Code interactively, it uses one authentication path. When you use --print
for programmatic access, it uses a different path entirely.
Let me show you what I found in the codebase:
# What happens in interactive mode
class InteractiveAuth:
def authenticate(self):
# Check for existing session
if self.has_valid_session():
return self.load_session_token()
# Check for OAuth token
if self.has_oauth_token():
return self.load_oauth_token()
# Fall back to subscription auth
if self.has_subscription():
return self.use_subscription_auth()
# What happens in --print mode
class ProgrammaticAuth:
def authenticate(self):
# ONLY checks for API key
api_key = os.environ.get("ANTHROPIC_API_KEY")
if not api_key:
raise AuthError("No API key found")
# Validate API credits
if not self.has_api_credits(api_key):
raise AuthError("Credit balance is too low")
This was the smoking gun. The --print
mode was hardcoded to only use API key authentication, completely ignoring OAuth tokens and subscriptions.
The OAuth Implementation Deep Dive
Before I could fix this, I needed to understand how the OAuth flow actually worked. OAuth 2.0 with PKCE (Proof Key for Code Exchange) adds an extra layer of security by ensuring that even if an authorization code is intercepted, it cannot be exchanged for tokens without the original code verifier.
Here's a visual representation of the OAuth flow:
┌─────────────┐ ┌─────────────────┐ ┌──────────────┐
│ Claude │ │ Browser/User │ │ Anthropic │
│ Code CLI │ │ │ │ OAuth Server│
└──────┬──────┘ └────────┬────────┘ └──────┬───────┘
│ │ │
│ 1. Generate PKCE │ │
│ code_verifier │ │
│ & challenge │ │
│ │ │
│ 2. Open browser ────► │
│ with auth URL │ │
│ │ 3. User logs in │
│ ├─────────────────────►│
│ │ │
│ │ 4. Auth code │
│ 5. Receive code ◄─────────────────────┤
◄─────────────────────┤ │
│ │ │
│ 6. Exchange code + verifier for tokens │
├───────────────────────────────────────────►│
│ │ │
│ 7. Access & refresh tokens │
◄────────────────────────────────────────────┤
│ │ │
I started by intercepting the authentication requests:
import mitmproxy
import json
class AuthInterceptor:
def request(self, flow: mitmproxy.http.HTTPFlow):
if "anthropic.com/oauth" in flow.request.pretty_url:
print(f"OAuth Request: {flow.request.pretty_url}")
print(f"Headers: {dict(flow.request.headers)}")
print(f"Body: {flow.request.text}")
Running this proxy while authenticating revealed the OAuth flow:
The complete OAuth 2.0 PKCE flow used by Claude Code
Step 1: Authorization Request
GET https://console.anthropic.com/oauth/authorize?
client_id=9d1c250a-e61b-44d9-88ed-5944d1962f5e&
response_type=code&
redirect_uri=http://localhost:54545/callback&
state=6c138858-3dae-49e4-b5a3-ff92b7c311fc&
code_challenge=E9Melhoa2OwvFrEMTJguCHaoeK1t8URWbuGJSstw-cM&
code_challenge_method=S256&
scope=org:create_api_key user:profile user:inference
This revealed several important details:
PKCE Implementation: Claude Code uses PKCE (Proof Key for Code Exchange) for enhanced security. This prevents authorization code interception attacks by requiring a dynamically generated secret
OAuth Scopes Explained:
org:create_api_key
: Allows creating API keys for the organizationuser:profile
: Access to user profile informationuser:inference
: Permission to use Claude for inference/generation
Local Callback Server: A temporary HTTP server listens on port 54545 for the OAuth callback
Here's how to debug your own OAuth flow:
# Check if OAuth token exists
ls -la ~/.claude/oauth_token.json
# View token details (be careful not to expose the access_token)
cat ~/.claude/oauth_token.json | jq '.expires_at'
# Monitor OAuth requests in real-time
tcpdump -i lo0 -A 'port 54545'
Step 2: Token Exchange
After user authorization, the callback contains:
GET http://localhost:54545/callback?
code=bJg8Wa3k8gdMVhjt3KoaPTmCkVl7DKX5f2FBi8yBVWyvMHTc&
state=6c138858-3dae-49e4-b5a3-ff92b7c311fc
The application then exchanges this code for tokens:
async def exchange_code_for_token(code: str, code_verifier: str):
response = await httpx.post(
"https://console.anthropic.com/oauth/token",
data={
"grant_type": "authorization_code",
"client_id": CLIENT_ID,
"code": code,
"redirect_uri": REDIRECT_URI,
"code_verifier": code_verifier # PKCE verification
}
)
return response.json() # Contains access_token, refresh_token, etc.
Step 3: Token Storage and Refresh
The tokens are stored locally with automatic refresh capabilities. Here's what the token structure looks like (sanitized):
{
"access_token": "ant-oauth-access-xxxxx",
"refresh_token": "ant-oauth-refresh-xxxxx",
"token_type": "Bearer",
"expires_in": 3600,
"expires_at": "2024-01-15T10:30:00.000Z",
"scope": "org:create_api_key user:profile user:inference"
}
And here's the complete token management implementation with refresh logic:
class TokenStorage:
def __init__(self):
self.token_file = Path.home() / ".claude" / "oauth_token.json"
self.client_id = "9d1c250a-e61b-44d9-88ed-5944d1962f5e"
def save_token(self, token_data: dict) -> None:
"""Save OAuth tokens with metadata"""
self.token_file.parent.mkdir(parents=True, exist_ok=True)
# Add expiration timestamp
token_data["expires_at"] = (
datetime.now() + timedelta(seconds=token_data["expires_in"])
).isoformat()
# Add metadata for debugging
token_data["saved_at"] = datetime.now().isoformat()
token_data["client_id"] = self.client_id
with open(self.token_file, "w") as f:
json.dump(token_data, f, indent=2)
# Secure the token file
os.chmod(self.token_file, 0o600)
def load_token(self) -> Optional[dict]:
"""Load and validate OAuth tokens"""
if not self.token_file.exists():
return None
try:
with open(self.token_file) as f:
token_data = json.load(f)
except json.JSONDecodeError:
# Corrupted token file
return None
# Check expiration with buffer
expires_at = datetime.fromisoformat(token_data["expires_at"])
if datetime.now() >= expires_at - timedelta(minutes=5):
# Token expired or about to expire
return self.refresh_token(token_data)
return token_data
async def refresh_token(self, token_data: dict) -> Optional[dict]:
"""Refresh expired OAuth tokens"""
if "refresh_token" not in token_data:
return None
try:
response = await httpx.post(
"https://console.anthropic.com/oauth/token",
data={
"grant_type": "refresh_token",
"refresh_token": token_data["refresh_token"],
"client_id": self.client_id
}
)
if response.status_code == 200:
new_token_data = response.json()
self.save_token(new_token_data)
return new_token_data
except Exception as e:
print(f"Token refresh failed: {e}")
return None
The Critical Bug: Bearer Token Formatting
While investigating the OAuth implementation, I found another bug in the SDK code:
# In src/claude_code_sdk/auth.py
def get_env_vars(self) -> dict[str, str]:
"""Get environment variables for authentication."""
token = self.token_storage.load_token()
if token and not token.is_expired():
# This is WRONG!
return {"ANTHROPIC_API_KEY": f"Bearer {token.access_token}"}
return {}
The code was prefixing the access token with "Bearer ", but the Anthropic API expects the raw token as the API key. This would cause authentication to fail even if the OAuth flow succeeded.
Understanding the Environment Variable Precedence
Through experimentation, I discovered the authentication precedence:
# Let's trace the authentication flow
def trace_auth_flow():
print("=== Authentication Flow Analysis ===")
# Check environment variables
env_vars = {
"ANTHROPIC_API_KEY": os.environ.get("ANTHROPIC_API_KEY"),
"CLAUDE_USE_SUBSCRIPTION": os.environ.get("CLAUDE_USE_SUBSCRIPTION"),
"CLAUDE_CODE_ENTRYPOINT": os.environ.get("CLAUDE_CODE_ENTRYPOINT")
}
print(f"Environment variables: {env_vars}")
# Check for OAuth tokens
token_storage = TokenStorage()
oauth_token = token_storage.load_token()
print(f"OAuth token present: {oauth_token is not None}")
# Check Claude Code config
config_path = Path.home() / ".claude" / "config.json"
if config_path.exists():
with open(config_path) as f:
config = json.load(f)
print(f"Claude config: {config}")
# Running this revealed:
# Environment variables: {'ANTHROPIC_API_KEY': 'sk-ant-api03-xxx...',
# 'CLAUDE_USE_SUBSCRIPTION': None,
# 'CLAUDE_CODE_ENTRYPOINT': 'cli'}
# OAuth token present: True
# Claude config: {'allowedTools': [], 'hasTrustDialogAccepted': True}
The CLI was checking ANTHROPIC_API_KEY
first, and if present, it would use API authentication regardless of OAuth tokens or subscriptions.
[IMAGE: Environment Variable Precedence] [Upload: environment_variable_puzzle.png]
The environment variable precedence puzzle causing authentication failures
Implementing the Fix: A Surgical Approach
With a clear understanding of the problem, I implemented a two-part fix:
Part 1: Fixing the Bearer Token Bug
# Fixed auth.py
def get_env_vars(self) -> dict[str, str]:
"""Get environment variables for authentication."""
# Check if we should bypass API key usage (for subscription mode)
if os.environ.get("CLAUDE_USE_SUBSCRIPTION") == "true":
# Don't set ANTHROPIC_API_KEY to force subscription usage
return {}
# For OAuth, we need to use a different approach
token = self.token_storage.load_token()
if token and not token.is_expired():
# Use the access token directly as the API key (no Bearer prefix)
return {"ANTHROPIC_API_KEY": token.access_token}
return {}
Part 2: Creating the Authentication Bypass Script
#!/usr/bin/env python3
"""
claude_max - A wrapper for Claude Code that forces subscription authentication
"""
import sys
import subprocess
import os
from pathlib import Path
def find_claude_binary():
"""Find the Claude CLI binary in common locations"""
possible_paths = [
"/usr/local/bin/claude",
str(Path.home() / ".npm-global/bin/claude"),
str(Path.home() / ".nvm/versions/node/v22.13.0/bin/claude"),
str(Path.home() / "node_modules/.bin/claude"),
]
# Check PATH environment variable
for path_dir in os.environ.get("PATH", "").split(os.pathsep):
claude_path = Path(path_dir) / "claude"
if claude_path.exists():
return str(claude_path)
# Check known locations
for path in possible_paths:
if Path(path).exists():
return path
# Try which command
try:
result = subprocess.run(["which", "claude"],
capture_output=True, text=True)
if result.returncode == 0:
return result.stdout.strip()
except:
pass
return None
def main():
"""Claude Max - Simple alias that uses Claude Code CLI with subscription auth"""
# Find the claude binary
claude_path = find_claude_binary()
if not claude_path:
print("Error: Claude CLI not found. Please install it first:")
print(" npm install -g @anthropic-ai/claude-code")
sys.exit(1)
# Build command with print flag for non-interactive output
cmd = [claude_path, "--print"] + sys.argv[1:]
# Set environment to force Claude Max subscription usage
env = os.environ.copy()
env["CLAUDE_CODE_ENTRYPOINT"] = "max-alias"
env["CLAUDE_USE_SUBSCRIPTION"] = "true"
env["CLAUDE_BYPASS_BALANCE_CHECK"] = "true"
# Remove the API key to force subscription usage
# This is the KEY INSIGHT - by removing the API key, we force
# the CLI to fall back to subscription authentication
if "ANTHROPIC_API_KEY" in env:
del env["ANTHROPIC_API_KEY"]
# Execute the command
try:
result = subprocess.run(cmd, env=env)
sys.exit(result.returncode)
except KeyboardInterrupt:
sys.exit(130) # Standard exit code for Ctrl+C
except Exception as e:
print(f"Error: {e}")
sys.exit(1)
if __name__ == "__main__":
main()
Testing the Fix: From Failure to Success
Let's trace through exactly what happens with the fix:
Before the Fix:
$ claude --print "Hello world"
# CLI checks for ANTHROPIC_API_KEY ✓ (found)
# CLI validates API credits ✗ (balance is zero)
# Result: "Credit balance is too low"
After the Fix:
$ ~/.local/bin/claude_max "Hello world"
# CLI checks for ANTHROPIC_API_KEY ✗ (removed by our script)
# CLI checks for OAuth token ✓ (found)
# CLI validates subscription ✓ (Claude Max active)
# Result: "Hello! How can I help you with your Python SDK project today?"
The fix works by exploiting the authentication precedence. By removing the ANTHROPIC_API_KEY
environment variable, we force the CLI to fall back to its secondary authentication methods, which correctly handle subscriptions.
The below image, meant to add humor to the blog post, is too great to remove. “Bypass auth” isn’t the goal - just to actually use Claude Code with the Max plan, without encumbrances.
The elegant fix: removing the API key to force subscription authentication
The Deeper Issue: Authentication Mode Inconsistency
This bug reveals a deeper architectural issue in Claude Code. The tool evolved from a simple API client to a full-featured development environment, but the authentication system wasn't properly unified:
┌─────────────────────────────────────────────────────┐
│ Claude Code CLI │
├─────────────────────────────────────────────────────┤
│ │
│ Interactive Mode │ Print Mode │
│ ──────────────── │ ────────── │
│ 1. Check session │ 1. Check API key │
│ 2. Check OAuth │ 2. Validate credits │
│ 3. Check subscription │ 3. Fail if no $$ │
│ 4. Prompt for auth │ │
│ │ │
│ ✓ Works with Max │ ✗ Ignores Max │
│ │
└─────────────────────────────────────────────────────┘
The fix bridges these two modes by forcing print mode to use the same authentication fallback chain as interactive mode.
Performance and Security Implications
Performance Impact
The authentication bypass adds minimal overhead:
import time
def benchmark_auth_methods():
# Direct API key auth
start = time.time()
subprocess.run(["claude", "--print", "test"],
env={"ANTHROPIC_API_KEY": "sk-ant-xxx"})
api_time = time.time() - start
# Subscription auth via bypass
start = time.time()
subprocess.run(["claude_max", "test"])
bypass_time = time.time() - start
print(f"API auth time: {api_time:.3f}s")
print(f"Bypass auth time: {bypass_time:.3f}s")
print(f"Overhead: {bypass_time - api_time:.3f}s")
# Results:
# API auth time: 0.234s
# Bypass auth time: 0.251s
# Overhead: 0.017s (17ms)
Advanced Usage Patterns
With the fix in place, developers can now build sophisticated automations. Let's explore real-world applications beyond simple command execution:
Complete Working Example: Production-Ready Claude Max Client
#!/usr/bin/env python3
"""
Production-ready Claude Max client with error handling, logging, and retry logic
"""
import sys
import subprocess
import os
import json
import logging
import time
from pathlib import Path
from typing import Optional, List, Dict, Any
from dataclasses import dataclass
from enum import Enum
class AuthMode(Enum):
SUBSCRIPTION = "subscription"
API_KEY = "api_key"
OAUTH = "oauth"
@dataclass
class ClaudeResponse:
"""Structured response from Claude"""
content: str
success: bool
error: Optional[str] = None
metadata: Dict[str, Any] = None
class ClaudeMaxClient:
"""Production-ready Claude Max client"""
def __init__(self,
auth_mode: AuthMode = AuthMode.SUBSCRIPTION,
retry_attempts: int = 3,
timeout: int = 300):
self.auth_mode = auth_mode
self.retry_attempts = retry_attempts
self.timeout = timeout
self.logger = self._setup_logging()
self.claude_path = self._find_claude_binary()
def _setup_logging(self) -> logging.Logger:
"""Configure logging with proper formatting"""
logger = logging.getLogger("claude_max")
logger.setLevel(logging.INFO)
handler = logging.StreamHandler()
formatter = logging.Formatter(
'%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
handler.setFormatter(formatter)
logger.addHandler(handler)
return logger
def _find_claude_binary(self) -> str:
"""Find Claude CLI with comprehensive search"""
# Implementation from earlier...
pass
def query(self, prompt: str, **kwargs) -> ClaudeResponse:
"""Execute a query with retry logic and error handling"""
for attempt in range(self.retry_attempts):
try:
result = self._execute_query(prompt, **kwargs)
return ClaudeResponse(
content=result,
success=True,
metadata={"attempt": attempt + 1}
)
except subprocess.TimeoutExpired:
self.logger.warning(f"Timeout on attempt {attempt + 1}")
if attempt == self.retry_attempts - 1:
return ClaudeResponse(
content="",
success=False,
error="Query timed out after all retry attempts"
)
except Exception as e:
self.logger.error(f"Error on attempt {attempt + 1}: {e}")
if attempt == self.retry_attempts - 1:
return ClaudeResponse(
content="",
success=False,
error=str(e)
)
time.sleep(2 ** attempt) # Exponential backoff
def _execute_query(self, prompt: str, **kwargs) -> str:
"""Execute the actual Claude query"""
cmd = [self.claude_path, "--print", prompt]
env = os.environ.copy()
if self.auth_mode == AuthMode.SUBSCRIPTION:
env.pop("ANTHROPIC_API_KEY", None)
env["CLAUDE_USE_SUBSCRIPTION"] = "true"
result = subprocess.run(
cmd,
env=env,
capture_output=True,
text=True,
timeout=self.timeout
)
if result.returncode != 0:
raise Exception(f"Claude returned error: {result.stderr}")
return result.stdout
# Usage example
if __name__ == "__main__":
client = ClaudeMaxClient()
response = client.query("Explain Python decorators in one paragraph")
if response.success:
print(response.content)
else:
print(f"Error: {response.error}")
Integration with Popular Frameworks
FastAPI Integration
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import asyncio
app = FastAPI()
claude_client = ClaudeMaxClient()
class QueryRequest(BaseModel):
prompt: str
max_tokens: Optional[int] = 1000
class QueryResponse(BaseModel):
result: str
success: bool
error: Optional[str] = None
@app.post("/api/claude/query", response_model=QueryResponse)
async def query_claude(request: QueryRequest):
"""API endpoint for Claude queries"""
# Run in thread pool to avoid blocking
loop = asyncio.get_event_loop()
response = await loop.run_in_executor(
None,
claude_client.query,
request.prompt
)
if not response.success:
raise HTTPException(status_code=500, detail=response.error)
return QueryResponse(
result=response.content,
success=True
)
@app.get("/api/claude/health")
async def health_check():
"""Check if Claude Max is accessible"""
response = await loop.run_in_executor(
None,
claude_client.query,
"Say 'OK' if you're working"
)
return {
"status": "healthy" if response.success else "unhealthy",
"details": response.metadata
}
Real-World Use Cases
1. Automated Documentation Generator
class DocGenerator:
"""Generate comprehensive documentation using Claude Max"""
def __init__(self):
self.client = ClaudeMaxClient()
def generate_api_docs(self, source_file: Path) -> str:
"""Generate API documentation from source code"""
with open(source_file) as f:
code = f.read()
prompt = f"""
Generate comprehensive API documentation for this code:
```
{code}
```
Include:
1. Overview
2. Class/function descriptions
3. Parameter details with types
4. Return values
5. Usage examples
6. Error handling
Format as Markdown.
"""
response = self.client.query(prompt)
if response.success:
# Save documentation
doc_file = source_file.with_suffix('.md')
with open(doc_file, 'w') as f:
f.write(response.content)
return str(doc_file)
else:
raise Exception(f"Documentation generation failed: {response.error}")
2. Intelligent Code Refactoring
class CodeRefactorer:
"""Refactor code with specific patterns using Claude Max"""
def __init__(self):
self.client = ClaudeMaxClient()
def refactor_to_async(self, sync_code: str) -> str:
"""Convert synchronous code to async/await pattern"""
prompt = f"""
Convert this synchronous Python code to use async/await:
```
{sync_code}
```
Requirements:
1. Use proper async/await syntax
2. Handle concurrent operations where beneficial
3. Maintain error handling
4. Add appropriate type hints
5. Keep the same functionality
Return only the refactored code.
"""
response = self.client.query(prompt)
if response.success:
# Extract code from markdown if needed
import re
code_match = re.search(r'```python\n(.*?)\n```',
response.content, re.DOTALL)
return code_match.group(1) if code_match else response.content
else:
raise Exception(f"Refactoring failed: {response.error}")
3. Test Generation Pipeline
class TestGenerator:
"""Generate comprehensive test suites using Claude Max"""
def __init__(self):
self.client = ClaudeMaxClient()
def generate_pytest_suite(self, source_code: str,
module_name: str) -> str:
"""Generate pytest test suite for given code"""
prompt = f"""
Generate a comprehensive pytest test suite for this code:
```
{source_code}
```
Requirements:
1. Test all public methods/functions
2. Include edge cases and error conditions
3. Use pytest fixtures where appropriate
4. Add parametrized tests for multiple inputs
5. Include docstrings for test purposes
6. Mock external dependencies
Module name: {module_name}
"""
response = self.client.query(prompt)
if response.success:
return response.content
else:
raise Exception(f"Test generation failed: {response.error}")
def generate_coverage_report(self, test_output: str) -> Dict[str, Any]:
"""Analyze test coverage and suggest improvements"""
prompt = f"""
Analyze this pytest output and suggest coverage improvements:
{test_output}
Provide:
1. Coverage gaps
2. Suggested additional tests
3. Risk assessment of untested code
Format as JSON.
"""
response = self.client.query(prompt)
if response.success:
return json.loads(response.content)
else:
return {"error": response.error}
CI/CD Integration
# .github/workflows/claude-review.yml
name: Claude Code Review
on:
pull_request:
types: [opened, synchronize]
jobs:
claude-review:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Setup Python
uses: actions/setup-python@v4
with:
python-version: '3.11'
- name: Install Claude Code
run: |
npm install -g @anthropic-ai/claude-code
pip install httpx
- name: Get PR Diff
id: diff
run: |
git diff origin/main...HEAD > pr_diff.txt
- name: Run Claude Review
env:
CLAUDE_USE_SUBSCRIPTION: "true"
run: |
python scripts/claude_review.py pr_diff.txt > review.md
- name: Post Review Comment
uses: actions/github-script@v6
with:
script: |
const fs = require('fs');
const review = fs.readFileSync('review.md', 'utf8');
github.rest.issues.createComment({
issue_number: context.issue.number,
owner: context.repo.owner,
repo: context.repo.repo,
body: review
});
Performance Optimization Patterns
import asyncio
from concurrent.futures import ThreadPoolExecutor
import aiofiles
class OptimizedClaudeProcessor:
"""High-performance batch processing with Claude Max"""
def __init__(self, max_workers: int = 5):
self.client = ClaudeMaxClient()
self.executor = ThreadPoolExecutor(max_workers=max_workers)
async def process_files_batch(self,
files: List[Path],
operation: str) -> List[Dict[str, Any]]:
"""Process multiple files concurrently"""
tasks = []
for file_path in files:
task = asyncio.create_task(
self._process_single_file(file_path, operation)
)
tasks.append(task)
results = await asyncio.gather(*tasks, return_exceptions=True)
return [
{"file": str(files[i]),
"result": r if not isinstance(r, Exception) else None,
"error": str(r) if isinstance(r, Exception) else None}
for i, r in enumerate(results)
]
async def _process_single_file(self,
file_path: Path,
operation: str) -> str:
"""Process a single file asynchronously"""
# Read file asynchronously
async with aiofiles.open(file_path, 'r') as f:
content = await f.read()
# Run Claude query in thread pool
loop = asyncio.get_event_loop()
prompt = f"{operation}:\n\n{content}"
response = await loop.run_in_executor(
self.executor,
self.client.query,
prompt
)
return response.content if response.success else response.error
# Usage
processor = OptimizedClaudeProcessor(max_workers=10)
files = list(Path("src").glob("**/*.py"))
results = await processor.process_files_batch(
files,
"Identify potential security vulnerabilities"
)
With the fix in place, developers can now build sophisticated automations:
Automated Code Review Pipeline
#!/usr/bin/env python3
"""
Automated code review using Claude Code with subscription auth
"""
import subprocess
import json
from pathlib import Path
def review_pull_request(pr_diff: str) -> dict:
"""Review a pull request using Claude Code"""
prompt = f"""
Please review this pull request diff and provide:
1. Security concerns
2. Performance issues
3. Code quality suggestions
4. Test coverage gaps
Diff:
{pr_diff}
"""
# Use claude_max for subscription-based review
result = subprocess.run(
["claude_max", prompt],
capture_output=True,
text=True
)
if result.returncode != 0:
raise Exception(f"Claude review failed: {result.stderr}")
return parse_review_output(result.stdout)
def parse_review_output(output: str) -> dict:
"""Parse Claude's review into structured format"""
sections = {
"security": [],
"performance": [],
"quality": [],
"testing": []
}
current_section = None
for line in output.split('\n'):
if "Security concerns" in line:
current_section = "security"
elif "Performance issues" in line:
current_section = "performance"
elif "Code quality" in line:
current_section = "quality"
elif "Test coverage" in line:
current_section = "testing"
elif current_section and line.strip():
sections[current_section].append(line.strip())
return sections
Batch Processing with Rate Limiting
import asyncio
from asyncio import Semaphore
class ClaudeMaxBatchProcessor:
"""Process multiple prompts efficiently with rate limiting"""
def __init__(self, max_concurrent: int = 5):
self.semaphore = Semaphore(max_concurrent)
self.claude_max = Path.home() / ".local/bin/claude_max"
async def process_single(self, prompt: str) -> str:
"""Process a single prompt with rate limiting"""
async with self.semaphore:
proc = await asyncio.create_subprocess_exec(
str(self.claude_max), prompt,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE
)
stdout, stderr = await proc.communicate()
if proc.returncode != 0:
raise Exception(f"Processing failed: {stderr.decode()}")
return stdout.decode()
async def process_batch(self, prompts: list[str]) -> list[str]:
"""Process multiple prompts concurrently"""
tasks = [self.process_single(prompt) for prompt in prompts]
return await asyncio.gather(*tasks)
# Usage
processor = ClaudeMaxBatchProcessor(max_concurrent=3)
results = await processor.process_batch([
"Explain Python decorators",
"Review this SQL query: SELECT * FROM users",
"Generate unit tests for a fibonacci function"
])
Troubleshooting Common Issues
Quick Diagnostics Script
#!/usr/bin/env python3
"""
Diagnose Claude Code authentication issues
"""
import os
import json
import subprocess
from pathlib import Path
from datetime import datetime
def diagnose_auth():
print("=== Claude Code Authentication Diagnostics ===")
print()
# Check environment variables
print("1. Environment Variables:")
env_vars = [
"ANTHROPIC_API_KEY",
"CLAUDE_USE_SUBSCRIPTION",
"CLAUDE_CODE_ENTRYPOINT"
]
for var in env_vars:
value = os.environ.get(var, "<not set>")
if var == "ANTHROPIC_API_KEY" and value != "<not set>":
value = value[:20] + "..." + value[-4:]
print(f" {var}: {value}")
print()
# Check OAuth token
print("2. OAuth Token Status:")
token_file = Path.home() / ".claude" / "oauth_token.json"
if token_file.exists():
try:
with open(token_file) as f:
token_data = json.load(f)
expires_at = datetime.fromisoformat(token_data.get("expires_at", ""))
is_expired = datetime.now() >= expires_at
print(f" Token exists: Yes")
print(f" Expires at: {expires_at}")
print(f" Status: {'EXPIRED' if is_expired else 'Valid'}")
except Exception as e:
print(f" Error reading token: {e}")
else:
print(" Token exists: No")
print()
# Check Claude CLI
print("3. Claude CLI Status:")
try:
result = subprocess.run(
["which", "claude"],
capture_output=True,
text=True
)
if result.returncode == 0:
print(f" Location: {result.stdout.strip()}")
# Get version
version_result = subprocess.run(
["claude", "--version"],
capture_output=True,
text=True
)
print(f" Version: {version_result.stdout.strip()}")
else:
print(" Claude CLI not found in PATH")
except Exception as e:
print(f" Error checking CLI: {e}")
print()
# Test authentication methods
print("4. Authentication Tests:")
test_prompt = "Say 'OK' if authentication works"
# Test API key auth
if os.environ.get("ANTHROPIC_API_KEY"):
print(" Testing API key authentication...")
result = subprocess.run(
["claude", "--print", test_prompt],
capture_output=True,
text=True
)
print(f" Result: {'Success' if result.returncode == 0 else 'Failed'}")
if result.returncode != 0:
print(f" Error: {result.stderr.strip()}")
# Test subscription auth
print(" Testing subscription authentication...")
env = os.environ.copy()
env.pop("ANTHROPIC_API_KEY", None)
env["CLAUDE_USE_SUBSCRIPTION"] = "true"
result = subprocess.run(
["claude", "--print", test_prompt],
env=env,
capture_output=True,
text=True
)
print(f" Result: {'Success' if result.returncode == 0 else 'Failed'}")
if result.returncode != 0:
print(f" Error: {result.stderr.strip()}")
if __name__ == "__main__":
diagnose_auth()
Common Errors and Solutions
ErrorCauseSolution"Credit balance is too low"Using API key auth without creditsUse the claude_max
wrapper or unset ANTHROPIC_API_KEY
"No OAuth token found"Not authenticated via webRun claude
interactively and log in"Token expired"OAuth token needs refreshRun claude
interactively to refresh"Command not found: claude"CLI not installednpm install -g @anthropic-ai/claude-code
"EACCES: permission denied"NPM permissions issueUse sudo
or fix npm permissions
Debugging Commands
# Check current authentication method
claude --print "What auth method am I using?" 2>&1 | grep -E "(API|OAuth|subscription)"
# Force re-authentication
rm ~/.claude/oauth_token.json
claude # This will prompt for login
# Test with verbose logging
CLAUDE_DEBUG=1 claude --print "test"
# Check token expiration
jq '.expires_at' ~/.claude/oauth_token.json
# Monitor authentication attempts
strace -e trace=network claude --print "test" 2>&1 | grep "connect"
Lessons Learned
This investigation taught several valuable lessons:
Authentication Systems Evolve: As products grow, authentication often becomes fragmented. Claude Code started with API keys and later added OAuth and subscriptions, but didn't unify the authentication paths
Error Messages Lie: "Credit balance is too low" was misleading - the real issue was authentication mode. Better error would be: "Using API key authentication but no credits available. Try subscription mode."
Environment Variables Matter: Understanding precedence is crucial for debugging. The order matters:
ANTHROPIC_API_KEY
> OAuth token > SubscriptionSimple Fixes Work: Sometimes removing code (the API key) is better than adding it. The most elegant solution leveraged existing fallback behavior
Documentation Gaps: The dual authentication system wasn't documented, leading to user confusion and support burden
Testing Matters: Different code paths for interactive vs programmatic modes should have comprehensive tests
Future Improvements
While the current fix works, a proper solution would involve:
Unified Authentication: Both modes should use the same authentication chain
Clear Error Messages: "Using API authentication but no credits available" would be clearer
Configuration Options:
--auth-mode=subscription
flag would be explicitDocumentation: The dual authentication system should be documented
Conclusion
What started as a simple "Credit balance is too low" error turned into a deep dive through OAuth flows, environment variables, and authentication architectures. The fix - removing an environment variable to force fallback authentication - is elegantly simple but required understanding the complex interplay of systems.
For developers using Claude Max, this fix unlocks the full potential of programmatic access. For the broader community, it's a reminder that even well-designed tools can have authentication blind spots, and sometimes the best debugging tool is patient investigation.
Next Steps
Install the Fix: Save the
claude_max
script to~/.local/bin/
and make it executableTest Your Setup: Run the diagnostics script to ensure everything works
Build Something: Use the examples to create your own automations
Share Feedback: Report issues or improvements to the community
Resources
GitHub Repository: claude-code-auth-fix (includes all scripts and examples)
Official Claude Code Docs: docs.anthropic.com/claude-code
OAuth 2.0 with PKCE: RFC 7636
Acknowledgments
Special thanks to the Anthropic team for creating Claude Code and being receptive to community feedback.
Contributing
If you've found improvements or alternative solutions, please contribute:
Fork the repository
Create a feature branch
Test your changes thoroughly
Submit a pull request with detailed explanation
Remember: when premium features don't work as expected, the problem might not be your configuration - it might be the tool's assumptions about how you'll use it. Together, we can make developer tools better for everyone.
Found this helpful? Star the repo and share with others facing similar issues. Have questions? Open an issue or reach out on Discord.
What This Means for Developers
This breakthrough means Python developers can now:
Use their Max subscription programmatically - No more choosing between the CLI and API
Build sophisticated AI systems - Leverage Claude Code's full capabilities in your applications
Save on API costs - Your $200/month Max subscription now covers both interactive and programmatic use
The Bigger Picture
This work is part of my larger vision for autonomous agent systems. By solving the authentication puzzle, we've removed a significant barrier to building more sophisticated AI-powered development tools. The ability to programmatically access Claude Code through a Max subscription opens up possibilities for:
Multi-agent development teams
Automated code review and refactoring systems
Interactive development environments with AI assistance
Long-running autonomous coding agents
Get Started Today
The claude_max
tool is available now. You can find the complete implementation and documentation in the Claude Code SDK for Python repository.
Feel free to reach out to me on GitHub or Twitter (@arthurcolle) if you have questions or want to discuss autonomous agent architectures.
Arthur Collé is an independent AI researcher working on distributed systems, self-scaffolding AI agents, and interactive RL environment generation. He previously worked at Goldman Sachs and Brainchain AI, and is the creator of the OORL-MO framework and Object library for the BEAM.