GitHub Issue #1: Pulling from Substack is broken
Date: 2026-02-09
The GitHub Actions workflow fetch-substack.yml runs hourly to fetch posts from https://johndamask.substack.com/feed and update _data/substack-posts.yml. Since approximately 2025-11-25 (the date of the last successful auto-update commit), Substack has been returning HTTP 403 Forbidden to every request made from GitHub Actions runners.
The workflow still reports success because the Python script (scripts/fetch_substack.py) catches the 403, prints an error to stdout, and exits with code 0. The “Commit and push” step then sees no changes and skips. The site continues to display the 20 stale posts that were last fetched on 2025-11-25.
Evidence from workflow logs:
Fetching from https://johndamask.substack.com/feed...
Error fetching Substack feed: 403
No posts fetched. Keeping existing file if it exists.
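The silent failure described above comes from treating a failed fetch as a normal outcome. The following is a minimal sketch of one way to surface it, not the actual contents of scripts/fetch_substack.py. It assumes the standard library's urllib in place of requests so that it stays self-contained, and the names `fetch_feed` and `main` are hypothetical.

```python
import sys
import urllib.error
import urllib.request

SUBSTACK_FEED = "https://johndamask.substack.com/feed"


def fetch_feed(url, opener=urllib.request.urlopen):
    """Return the feed body as bytes, or raise RuntimeError on any failure."""
    try:
        with opener(url, timeout=30) as response:
            return response.read()
    except urllib.error.HTTPError as exc:
        # urlopen raises HTTPError for 4xx/5xx responses such as the 403 in the logs.
        # HTTPError is a subclass of URLError, so it must be caught first.
        raise RuntimeError(f"Error fetching Substack feed: {exc.code}") from exc
    except urllib.error.URLError as exc:
        raise RuntimeError(f"Error fetching Substack feed: {exc.reason}") from exc


def main(url=SUBSTACK_FEED, opener=urllib.request.urlopen):
    """Return a process exit code: 0 on success, 1 when the fetch fails."""
    try:
        body = fetch_feed(url, opener)
    except RuntimeError as exc:
        # Write to stderr and report failure so the workflow step is marked failed.
        print(exc, file=sys.stderr)
        return 1
    print(f"Fetched {len(body)} bytes from {url}")
    return 0
```

Ending the script with `sys.exit(main())` would pass the non-zero code to the runner, so a blocked fetch would show up as a failed workflow run instead of a green one.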
Substack blocks HTTP requests originating from GitHub Actions runner IP ranges. This is a known issue affecting CI/CD pipelines across multiple platforms (GitHub Actions, Netlify build runners, etc.).
Key findings:
- The 403 is returned regardless of User-Agent (python-requests, browser-like).
- The requests.get() call in fetch_substack.py (line 31) sends a bare request with no custom headers. Adding a browser-like User-Agent would not help, because the block is IP-based.
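One way to check the User-Agent finding from a given machine is to request the feed with several User-Agent values and compare the status codes. This is a rough diagnostic sketch under stated assumptions: the User-Agent strings are illustrative, the helper names are hypothetical, and urllib stands in for requests. A uniform 403 is consistent with an IP-based block but does not prove it.

```python
import urllib.error
import urllib.request

# Illustrative User-Agent strings; the exact values are assumptions.
USER_AGENTS = {
    "python-requests": "python-requests/2.31.0",
    "browser-like": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
                    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36",
}


def status_for(url, user_agent, opener=urllib.request.urlopen):
    """Return the HTTP status code received when using the given User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with opener(request, timeout=30) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses arrive as exceptions; the code is what we want.
        return exc.code


def probe(url, user_agents=USER_AGENTS, opener=urllib.request.urlopen):
    """Map each User-Agent label to the status code it received."""
    return {label: status_for(url, ua, opener) for label, ua in user_agents.items()}


def consistent_with_ip_block(results):
    """True when every User-Agent variant received a 403."""
    return bool(results) and all(code == 403 for code in results.values())
```

Running `probe("https://johndamask.substack.com/feed")` once from a GitHub Actions runner and once from a local machine would show whether the result depends on where the request originates and not on the headers.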
The codebase already uses a CloudFlare Worker pattern for the audio transcriber tool (openai-proxy.jbdamask.workers.dev). A similar lightweight worker can proxy the Substack RSS feed request, since CloudFlare Worker egress IPs are not blocked by Substack.
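From the script's side, the proxy approach amounts to pointing the feed URL at the Worker. The sketch below shows one possible shape for that change. The Worker hostname is a placeholder, and the `SUBSTACK_FEED_URL` environment variable and `resolve_feed_url` helper are assumptions added for illustration, not part of the existing script.

```python
import os

DIRECT_FEED = "https://johndamask.substack.com/feed"

# Placeholder Worker URL: the real hostname depends on how the Worker is deployed.
DEFAULT_PROXY_FEED = "https://substack-proxy.example.workers.dev/substack-feed"


def resolve_feed_url(environ=os.environ):
    """Use an explicit override when set, otherwise the proxy URL."""
    # SUBSTACK_FEED_URL is a hypothetical variable name for local testing.
    override = environ.get("SUBSTACK_FEED_URL", "").strip()
    return override or DEFAULT_PROXY_FEED


# The rest of the script would keep reading SUBSTACK_FEED as before.
SUBSTACK_FEED = resolve_feed_url()
```

Keeping an override makes it possible to run the script locally against the direct Substack URL, where the block does not apply, without editing the file.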
Changes required:
A minimal worker that:
- Accepts requests at a single route (GET /substack-feed)
- Fetches https://johndamask.substack.com/feed server-side

The remaining changes:
- scripts/fetch_substack.py: change the SUBSTACK_FEED URL from the direct Substack URL to the CloudFlare Worker URL
- scripts/fetch_substack.py error handling
- fetch-substack.yml requires no modifications

| Approach | Pros | Cons |
|---|---|---|
| CloudFlare Worker proxy (recommended) | Already a pattern in this codebase; free tier; reliable; fast | Requires CloudFlare account setup (already exists) |
| Run fetch locally on a schedule | No proxy needed | Requires local machine to be running; not automated |
| Use a generic CORS/RSS proxy service | No infrastructure to manage | Third-party dependency; rate limits; reliability concerns |
| Self-hosted proxy on another cloud | Full control | Over-engineered for this use case |
| Substack API with auth token | Direct access | Substack has no official public API; fragile |
Sources: