Job Search Automation Engine
Built in 2 days. 4 sources, with up to 1,300 LinkedIn role checks per day. Daily review time cut from 90 minutes to 15.
title: Job Search Automation Engine
tagline: Built in 2 days. 4 sources, with up to 1,300 LinkedIn role checks per day. Daily review time cut from 90 minutes to 15.
year: 2026
status: Active
order: 2
stack:
- Python 3
- requests
- openpyxl
- beautifulsoup4
- macOS crontab
highlights:
- 4 active sources (LinkedIn, RemoteOK, We Work Remotely, USAJobs)
- 13 LinkedIn searches × 50 results × 2 runs/day = up to 1,300 role checks daily
- Weighted scoring model (1–10) applied at ingestion
- 569+ roles catalogued; 595 historical applications imported
Problem
In December 2025, following a layoff from Kaiser Permanente, an active job search began across multiple platforms — LinkedIn, Indeed, Dice, ZipRecruiter, USAJobs. The process was immediately fragmented: no single view of open roles, manual copy-paste into spreadsheets, duplicate listings, no scoring, time spent on low-relevance roles.
The friction was not a market problem. It was a design problem.
Strategic objective
Design and deploy an autonomous daily job aggregation engine that:
- Fetches new roles twice daily from multiple sources without manual effort
- Eliminates duplicates across platforms automatically
- Scores each role against a custom relevance profile (target companies, seniority, industry, tools)
- Filters out irrelevant roles — wrong location, wrong title type, clearance-required — at ingestion
- Surfaces the day's highest-priority applications in a clean Excel dashboard
- Requires zero manual data entry for sourced roles
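The cross-platform deduplication objective can be illustrated with a minimal sketch. It assumes each role is a dict with `title` and `company` keys and that two listings count as the same role when their normalized title and company match. The function names and the matching rule are illustrative assumptions, not the engine's actual logic.

```python
import re


def dedup_key(title: str, company: str) -> tuple[str, str]:
    """Build a comparison key that ignores case, punctuation, and spacing."""
    def norm(text: str) -> str:
        return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()

    return norm(title), norm(company)


def drop_duplicates(roles: list[dict]) -> list[dict]:
    """Keep the first occurrence of each (title, company) pair, in order."""
    seen: set[tuple[str, str]] = set()
    unique: list[dict] = []
    for role in roles:
        key = dedup_key(role["title"], role["company"])
        if key not in seen:
            seen.add(key)
            unique.append(role)
    return unique
```

A stricter key (for example, adding location) would keep more near-duplicates; a looser one risks merging distinct roles at the same company.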
Source strategy
| Source | Segment | Method |
|---|---|---|
| LinkedIn | Corporate, healthcare, federal | Guest API (no login) |
| RemoteOK | Remote-first startups & tech | Public JSON API |
| We Work Remotely | Remote management & ops | Public RSS feed |
| USAJobs | Federal government IT roles | REST API |
Indeed, ZipRecruiter, and Dice all returned HTTP 403 within 24 hours of testing. They were replaced with RemoteOK and We Work Remotely; the pipeline's modularity made the swap trivial.
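One way to get this kind of swap-friendly modularity is a small registry of independent fetch functions. The sketch below is an assumption about the shape of such a design, not the engine's code: the `source` decorator, `SOURCES` registry, and `run_all` are hypothetical names, and the fetchers return made-up placeholder data instead of calling any real API.

```python
from typing import Callable

Role = dict[str, str]

# Registry of source name -> fetch function (hypothetical structure).
SOURCES: dict[str, Callable[[], list[Role]]] = {}


def source(name: str):
    """Decorator that registers a fetch function under a source name."""
    def register(fn: Callable[[], list[Role]]):
        SOURCES[name] = fn
        return fn
    return register


@source("remoteok")
def fetch_remoteok() -> list[Role]:
    # Placeholder data; a real fetcher would call the public JSON API.
    return [{"title": "Program Director", "company": "ExampleCo", "source": "remoteok"}]


@source("indeed")
def fetch_indeed() -> list[Role]:
    # Simulates a blocked source such as the HTTP 403 responses described above.
    raise PermissionError("HTTP 403")


def run_all() -> tuple[list[Role], dict[str, str]]:
    """Run every registered source; one failure does not stop the others."""
    roles: list[Role] = []
    errors: dict[str, str] = {}
    for name, fetch in SOURCES.items():
        try:
            roles.extend(fetch())
        except Exception as exc:
            errors[name] = str(exc)
    return roles, errors
```

With this layout, retiring a blocked source means deleting one function, and adding a replacement means writing one new function with the same return shape.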
Relevance scoring model
Each role is scored 1–10 at ingestion using a weighted keyword model applied to title, company, and location.
| Signal Category | Weight | Examples |
|---|---|---|
| Target payer company match | +3 | Humana, UnitedHealth, CareFirst, Elevance Health |
| IT / consulting company match | +2 / +1 | Inovalon, Guidehouse, Deloitte |
| Core role keywords | +2 | PMO, program management, portfolio |
| Target industry keywords | +2 | Healthcare, health plan, payer, managed care |
| Tools & methods | +1 / +2 | ServiceNow, Power BI, SAFe, governance |
| Location match (MD/DC/VA/remote) | +1 | Maryland, Bethesda, Remote |
| Wrong industry / seniority mismatch | -2 to -4 | Construction, retail, junior, coordinator |
| Active security clearance required | Auto-excluded | TS/SCI required, active Top Secret |
| Clearance sponsorship available | Retained as "Can Obtain" | Ability to obtain, clearance upon hire |
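A minimal sketch of how a weighted keyword model like this could be applied. It covers only a subset of the categories, takes one weight per category, and makes several assumptions that the description above does not confirm: a base score of 1, at most one hit counted per category, plain substring matching, and clamping the total to the 1–10 range. All names are hypothetical.

```python
# (weight, keywords) per signal category; subset of the table, for illustration.
RULES = [
    (3, ("humana", "unitedhealth", "carefirst", "elevance")),    # payer match
    (2, ("pmo", "program management", "portfolio")),             # core role
    (2, ("healthcare", "health plan", "payer", "managed care")), # industry
    (1, ("maryland", "bethesda", "remote")),                     # location
    (-2, ("construction", "retail", "junior", "coordinator")),   # mismatch
]

# Phrases that exclude a role outright (active clearance required).
HARD_CLEARANCE = ("ts/sci required", "active top secret")

BASE_SCORE = 1  # assumption: the starting value is not given in the source


def score_role(title: str, company: str, location: str):
    """Return a 1-10 score, or None when the role is auto-excluded."""
    text = f"{title} {company} {location}".lower()
    if any(phrase in text for phrase in HARD_CLEARANCE):
        return None
    total = BASE_SCORE
    for weight, keywords in RULES:
        # Count each category once, however many of its keywords match.
        if any(keyword in text for keyword in keywords):
            total += weight
    return max(1, min(10, total))
```

Substring matching is the simplest option but can misfire on short keywords; matching on word boundaries would be a reasonable refinement.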
LinkedIn search architecture
LinkedIn's guest API rate-limits aggressively when company names appear in keywords. The plan was revised to use 13 role-type searches (no company names in keywords) with payer companies surfaced via the scoring model instead.
- 8 primary searches: PMO, Director, AI Program Manager, Healthcare IT, ServiceNow, VP PMO, Digital Transformation, Health Plan
- 5 supplemental searches: Senior Project Manager IT, Health Plan Program Director, Managed Care IT, Portfolio Governance Director, Program Director Federal Healthcare
- All searches scoped to Senior IC, Manager, Director levels
- Each search fetches 2 pages (50 results) per run, and runs twice daily
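The daily ceiling follows directly from the search plan above. The sketch below works through the arithmetic; the page size of 25 is inferred from "2 pages (50 results)", and the constant and function names are illustrative.

```python
PRIMARY_SEARCHES = [
    "PMO", "Director", "AI Program Manager", "Healthcare IT",
    "ServiceNow", "VP PMO", "Digital Transformation", "Health Plan",
]
SUPPLEMENTAL_SEARCHES = [
    "Senior Project Manager IT", "Health Plan Program Director",
    "Managed Care IT", "Portfolio Governance Director",
    "Program Director Federal Healthcare",
]

PAGES_PER_SEARCH = 2
RESULTS_PER_PAGE = 25  # inferred: 2 pages = 50 results
RUNS_PER_DAY = 2


def page_offsets(pages: int, page_size: int) -> list[int]:
    """Starting offsets for each page of results in one search."""
    return [page * page_size for page in range(pages)]


searches = PRIMARY_SEARCHES + SUPPLEMENTAL_SEARCHES
daily_checks = len(searches) * PAGES_PER_SEARCH * RESULTS_PER_PAGE * RUNS_PER_DAY
print(len(searches), daily_checks)  # 13 1300
```

The figure is an upper bound: a search that returns fewer than 50 results, or the same roles on both runs, yields fewer new roles.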
Tracker architecture
Six-tab Excel workbook:
- Job Tracker — all sourced and manually added roles (19 columns)
- Applied — Pre Apr 28 — 595 historical applications imported, color-coded by status
- Dashboard — pipeline summary
- Search Log — timestamped record of every run
- How To Use — reference guide
- Manual Check — 17 target companies not covered by APIs
The live working file is kept in ~/Documents/ and mirrored to OneDrive after each run via shutil.copy2, so that OneDrive sync conflicts cannot overwrite script output.
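A minimal sketch of the post-run mirroring step, assuming the script writes only to the local file and copies it outward once the run is complete. The function name and both paths are hypothetical; the OneDrive folder location varies by machine, so it is left as a parameter.

```python
import shutil
from pathlib import Path


def mirror_workbook(live_file: Path, mirror_dir: Path) -> Path:
    """Copy the finished workbook into the mirror folder, keeping timestamps."""
    mirror_dir.mkdir(parents=True, exist_ok=True)
    destination = mirror_dir / live_file.name
    # copy2 copies file contents and also attempts to preserve metadata.
    shutil.copy2(live_file, destination)
    return destination
```

Because the copy is one-way and happens only after the run finishes, the synced folder never holds a half-written workbook from the script.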
Quantified value
Time saved: 60–90 min/day → 15–20 min/day. Roughly 5 hours/week redirected to applications and interview prep.
Key metrics (as of May 5, 2026):
- 569+ roles in active tracker
- 595 historical applications imported
- 4 sources running daily
- 13 LinkedIn searches active
- Up to 1,300 LinkedIn role checks per day
- Largest single run: 276 new roles
- Build time: 2 days
- Ongoing maintenance: ~15 min/week
Key insight
The system's resilience came from its modularity. Each source is an independent function. Swapping Indeed for RemoteOK required changing two functions and a config block — the rest of the pipeline was untouched. This is the design principle that made rapid iteration possible — and it's the same principle that makes a portfolio governance operating model survive contact with reality.