Nihal Naidu
2026·Active

Job Search Automation Engine

Built in 2 days. Up to 1,300 LinkedIn role checks per day, plus three other sources. Daily review time cut from 90 minutes to 15.

Python 3 · requests · openpyxl · beautifulsoup4 · macOS crontab

Highlights

  • 4 active sources (LinkedIn, RemoteOK, We Work Remotely, USAJobs)
  • 13 LinkedIn searches × 50 results × 2 runs/day = up to 1,300 role checks daily
  • Weighted scoring model (1–10) applied at ingestion
  • 569+ roles catalogued; 595 historical applications imported

Problem

In December 2025, a layoff from Kaiser Permanente triggered an active job search across multiple platforms: LinkedIn, Indeed, Dice, ZipRecruiter, and USAJobs. The process was fragmented from the start: no single view of open roles, manual copy-paste into spreadsheets, duplicate listings, no scoring, and time lost on low-relevance roles.

The friction was not a market problem. It was a design problem.

Strategic objective

Design and deploy an autonomous daily job aggregation engine that:

  • Fetches new roles twice daily from multiple sources without manual effort
  • Eliminates duplicates across platforms automatically
  • Scores each role against a custom relevance profile (target companies, seniority, industry, tools)
  • Filters out irrelevant roles — wrong location, wrong title type, clearance-required — at ingestion
  • Surfaces the day's highest-priority applications in a clean Excel dashboard
  • Requires zero manual data entry for sourced roles
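
One way to approach the cross-platform duplicate elimination listed above is to key each role on a normalized (company, title, location) tuple. This is a minimal sketch under that assumption, not the engine's actual code; the field names and the `role_key` and `dedupe` helpers are hypothetical.

```python
import re


def role_key(role: dict) -> tuple:
    """Build a comparison key from normalized company, title, and location."""
    def norm(text: str) -> str:
        # Lowercase, turn punctuation into spaces, collapse repeated whitespace.
        return " ".join(re.sub(r"[^a-z0-9]+", " ", text.lower()).split())
    return (norm(role["company"]), norm(role["title"]), norm(role["location"]))


def dedupe(roles: list[dict]) -> list[dict]:
    """Keep the first occurrence of each role, preserving input order."""
    seen = set()
    unique = []
    for role in roles:
        key = role_key(role)
        if key not in seen:
            seen.add(key)
            unique.append(role)
    return unique


# Illustrative sample data (hypothetical listings).
roles = [
    {"company": "Humana", "title": "Director, PMO", "location": "Remote", "source": "LinkedIn"},
    {"company": "humana", "title": "Director PMO", "location": "remote", "source": "RemoteOK"},
    {"company": "Deloitte", "title": "Program Manager", "location": "Bethesda, MD", "source": "LinkedIn"},
]
unique_roles = dedupe(roles)
```

Here the two Humana listings collapse into one because they differ only in case and punctuation. A stricter or looser key (for example, ignoring location) is a design choice that trades missed duplicates against false merges.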

Source strategy

| Source | Segment | Method |
|---|---|---|
| LinkedIn | Corporate, healthcare, federal | Guest API (no login) |
| RemoteOK | Remote-first startups & tech | Public JSON API |
| We Work Remotely | Remote management & ops | Public RSS feed |
| USAJobs | Federal government IT roles | REST API |

Indeed, ZipRecruiter, and Dice all returned HTTP 403 within 24 hours of testing. They were replaced with RemoteOK and We Work Remotely; because each source is an independent function, the swap was trivial.
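
The source swap described here can be pictured as a registry of independent fetch functions. The sketch below is an assumption about the shape of such a design, not the project's code: the function names, the `SOURCES` registry, and the stubbed return values are all hypothetical, and real fetchers would call each source's API or feed.

```python
from typing import Callable

Role = dict
Fetcher = Callable[[], list[Role]]


def fetch_remoteok() -> list[Role]:
    # Placeholder: a real fetcher would request and parse the public JSON API.
    return [{"title": "Program Director", "company": "ExampleCo", "source": "RemoteOK"}]


def fetch_wwr() -> list[Role]:
    # Placeholder: a real fetcher would parse the public RSS feed.
    return [{"title": "Head of PMO", "company": "SampleOrg", "source": "We Work Remotely"}]


def fetch_blocked() -> list[Role]:
    # Simulates a source that starts refusing requests (for example, HTTP 403).
    raise RuntimeError("HTTP 403")


# Swapping a source means editing this mapping and its fetch function only.
SOURCES: dict[str, Fetcher] = {
    "RemoteOK": fetch_remoteok,
    "We Work Remotely": fetch_wwr,
}


def run_all(sources: dict[str, Fetcher]) -> tuple[list[Role], dict[str, str]]:
    """Run every fetcher; a failing source is recorded without stopping the others."""
    roles: list[Role] = []
    errors: dict[str, str] = {}
    for name, fetch in sources.items():
        try:
            roles.extend(fetch())
        except Exception as exc:
            errors[name] = str(exc)
    return roles, errors
```

Because each fetcher is isolated behind the same signature, a blocked source shows up as an entry in `errors` while the remaining sources still deliver roles.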

Relevance scoring model

Each role is scored 1–10 at ingestion using a weighted keyword model applied to title, company, and location.

| Signal Category | Weight | Examples |
|---|---|---|
| Target payer company match | +3 | Humana, UnitedHealth, CareFirst, ElevanceHealth |
| IT / consulting company match | +2 / +1 | Inovalon, Guidehouse, Deloitte |
| Core role keywords | +2 | PMO, program management, portfolio |
| Target industry keywords | +2 | Healthcare, health plan, payer, managed care |
| Tools & methods | +1 / +2 | ServiceNow, Power BI, SAFe, governance |
| Location match (MD/DC/VA/remote) | +1 | Maryland, Bethesda, Remote |
| Wrong industry / seniority mismatch | -2 to -4 | Construction, retail, junior, coordinator |
| Active security clearance required | Auto-excluded | TS/SCI required, active Top Secret |
| Clearance sponsorship available | Retained as "Can Obtain" | Ability to obtain, clearance upon hire |
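
A weighted keyword model like the one in the table could be sketched as follows. The weights and example keywords come from the table; the base score of 1, counting each category at most once, clamping to the 1–10 range, and picking a single weight where the table gives a range are all assumptions made for illustration, and `score_role` is a hypothetical name.

```python
# (weight, keywords) per signal category. Keyword lists are abbreviated.
SIGNALS = [
    (3, ["humana", "unitedhealth", "carefirst", "elevance"]),      # target payer
    (2, ["inovalon", "guidehouse", "deloitte"]),                   # IT / consulting (table: +2 / +1)
    (2, ["pmo", "program management", "portfolio"]),               # core role
    (2, ["healthcare", "health plan", "payer", "managed care"]),   # target industry
    (1, ["servicenow", "power bi", "safe", "governance"]),         # tools (table: +1 / +2)
    (1, ["maryland", "bethesda", "remote"]),                       # location
    (-2, ["construction", "retail", "junior", "coordinator"]),     # mismatch (table: -2 to -4)
]
EXCLUDE = ["ts/sci required", "active top secret"]
CAN_OBTAIN = ["ability to obtain", "clearance upon hire"]


def score_role(title: str, company: str, location: str):
    """Return (score, clearance_label); score is None for auto-excluded roles."""
    text = f"{title} {company} {location}".lower()
    if any(phrase in text for phrase in EXCLUDE):
        return None, "Excluded"
    clearance = "Can Obtain" if any(phrase in text for phrase in CAN_OBTAIN) else ""
    score = 1  # assumed base score
    for weight, keywords in SIGNALS:
        # Plain substring matching: simple, but "safe" would also match "safety".
        if any(keyword in text for keyword in keywords):
            score += weight
    return max(1, min(10, score)), clearance


strong = score_role("Director, PMO", "Humana", "Remote")
weak = score_role("Junior Project Coordinator", "BuildCo Construction", "Texas")
cleared = score_role("Program Manager (active Top Secret)", "Guidehouse", "DC")
sponsor = score_role("PMO Director, ability to obtain clearance", "Deloitte", "Bethesda, MD")
```

Under these assumptions the Humana PMO role scores 7 (base 1, payer +3, core role +2, location +1), the junior coordinator role is clamped to 1, the active-clearance role is excluded before scoring, and the sponsorship role is kept with a "Can Obtain" label.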

LinkedIn search architecture

LinkedIn's guest API rate-limits aggressively when company names appear in keywords. The plan was revised to use 13 role-type searches (no company names in keywords) with payer companies surfaced via the scoring model instead.

  • 8 primary searches: PMO, Director, AI Program Manager, Healthcare IT, ServiceNow, VP PMO, Digital Transformation, Health Plan
  • 5 supplemental searches: Senior Project Manager IT, Health Plan Program Director, Managed Care IT, Portfolio Governance Director, Program Director Federal Healthcare
  • All searches scoped to Senior IC, Manager, Director levels
  • Each search fetches 2 pages (50 results) — twice daily
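
The daily ceiling follows directly from this search plan. The sketch below reproduces the arithmetic using the search names listed above; the page size of 25 is inferred from "2 pages (50 results)", and the offset-based pagination in `build_requests` is a hypothetical illustration rather than a confirmed detail of LinkedIn's guest API.

```python
PRIMARY = [
    "PMO", "Director", "AI Program Manager", "Healthcare IT",
    "ServiceNow", "VP PMO", "Digital Transformation", "Health Plan",
]
SUPPLEMENTAL = [
    "Senior Project Manager IT", "Health Plan Program Director",
    "Managed Care IT", "Portfolio Governance Director",
    "Program Director Federal Healthcare",
]
PAGES_PER_SEARCH = 2
RESULTS_PER_PAGE = 25  # inferred: 2 pages = 50 results
RUNS_PER_DAY = 2


def build_requests(searches: list[str]) -> list[tuple[str, int]]:
    """Return one (keywords, result offset) pair per page fetched in a single run."""
    return [
        (keywords, page * RESULTS_PER_PAGE)
        for keywords in searches
        for page in range(PAGES_PER_SEARCH)
    ]


searches = PRIMARY + SUPPLEMENTAL
per_run = build_requests(searches)                             # 13 searches × 2 pages
daily_ceiling = len(per_run) * RESULTS_PER_PAGE * RUNS_PER_DAY
```

With 13 searches this gives 26 page requests per run and a ceiling of 1,300 role checks per day. The real count is lower whenever a search returns fewer than 50 results.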

Tracker architecture

Six-tab Excel workbook:

  • Job Tracker — all sourced and manually added roles (19 columns)
  • Applied — Pre Apr 28 — 595 historical applications imported, color-coded by status
  • Dashboard — pipeline summary
  • Search Log — timestamped record of every run
  • How To Use — reference guide
  • Manual Check — 17 target companies not covered by APIs

The live working file lives in ~/Documents/ and is mirrored to OneDrive after each run via shutil.copy2, so that OneDrive sync conflicts cannot overwrite script output.

Quantified value

Daily review time: 60–90 min/day → 15–20 min/day. Roughly 5 hours/week redirected to applications and interview prep.

Key metrics (as of May 5, 2026):

  • 569+ roles in active tracker
  • 595 historical applications imported
  • 4 sources running daily
  • 13 LinkedIn searches active
  • Up to 1,300 LinkedIn role checks per day
  • Largest single run: 276 new roles
  • Build time: 2 days
  • Ongoing maintenance: ~15 min/week

Key insight

The system's resilience came from its modularity. Each source is an independent function. Swapping Indeed for RemoteOK required changing two functions and a config block — the rest of the pipeline was untouched. This is the design principle that made rapid iteration possible — and it's the same principle that makes a portfolio governance operating model survive contact with reality.