holo-q/git-mog

git‑mog

Find, rank, and close long‑open, high‑impact GitHub issues. git‑mog helps contributors identify issues that are old, high‑traffic, and underserved—then assemble focused worklists that are respectful of maintainers and effective for contributors.


Why git‑mog

Open‑source projects often carry long‑open issues that attract many users, comments, and reactions but lack a clear owner. These are excellent opportunities for contributors who want to make meaningful, visible impact without creating noise.

git‑mog provides:

  • A reproducible way to surface “spicy” issues across the entire GitHub ecosystem.
  • SQLite caching so you can refresh incrementally and work offline.
  • Objective rankings (bounty, heat, stars, age) and diversified sampling to avoid dog‑piling a single repository.
  • Ethical defaults that keep interactions constructive.

What it finds

By default, git‑mog targets issues that are:

  • Aged – open for years, not weeks.
  • Demanded – many comments/reactions; user interest is sustained.
  • Unowned – no assignee and not already linked to a closing PR.
  • Concrete – problems or small enhancements rather than meta/tracking threads.

You can further focus by language, labels (e.g., good first issue), or custom WHERE clauses.
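As a rough illustration, the default criteria above map onto GitHub search qualifiers. The helper below is a hypothetical sketch (the name `build_query` and its parameters are not part of git‑mog); it composes the same query string used in the Quick start.

```python
from datetime import date


def build_query(created_before: date, min_comments: int, min_reactions: int,
                unowned: bool = True) -> str:
    """Compose a GitHub issue-search string from the default criteria."""
    parts = ["is:issue", "is:open", "archived:false"]
    if unowned:
        # Unowned: no linked closing PR and no assignee.
        parts += ["-linked:pr", "no:assignee"]
    # Aged: created before a cutoff date.
    parts.append(f"created:<{created_before.isoformat()}")
    # Demanded: sustained comments and reactions.
    parts.append(f"comments:>{min_comments}")
    parts.append(f"reactions:>{min_reactions}")
    return " ".join(parts)


query = build_query(date(2021, 1, 1), min_comments=30, min_reactions=50)
print(query)
```

The "Concrete" criterion has no direct qualifier here; it is more naturally handled with label filters at ranking time.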


How it works

git‑mog provides two workflows:

  • One‑shot report: issue_hunter.py → Query GitHub’s GraphQL search, score issues, print a table, write CSV/JSON.

  • Persistent cache + ranking: issue_hunter_db.py → Same search capability, plus a SQLite cache, comment heat sampling, multiple ranking modes, export, and curated sampling.

flowchart LR
  A[Search queries] -->|GraphQL| B[Fetcher]
  B --> C[(SQLite cache)]
  C --> D[Ranker]
  D --> E[Worklists / CSV / JSON]
  C --> F["Heat sampler (comments)"]
  F -->|TTL / updatedAt| C

No scraping. git‑mog uses GitHub’s official APIs and search qualifiers.
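The fetch step can be pictured as cursor-based pagination over GitHub's GraphQL `search` connection. This is a minimal sketch, not git‑mog's actual fetcher: the selected fields are an illustrative subset, and `search_issues` and the injected `post` transport are hypothetical names chosen so the loop can be exercised without network access.

```python
from typing import Callable, Dict, Iterator, Optional

# GraphQL document for GitHub's search connection (illustrative field subset).
SEARCH_GQL = """
query($q: String!, $cursor: String) {
  search(query: $q, type: ISSUE, first: 100, after: $cursor) {
    pageInfo { hasNextPage endCursor }
    nodes {
      ... on Issue {
        number title url createdAt updatedAt
        comments { totalCount }
        reactions { totalCount }
        repository { nameWithOwner stargazerCount }
      }
    }
  }
}
"""

Transport = Callable[[Dict], Dict]


def search_issues(q: str, limit: int, post: Transport) -> Iterator[Dict]:
    """Yield up to `limit` issue nodes, following endCursor between pages."""
    cursor: Optional[str] = None
    seen = 0
    while seen < limit:
        payload = {"query": SEARCH_GQL, "variables": {"q": q, "cursor": cursor}}
        page = post(payload)["data"]["search"]
        for node in page["nodes"]:
            if seen >= limit:
                return
            yield node
            seen += 1
        if not page["pageInfo"]["hasNextPage"]:
            return
        cursor = page["pageInfo"]["endCursor"]
```

In practice `post` would wrap an authenticated HTTP POST to `https://api.github.com/graphql` and return the decoded JSON body.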


Install

Requires Python 3.9+.

pip install -U requests tabulate python-dateutil

Clone or copy the scripts into your repo:

git-mog/
  ├─ issue_hunter.py
  ├─ issue_hunter_db.py
  └─ README.md

(Optional) Run via pipx or inside a virtualenv.

Using uvx (no install):

uvx git-mog --help

Authentication

Create a GitHub Personal Access Token (classic is fine) with public_repo scope for public issues.

export GITHUB_TOKEN=YOUR_TOKEN   # or GH_TOKEN

Keep tokens out of shell history and CI logs. Use environment secrets for CI.

git‑mog also supports a local .env file at the repo root. Example .env:

GITHUB_TOKEN=ghp_abc123...

The CLI will auto-load .env if present.
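For illustration, token lookup could work roughly as follows. This is a sketch under assumptions: the function name `load_token` is hypothetical, and the precedence shown (environment variables first, then .env) is a guess rather than documented behavior.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

TOKEN_NAMES = ("GITHUB_TOKEN", "GH_TOKEN")


def load_token(env: Mapping[str, str] = os.environ,
               dotenv_path: Path = Path(".env")) -> Optional[str]:
    """Return a token from the environment, else from a simple .env file."""
    for name in TOKEN_NAMES:
        if env.get(name):
            return env[name]
    if dotenv_path.is_file():
        for raw in dotenv_path.read_text().splitlines():
            line = raw.strip()
            # Skip blanks, comments, and lines that are not KEY=VALUE.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            if key.strip() in TOKEN_NAMES:
                return value.strip().strip("'\"")
    return None
```

Add .env to .gitignore so the token is never committed.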


Quick start

One‑shot report

python issue_hunter.py \
  --query 'is:issue is:open archived:false -linked:pr no:assignee created:<2021-01-01 comments:>30 reactions:>50' \
  --sample-comments 30 \
  --outfile spicy_issues

Cache + rank workflow

# 1) Fetch to SQLite and sample comment heat (default DB: ~/.git-mog/issues.sqlite)
python issue_hunter_db.py fetch \
  --query 'is:issue is:open archived:false -linked:pr no:assignee created:<2021-01-01 comments:>30 reactions:>50' \
  --query 'is:issue is:open created:<2021-01-01 comments:>30 reactions:>50 in:comments ("any update" OR "still not fixed" OR "please prioritize" OR "ETA?")' \
  --limit 300 \
  --sample-comments 30

# 2) Rank and show top 40, max 2 per repo
python issue_hunter_db.py rank --order bounty --top 40 --per-repo 2

uvx quick start

Run without installing by using uvx. The CLI auto-loads .env for GITHUB_TOKEN.

# 1) Fetch to SQLite with comment heat sampling (DB CLI)
uvx git-mog-db fetch \
  --query 'is:issue is:open archived:false -linked:pr no:assignee created:<2021-01-01 comments:>30 reactions:>50' \
  --limit 300 \
  --sample-comments 30

# 2) Rank and show a focused, diverse list
uvx git-mog-db rank --order bounty --top 40 --per-repo 1

# 3) Export results to CSV
uvx git-mog-db export --order bounty --top 200 --per-repo 2 --out out/targets.csv

# Optional: pin a version when running
uvx --from git-mog==0.1.0 git-mog-db --help

Commands

By default, the SQLite cache lives at ~/.git-mog/issues.sqlite. You can override with --db or change the base directory via MOG_HOME. In restricted environments (e.g., some sandboxes/CI), the tool may fall back to ./issues.sqlite if the home directory isn't writable.
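The lookup order described above might be sketched like this. The function name `resolve_db_path` is hypothetical, and the exact precedence (`--db` over `MOG_HOME` over the home directory) is an assumption based on the paragraph above.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_db_path(cli_db: Optional[str] = None,
                    env: Mapping[str, str] = os.environ) -> Path:
    """Pick the cache location: --db, then MOG_HOME, then ~/.git-mog."""
    if cli_db:
        return Path(cli_db)
    base = Path(env["MOG_HOME"]) if env.get("MOG_HOME") else Path.home() / ".git-mog"
    try:
        base.mkdir(parents=True, exist_ok=True)
        if os.access(base, os.W_OK):
            return base / "issues.sqlite"
    except OSError:
        pass
    # Fallback for sandboxes/CI where the base directory is not writable.
    return Path("issues.sqlite")
```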

fetch

Run one or more GitHub issue searches, page through the results (up to --limit per query), upsert them into SQLite, and optionally sample recent comments to count “heat” phrases.

python issue_hunter_db.py fetch --db issues.sqlite \
  --query 'is:issue is:open archived:false -linked:pr no:assignee created:<2021-01-01 comments:>30 reactions:>50' \
  --limit 300 \
  --sample-comments 30 \
  --heat 'any update,still not fixed,please prioritize,ETA?' \
  --heat-ttl-days 7

Key fields in the cache:

  • Repository, issue number, title, URL
  • created_at, updated_at, age_days
  • comments, reactions, locked, assignees
  • labels, language, stars, author
  • heat_hits, last_comment_sampled_at
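To make the upsert step concrete, here is a minimal sketch using a reduced table. The schema is an assumption covering only some of the fields listed above (the real table layout may differ), and the `ON CONFLICT` upsert syntax requires SQLite 3.24 or newer.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS issues (
    repo TEXT NOT NULL,
    number INTEGER NOT NULL,
    title TEXT,
    url TEXT,
    created_at TEXT,
    updated_at TEXT,
    comments INTEGER DEFAULT 0,
    reactions INTEGER DEFAULT 0,
    stars INTEGER DEFAULT 0,
    heat_hits INTEGER DEFAULT 0,
    last_comment_sampled_at TEXT,
    PRIMARY KEY (repo, number)
)
"""

UPSERT = """
INSERT INTO issues (repo, number, title, url, created_at, updated_at,
                    comments, reactions, stars)
VALUES (:repo, :number, :title, :url, :created_at, :updated_at,
        :comments, :reactions, :stars)
ON CONFLICT(repo, number) DO UPDATE SET
    title = excluded.title,
    updated_at = excluded.updated_at,
    comments = excluded.comments,
    reactions = excluded.reactions,
    stars = excluded.stars
"""


def upsert_issue(conn: sqlite3.Connection, row: dict) -> None:
    """Insert a new issue or refresh the mutable fields of a cached one."""
    conn.execute(UPSERT, row)
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
row = {"repo": "octo/demo", "number": 7, "title": "Crash on start",
       "url": "https://github.com/octo/demo/issues/7",
       "created_at": "2019-05-01T00:00:00Z", "updated_at": "2024-01-01T00:00:00Z",
       "comments": 40, "reactions": 60, "stars": 1500}
upsert_issue(conn, row)
upsert_issue(conn, {**row, "comments": 45})  # second fetch refreshes the row
```

Because the upsert leaves `heat_hits` and `last_comment_sampled_at` untouched, a refresh does not discard earlier heat sampling.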

rank

Compute scores and print a table. Supports bounty, heat, stars, age, or custom orderings; optional per‑repo cap for diversity.

python issue_hunter_db.py rank --order bounty \
  --top 50 --per-repo 2 \
  --include-label 'good first issue' \
  --exclude-label 'tracking'
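The per‑repo cap can be understood as a single pass over an already ranked list. The sketch below is illustrative only: `cap_per_repo` is a hypothetical name, and applying the cap before the `--top` cut is an assumption about ordering.

```python
from collections import Counter
from typing import Dict, Iterable, List


def cap_per_repo(ranked: Iterable[Dict], per_repo: int, top: int) -> List[Dict]:
    """Walk a ranked list, keeping at most `per_repo` items per repository."""
    kept: List[Dict] = []
    counts: Counter = Counter()
    for issue in ranked:
        if len(kept) >= top:
            break
        if counts[issue["repo"]] >= per_repo:
            continue  # this repository already has its share
        kept.append(issue)
        counts[issue["repo"]] += 1
    return kept


ranked = [
    {"repo": "a/x", "number": 1},
    {"repo": "a/x", "number": 2},
    {"repo": "a/x", "number": 3},
    {"repo": "b/y", "number": 7},
    {"repo": "c/z", "number": 9},
]
picked = cap_per_repo(ranked, per_repo=2, top=3)
print([(i["repo"], i["number"]) for i in picked])
# [('a/x', 1), ('a/x', 2), ('b/y', 7)]
```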

enrich-repos

Enrich repositories referenced in the cache for community‑aware ranking: forks, watchers, open issues, owner type/followers, and recent active contributors (unique authors in the last N days).

python issue_hunter_db.py enrich-repos \
  --since-days 90 --ttl-days 7 --max-repos 300 --max-commits 500

Run this before using --order community (see Ranking modes) so repo signals are available. Values are cached with TTLs to avoid rate limits.


sample

Alias of rank for building a worklist: returns --k items, subject to the --per-repo cap.

python issue_hunter_db.py sample --db issues.sqlite --order heat --k 15 --per-repo 1

export

Write the ranked set to CSV or JSON.

python issue_hunter_db.py export --db issues.sqlite \
  --order stars --top 200 --per-repo 1 \
  --out star_targets.csv

resample-comments

Recompute heat_hits for issues that changed (updatedAt) or exceeded the TTL. Supports the same --heat phrases as fetch for consistency.

python issue_hunter_db.py resample-comments --db issues.sqlite \
  --sample-comments 40 --heat-ttl-days 7 --max 200 \
  --heat 'any update,still not fixed,please prioritize,ETA?'
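The "changed or expired" rule could be expressed as a small predicate. This is a sketch assuming timestamps are timezone‑aware datetimes; `needs_resample` is a hypothetical name, not a function exposed by the scripts.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def needs_resample(updated_at: datetime,
                   last_sampled_at: Optional[datetime],
                   ttl_days: int,
                   now: Optional[datetime] = None) -> bool:
    """True if comment heat should be recomputed for a cached issue."""
    now = now or datetime.now(timezone.utc)
    if last_sampled_at is None:
        return True  # never sampled
    if updated_at > last_sampled_at:
        return True  # issue changed since the last sample
    # Otherwise resample only once the TTL has elapsed.
    return now - last_sampled_at > timedelta(days=ttl_days)
```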

info

Quick database stats.

python issue_hunter_db.py info --db issues.sqlite

Ranking modes

  • bounty (default): balances age, engagement (comments + reactions), heat, lock state, stars; slight penalty if assigned.
  • heat: prioritizes heat_hits and overall engagement; suitable for “user‑pain” triage.
  • stars: emphasizes repository visibility; useful for outreach and impact.
  • age: surfaces the oldest long‑running items.
  • custom: tune your own weights:
python issue_hunter_db.py rank --db issues.sqlite --order custom \
  --weights 'heat=2.2,stars=0.9,age=0.8,activity=1.0,locked=0.4,assignee_penalty=0.8' \
  --top 50 --per-repo 1

All scores are transparent and computed from fields stored in SQLite. Adjust to match your goals.
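To show how a `--weights` string might translate into a score, here is a hypothetical sketch. The parsing mirrors the `name=value` format shown above, but the scoring formula (a weighted sum with `log1p` damping) is purely an assumption for illustration and is not the formula git‑mog uses.

```python
import math
from typing import Dict


def parse_weights(spec: str) -> Dict[str, float]:
    """Parse 'heat=2.2,stars=0.9' into {'heat': 2.2, 'stars': 0.9}."""
    weights: Dict[str, float] = {}
    for item in spec.split(","):
        name, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected name=value, got {item!r}")
        weights[name.strip()] = float(value)
    return weights


def score(issue: Dict, w: Dict[str, float]) -> float:
    """Hypothetical weighted sum; log1p damps very large counts."""
    total = (
        w.get("heat", 0.0) * math.log1p(issue["heat_hits"])
        + w.get("stars", 0.0) * math.log1p(issue["stars"])
        + w.get("age", 0.0) * math.log1p(issue["age_days"])
        + w.get("activity", 0.0) * math.log1p(issue["comments"] + issue["reactions"])
        + w.get("locked", 0.0) * (1.0 if issue["locked"] else 0.0)
    )
    if issue["assignees"]:
        total -= w.get("assignee_penalty", 0.0)  # assumed flat penalty
    return total


weights = parse_weights(
    "heat=2.2,stars=0.9,age=0.8,activity=1.0,locked=0.4,assignee_penalty=0.8")
```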

Community‑aware ranking (experimental)

python issue_hunter_db.py enrich-repos --since-days 90 --ttl-days 7 --max-repos 300
python issue_hunter_db.py rank --order community --top 40 --per-repo 1
  • Applies small‑community bonuses, big‑repo soft penalties, and optional language bonuses (from ~/.git-mog/config.toml under [language_weights]).
  • Requires enrich-repos to populate the repos table.

Filtering & targeting

  • Language focus: --language Python --language TypeScript
  • Labels: --include-label 'good first issue' and/or --exclude-label 'tracking'
  • Advanced: --where accepts a raw SQLite predicate, e.g. --where "comments > 40 AND stars > 500"
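A raw `--where` predicate is presumably spliced into the ranking query. The sketch below shows one way that could look, using a toy `issues` table; `select_issues` and the reduced schema are hypothetical, and because the predicate is inserted verbatim it should only ever come from a trusted user.

```python
import sqlite3
from typing import List, Optional, Tuple


def select_issues(conn: sqlite3.Connection, where: Optional[str] = None,
                  limit: int = 50) -> List[Tuple]:
    """Apply an optional raw predicate on top of a fixed base query."""
    sql = "SELECT repo, number, comments, stars FROM issues"
    if where:
        # Spliced in verbatim: trusted input only.
        sql += f" WHERE ({where})"
    sql += " ORDER BY comments DESC LIMIT ?"
    return conn.execute(sql, (limit,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE issues (repo TEXT, number INTEGER, comments INTEGER, stars INTEGER)")
conn.executemany("INSERT INTO issues VALUES (?, ?, ?, ?)", [
    ("a/x", 1, 80, 1200),
    ("b/y", 2, 45, 300),
    ("c/z", 3, 20, 9000),
])
rows = select_issues(conn, "comments > 40 AND stars > 500")
print(rows)  # [('a/x', 1, 80, 1200)]
```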

Search query ideas (for fetch):

is:issue is:open archived:false -linked:pr no:assignee created:<2021-01-01 comments:>30 reactions:>50
is:issue is:open created:<2021-01-01 comments:>30 reactions:>50 in:comments ("any update" OR "still not fixed" OR "please prioritize" OR "ETA?")
is:issue is:open label:"good first issue" created:<2021-01-01 comments:>10 reactions:>20
is:issue is:open is:locked created:<2021-01-01 comments:>50

Recipes

Create two buckets—“Viral impact” and “Fast closes”

# Viral: heat + stars heavy
python issue_hunter_db.py rank --db issues.sqlite --order custom \
  --weights 'heat=2.5,stars=1.2,age=0.8,activity=1.0,locked=0.5,assignee_penalty=1.0' \
  --top 30 --per-repo 1

# Fast closes: older + unassigned + simple labels
python issue_hunter_db.py rank --db issues.sqlite --order custom \
  --weights 'heat=1.0,stars=0.3,age=1.4,activity=0.8,locked=0.2,assignee_penalty=1.2' \
  --include-label 'good first issue' \
  --exclude-label 'tracking' \
  --top 30 --per-repo 2

Export a weekly target list

python issue_hunter_db.py export --db issues.sqlite --order bounty --top 100 \
  --per-repo 2 --out weekly_targets.csv

Automation

Cron

# Refresh every morning, resample heat, and export.
# A crontab entry must fit on one line (backslash continuations are not supported),
# and the token is exported so every command in the chain can read it.
15 7 * * * cd /path/to/git-mog && export GITHUB_TOKEN=... && python issue_hunter_db.py fetch --db issues.sqlite --query 'is:issue is:open archived:false -linked:pr no:assignee created:<2021-01-01 comments:>30 reactions:>50' --limit 300 --sample-comments 30 && python issue_hunter_db.py resample-comments --db issues.sqlite --sample-comments 40 && python issue_hunter_db.py export --db issues.sqlite --order bounty --top 200 --per-repo 2 --out out/targets.csv

GitHub Actions (example)

name: git-mog weekly
on:
  schedule: [{cron: '15 7 * * 1'}]
  workflow_dispatch: {}
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with: {python-version: '3.11'}
      - run: pip install requests tabulate python-dateutil
      - env:
          GITHUB_TOKEN: ${{ secrets.MOG_TOKEN }}
        run: |
          python issue_hunter_db.py fetch --db issues.sqlite \
            --query 'is:issue is:open archived:false -linked:pr no:assignee created:<2021-01-01 comments:>30 reactions:>50' \
            --limit 300 --sample-comments 30
          python issue_hunter_db.py export --db issues.sqlite \
            --order bounty --top 200 --per-repo 2 --out out/targets.csv
      - uses: actions/upload-artifact@v4
        with:
          name: git-mog-report
          path: out/targets.csv

Design notes

  • APIs, not scraping: Uses GitHub’s GraphQL search and issue metadata.
  • Pagination: Fetches up to --limit issues per query, paging through GraphQL results 100 at a time (first: 100) under the hood.
  • Heat phrases: Default list includes “any update”, “still not fixed”, “please prioritize”, “ETA?”, etc. You can customize via --heat.
  • TTL & change detection: Re‑samples comment heat if the issue’s updatedAt changed or the TTL elapsed.
  • Diversity: --per-repo limits items per repository after ranking to avoid dog‑piles.
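Heat sampling, as described in the notes above, amounts to phrase matching over recent comments. A minimal sketch follows, assuming a hit is counted once per comment that contains any phrase (the real counting rule may differ); `parse_heat` and `count_heat_hits` are hypothetical names.

```python
from typing import Iterable, List


def parse_heat(spec: str) -> List[str]:
    """Split a comma-separated --heat value into lowercase phrases."""
    return [p.strip().lower() for p in spec.split(",") if p.strip()]


def count_heat_hits(comments: Iterable[str], phrases: List[str]) -> int:
    """Count comments containing at least one heat phrase (case-insensitive)."""
    hits = 0
    for body in comments:
        text = body.lower()
        if any(p in text for p in phrases):
            hits += 1
    return hits


phrases = parse_heat("any update,still not fixed,please prioritize,ETA?")
sampled = [
    "Any update on this?",
    "Works for me on 2.3.1",
    "ETA? This is still not fixed for us.",
]
print(count_heat_hits(sampled, phrases))  # 2
```

Plain substring matching is deliberately simple; short phrases such as "ETA?" can also match inside longer words (for example "beta?"), so word‑boundary matching may be worth considering.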

Data & ethics

  • Be respectful. Offer fixes without pressure; avoid piling on maintainers.
  • No harassment. Do not use ranking to rally “demands” or target individuals.
  • Transparency. When posting a PR or comment, provide a minimal repro and a practical fix; ask what’s needed to merge (tests, docs).
  • Scope. Prefer concrete issues over meta/tracking threads.
  • Privacy. The local SQLite file contains public issue data and your computed scores; keep tokens out of it.

Troubleshooting

  • 403 / rate‑limit: The scripts automatically back off. If you hit limits often, lower --limit or run fewer queries per job.
  • 401 / auth: Ensure GITHUB_TOKEN (or GH_TOKEN) is set and valid.
  • Empty results: Start broader, then layer filters. For example, drop no:assignee or reduce comment/reaction thresholds to seed the cache.
  • SQLite locks: On shared volumes, disable concurrent runs or switch to separate DB files per job.
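For readers wiring their own jobs around the scripts, the back‑off behavior mentioned above can be approximated with exponential delays. This is a generic sketch, not the scripts' actual retry logic; `with_backoff` and its parameters are hypothetical, and a fuller version might also honor GitHub's rate‑limit response headers.

```python
import time
from typing import Callable, Tuple


def with_backoff(call: Callable[[], Tuple[int, object]],
                 max_tries: int = 5,
                 base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep):
    """Retry `call` on HTTP 403/429, doubling the delay after each failure."""
    if max_tries < 1:
        raise ValueError("max_tries must be at least 1")
    for attempt in range(max_tries):
        status, body = call()
        if status not in (403, 429):
            return status, body
        if attempt < max_tries - 1:
            sleep(base_delay * (2 ** attempt))
    # Still rate-limited after the final attempt: return the last response.
    return status, body
```

Here `call` is any zero‑argument function returning a `(status_code, body)` pair, which keeps the retry loop independent of the HTTP library in use.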

Roadmap

  • Packaging as a CLI (pipx install git-mog)
  • First‑class label table & exact‑match label filters
  • Repo opt‑out and blocklist support
  • Notifications (Slack/Discord) for new top‑ranked issues
  • Optional PR templates and “ready‑to‑review” checklists
  • Lightweight web UI over the SQLite cache

Contributing

Issues and PRs are welcome. Please keep contributions focused, respectful, and documented. Before submitting a PR, run your change locally against a small query to confirm the DB shape and outputs.


License

MIT. See LICENSE in this repository.


Files

  • git_mog_ops.py — ops CLI: fork, context, plan, pr (stores under .git/mog/)
  • issue_hunter_db.py — SQLite cache + ranking/sampling/export + comment heat TTL

Ready to hunt? Start with fetch, then rank with --per-repo 1 to build a calm, high‑impact worklist.
