Discover what’s trending on GitHub without relying on official APIs. This project extracts GitHub Trending repositories or developers across daily, weekly, and monthly views, helping you spot fast-rising projects and standout engineers with clean, structured output.
Created by Bitbash to showcase our approach to scraping and automation.
This project collects trending data from GitHub’s public Trending pages for either Repositories or Developers, over a selected date range. It solves the problem of manually tracking trends across languages and filters by producing consistent datasets you can store, analyze, and automate.
It’s built for product teams, analysts, recruiters, investors, and developers who want to monitor open-source momentum and identify emerging signals early.
- Extracts either trending repositories or trending developers based on a selected category.
- Supports Daily / Weekly / Monthly trend windows for consistent comparisons.
- Adds filters for code language, spoken language, and sponsor status (developer trends).
- Normalizes results into structured objects for dashboards, alerts, and reporting.
- Designed for repeatable runs with stable output fields for analytics pipelines.
| Feature | Description |
|---|---|
| Repository trend extraction | Captures repository name, description, languages, stars, forks, builders, and stars gained over the selected date range. |
| Developer trend extraction | Captures developer profile details plus their most popular repository and description. |
| Date range modes | Run in daily, weekly, or monthly mode to match how you track trend momentum. |
| Language filtering | Filter trends by coding language to monitor adoption and ecosystem shifts. |
| Spoken language filtering | Narrow results by spoken (natural) language to align with regional tracking. |
| Sponsorable detection | Identify sponsor-enabled developers for creator economy and OSS funding signals. |
| Structured JSON output | Produces clean, consistent objects ready for storage, BI, or automation workflows. |
| Scalable runs | Designed to handle repeated runs reliably for monitoring and historical comparisons. |
| Field Name | Field Description |
|---|---|
| category | Trend category selected: repositories or developers. |
| dateRange | Trend window: daily, weekly, or monthly. |
| language | Coding language filter (if provided). |
| spokenLanguage | Spoken (natural) language filter (if provided). |
| isSponsorable | When developer category is selected, indicates whether sponsor-enabled developers are targeted. |
| name | Repository full name (owner/repo) or developer username (depending on category). |
| description | Repository description (repositories category). |
| description_urls | Any extracted URLs referenced in the description text (repositories category). |
| languages | List of detected programming languages (repositories category). |
| stars | Total star count shown on the trending page (repositories category). |
| forks | Total fork count shown on the trending page (repositories category). |
| builders | List of contributor/profile URLs shown for a repository (repositories category). |
| date_range_stars | Stars gained during the selected date range window (repositories category). |
| url | Developer profile URL (developers category). |
| popular_repo | URL of the developer’s most popular repository (developers category). |
| popular_repo_description | Short description of that popular repository (developers category). |
| collectedAt | ISO timestamp when the data was collected (added for tracking & auditing). |
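
To illustrate how the fields in the table could be kept stable across runs, here is a minimal sketch that fills any missing key with a predictable default. The grouping of fields by category, the `with_stable_keys` helper, and the choice of `None` and `[]` as defaults are assumptions for illustration, not the project's actual exporter.

```python
from typing import Any, Dict

# Field names come from the table above; how they are grouped per category is assumed.
COMMON_FIELDS = ("category", "dateRange", "language", "spokenLanguage", "name", "collectedAt")
REPOSITORY_FIELDS = (
    "description", "description_urls", "languages",
    "stars", "forks", "builders", "date_range_stars",
)
DEVELOPER_FIELDS = ("isSponsorable", "url", "popular_repo", "popular_repo_description")

# Fields that hold lists default to an empty list instead of None.
LIST_FIELDS = {"description_urls", "languages", "builders"}


def with_stable_keys(record: Dict[str, Any], category: str) -> Dict[str, Any]:
    """Return a copy of `record` that always has the same keys for a category."""
    if category == "repositories":
        fields = COMMON_FIELDS + REPOSITORY_FIELDS
    elif category == "developers":
        fields = COMMON_FIELDS + DEVELOPER_FIELDS
    else:
        raise ValueError(f"unknown category: {category!r}")

    stable: Dict[str, Any] = {}
    for field in fields:
        if field in record:
            stable[field] = record[field]
        elif field in LIST_FIELDS:
            stable[field] = []
        else:
            stable[field] = None
    return stable
```

With a fixed key set per category, downstream tables and dashboards do not need to special-case rows where the trending page omitted a value.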
[
  {
    "name": "psf / black",
    "description": "The uncompromising Python code formatter",
    "description_urls": [],
    "languages": [
      "Python"
    ],
    "forks": "2,466",
    "stars": "39,053",
    "builders": [
      "https://github.com/ambv",
      "https://github.com/JelleZijlstra",
      "https://github.com/ichard26",
      "https://github.com/hugovk",
      "https://github.com/apps/dependabot"
    ],
    "date_range_stars": "108"
  },
  {
    "name": "hoshi-hiyouga",
    "url": "https://github.com/hiyouga",
    "popular_repo": "https://github.com/hiyouga/LLaMA-Factory",
    "popular_repo_description": "Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)"
  }
]
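
In the sample above, counts such as `stars` and `forks` are strings with thousands separators, and the repository name contains spaces around the slash. If you want numeric counts and a compact `owner/repo` name for analysis, a small post-processing step along these lines may help. The helper names `parse_count` and `normalize_full_name` are hypothetical, and the sketch assumes counts only ever use the comma-separated format shown in the sample.

```python
from typing import Optional


def parse_count(text: Optional[str]) -> Optional[int]:
    """Convert a display count such as "39,053" to an int.

    Returns None when the value is missing or not a plain number.
    """
    if text is None:
        return None
    digits = text.replace(",", "").strip()
    return int(digits) if digits.isdigit() else None


def normalize_full_name(name: str) -> str:
    """Collapse whitespace around the slash in a repository name."""
    return "/".join(part.strip() for part in name.split("/"))
```

Developer usernames contain no slash, so `normalize_full_name` leaves them unchanged apart from trimming surrounding whitespace.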
github-trending-repositories-developers-scraper/
├── src/
│   ├── main.py
│   ├── cli.py
│   ├── runner/
│   │   ├── run_job.py
│   │   └── validators.py
│   ├── config/
│   │   ├── settings.py
│   │   └── presets.json
│   ├── scraping/
│   │   ├── trending_repositories.py
│   │   ├── trending_developers.py
│   │   ├── parsers.py
│   │   └── selectors.py
│   ├── normalization/
│   │   ├── models.py
│   │   ├── normalizer.py
│   │   └── enrichments.py
│   ├── output/
│   │   ├── exporters.py
│   │   └── schema.json
│   └── utils/
│       ├── http.py
│       ├── timing.py
│       └── logging.py
├── tests/
│   ├── test_repositories.py
│   ├── test_developers.py
│   └── fixtures/
│       ├── repositories_sample.html
│       ├── developers_sample.html
│       └── expected_output.json
├── data/
│   ├── sample_input.json
│   └── sample_output.json
├── scripts/
│   ├── run_local.sh
│   └── export_csv.py
├── .gitignore
├── .env.example
├── pyproject.toml
├── requirements.txt
├── LICENSE
└── README.md
- Recruiters use it to identify trending developers and their standout repos, so they can source candidates aligned with fast-growing technologies.
- Product teams use it to monitor open-source trend shifts by language, so they can validate roadmap decisions with real adoption signals.
- Investors & analysts use it to track breakout repositories week-over-week, so they can spot early momentum in emerging categories like AI tooling.
- Developer advocates use it to discover new projects gaining stars quickly, so they can curate newsletters and community updates with fresh picks.
- Engineering managers use it to benchmark ecosystem movement monthly, so they can prioritize internal experiments and upskilling plans.
Set the category input to repositories to extract trending projects, or to developers to extract trending contributors and their highlighted work. The output schema adapts to the chosen category.
You can safely apply dateRange, language, and spokenLanguage filters. For developer trends, isSponsorable can be used to focus on sponsor-enabled profiles. The exporter preserves stable keys and fills missing fields predictably.
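
As a rough illustration of how these inputs might map onto a Trending page URL, the sketch below validates the category and date range and then assembles the address. The function name `build_trending_url` is hypothetical, and the query parameter names (`since`, `spoken_language_code`, `sponsorable`) and the lowercase language path segment are assumptions based on how GitHub's public Trending URLs commonly appear; check them against the live pages before relying on them.

```python
from typing import Optional
from urllib.parse import quote, urlencode

BASE_URL = "https://github.com/trending"
CATEGORIES = {"repositories", "developers"}
DATE_RANGES = {"daily", "weekly", "monthly"}


def build_trending_url(
    category: str,
    date_range: str = "daily",
    language: Optional[str] = None,
    spoken_language: Optional[str] = None,
    is_sponsorable: bool = False,
) -> str:
    """Assemble a Trending page URL from the scraper's input options."""
    if category not in CATEGORIES:
        raise ValueError(f"category must be one of {sorted(CATEGORIES)}")
    if date_range not in DATE_RANGES:
        raise ValueError(f"dateRange must be one of {sorted(DATE_RANGES)}")

    path = BASE_URL
    if category == "developers":
        path += "/developers"
    if language:
        # Assumed slug format: lowercase, percent-encoded (e.g. "c++" -> "c%2B%2B").
        path += "/" + quote(language.lower(), safe="")

    params = {"since": date_range}
    if spoken_language:
        params["spoken_language_code"] = spoken_language  # assumed parameter name
    if is_sponsorable and category == "developers":
        params["sponsorable"] = "1"  # assumed parameter name; developers only
    return path + "?" + urlencode(params)
```

Rejecting unknown values up front keeps a typo in the input from silently producing an unfiltered page.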
The project is designed for repeated runs: it adds collectedAt timestamps and normalized field structures, so results can be stored and trend movement compared across time windows.
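
One simple way to compare stored runs is to diff the `name` values of two snapshots. The following is a minimal sketch; `diff_snapshots` is a hypothetical helper, and it assumes each snapshot is a list of output records loaded from wherever you store them.

```python
from typing import Dict, List


def diff_snapshots(previous: List[dict], current: List[dict]) -> Dict[str, List[str]]:
    """Report which names entered, dropped off, or stayed on the trending list."""
    prev_names = {record["name"] for record in previous}
    curr_names = {record["name"] for record in current}
    return {
        "entered": sorted(curr_names - prev_names),
        "dropped": sorted(prev_names - curr_names),
        "stayed": sorted(curr_names & prev_names),
    }
```

Pairing each snapshot with its `collectedAt` value lets you order runs and attach a date to each change.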
Trend pages can change layout over time. If parsing errors occur, update selectors in src/scraping/selectors.py and validate with fixture tests under tests/fixtures/. For stability, use retry logic and conservative timeouts in src/utils/http.py.
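
The retry-and-timeout advice could look roughly like the sketch below, which uses only the standard library. It is not the contents of src/utils/http.py; the function names, the default timeout, the User-Agent string, and the exponential backoff schedule are all assumptions for illustration.

```python
import time
import urllib.request
from typing import Callable


def fetch_html(url: str, timeout: float = 15.0) -> str:
    """Fetch a page with an explicit timeout (the 15 s default is an assumption)."""
    request = urllib.request.Request(url, headers={"User-Agent": "trending-scraper-sketch"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def fetch_with_retries(
    url: str,
    attempts: int = 3,
    base_delay: float = 1.0,
    fetch: Callable[[str], str] = fetch_html,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call `fetch`, retrying on network errors with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except OSError:
            # URLError (including HTTPError), timeouts, and connection resets
            # are all OSError subclasses, so every one of them is retried here.
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")
```

The `fetch` and `sleep` parameters are injectable so that fixture-based tests can exercise the retry path without touching the network or waiting on real delays.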
Primary Metric: Average extraction completes in 6–12 seconds per run for a single category with standard filters on a typical broadband connection.
Reliability Metric: 97–99% successful runs across repeated daily monitoring when using retries (2–3 attempts) and stable selectors.
Efficiency Metric: Low resource footprint, typically under 200 MB of RAM per run, with most time spent waiting on page rendering and DOM hydration.
Quality Metric: 95–98% field completeness for repository entries (name/description/stars/forks/languages), with occasional gaps in builders or description URLs depending on what the trending page displays.
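
How field completeness is measured is not specified here. One plausible definition, sketched below, is the share of core fields that are present and non-empty across all repository records. The `field_completeness` helper and its treatment of `None`, empty strings, and empty lists as "missing" are assumptions.

```python
from typing import Iterable, Sequence

# Core repository fields named in the quality metric above.
CORE_FIELDS = ("name", "description", "stars", "forks", "languages")


def field_completeness(records: Iterable[dict], fields: Sequence[str] = CORE_FIELDS) -> float:
    """Return the percentage of (record, field) pairs that hold a non-empty value."""
    total = 0
    filled = 0
    for record in records:
        for field in fields:
            total += 1
            if record.get(field) not in (None, "", []):
                filled += 1
    return 100.0 * filled / total if total else 0.0
```

Tracking this number per run gives an early signal that a selector has stopped matching after a page layout change.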
