MARKET RESEARCH
Hugging Face dataset-card monitor to Notion research tracker
Watches the Hugging Face Hub on a schedule for newly published or updated dataset cards matching your research domain.
How it runs
The automated pipeline, trigger to output.
- TriggerDaily schedule fires
- ActionList recent datasets by domain tagsHugging Face
- LogicDrop already-tracked and incomplete cards
- ActionRead dataset card for license, size, modalityHugging Face
- OutputCreate Notion tracker rowNotion
What it does
Keeps a living Notion table of every new dataset card on the Hugging Face Hub that fits a research domain you define (for example "clinical NLP" or "satellite imagery"). Each run pulls fresh listings, filters by your keyword and modality rules, and writes a clean row per dataset so your team has one canonical, searchable backlog instead of scattered Hub bookmarks.
When to use it
Use it when a research or data team needs to stay current on relevant open datasets without anyone manually browsing the Hub. Ideal for literature-review prep, benchmark sourcing, or maintaining a curated dataset inventory.
How it works
- 1A daily schedule fires the workflow.
- 2The Hugging Face step lists datasets sorted by last-modified, filtered to your domain tags and search terms.
- 3A logic step drops anything already seen (matched against existing Notion rows) and skips cards missing a license or with zero downloads.
- 4For each new match, an action reads the dataset card to extract size, modality, and license.
- 5The final output creates a Notion database row with name, link, license, modality, and first-seen date.
Set it up
What you configure once, before turning it on.
- 1Connect Hugging FaceModels, datasets, spaces — the open-source hub.
- 2Connect NotionPages, databases, comments.
- 3Set each agent's modelWe leave models unset so you pick the tier — fast + cheap, or top-quality.
- 4Tune it to your dataEdit the prompts, filters, and field mappings so it matches how your team works.
- 5Test, then turn it onRun once against a sample, confirm the output, then enable the trigger.
More Market Research workflows
Enrich Inbound Accounts with BigQuery Firmographics and Score Fit
When a new account row lands in Airtable, joins it against BigQuery public business datasets to attach firmographic attributes.
Blend BigQuery TAM with Live Competitor Signals into a Notion Brief
On demand, sizes a chosen segment from BigQuery public data, gathers current competitor signals via Brave Search, and synthesizes a one-page market brief into Notion.
Allocate Sales Territory TAM from BigQuery Geo Data to HubSpot
When triggered by a webhook, queries BigQuery public ZIP-level business data to compute TAM per sales territory.
Hiring Surge Detector with Slack Alert
Detects when a target account's open-role count jumps above its recent baseline and posts a ranked Slack alert to the GTM channel so reps can act on a company that is clearly…
Tech-Stack Shift Inference from Job Descriptions
Reads new job descriptions for target accounts, uses an LLM to extract named technologies and infer stack changes.
Weekly Hiring-Intel Briefing for GTM
An agent reviews the week's accumulated hiring signals across all target accounts, writes a narrative briefing that infers each account's likely initiatives.
Run it inside a business
This workflow drops into a full company template. Import the org, and this is one of the playbooks its agents run.

Run this workflow in your colony.
14-day trial. No DevOps. No Sales call. Provisioned in under a minute.
