MARKET RESEARCH
Hugging Face dataset license-gate and review router
Monitors new domain datasets on Hugging Face and routes each by license: permissive ones auto-log to a cleared Notion list.
How it runs
The automated pipeline, trigger to output.
- TriggerSchedule fires
- ActionList new domain datasetsHugging Face
- ActionRead license metadata from cardHugging Face
- LogicClassify license: permissive / restrictive / missing
- OutputLog permissive datasets to Notion approved listNotion
- OutputOpen Linear review ticket for the restLinear
What it does
Applies a compliance gate to incoming Hugging Face datasets before anyone in your org uses them. It inspects each new domain-matching dataset's license and splits the flow: commercially safe licenses land in an approved Notion list, while ambiguous, non-commercial, or unlicensed datasets become a Linear ticket so legal or a data lead can review before use.
When to use it
Use it when dataset licensing matters for shipping models commercially and you need a defensible audit trail showing which datasets were cleared versus held for review.
How it works
- 1A schedule triggers the run.
- 2The Hugging Face step lists new datasets matching your research domain.
- 3An action reads each card's license metadata.
- 4A logic branch classifies the license as permissive, restrictive, or missing.
- 5Permissive datasets output to an approved Notion list with the license tag.
- 6Restrictive or missing-license datasets output as a Linear review ticket with the card link and license details.
Set it up
What you configure once, before turning it on.
- 1Connect Hugging FaceModels, datasets, spaces — the open-source hub.
- 2Connect NotionPages, databases, comments.
- 3Connect LinearIssues, projects, cycles, triage.
- 4Set each agent's modelWe leave models unset so you pick the tier — fast + cheap, or top-quality.
- 5Tune it to your dataEdit the prompts, filters, and field mappings so it matches how your team works.
- 6Test, then turn it onRun once against a sample, confirm the output, then enable the trigger.
More Market Research workflows
Enrich Inbound Accounts with BigQuery Firmographics and Score Fit
When a new account row lands in Airtable, joins it against BigQuery public business datasets to attach firmographic attributes.
Blend BigQuery TAM with Live Competitor Signals into a Notion Brief
On demand, sizes a chosen segment from BigQuery public data, gathers current competitor signals via Brave Search, and synthesizes a one-page market brief into Notion.
Allocate Sales Territory TAM from BigQuery Geo Data to HubSpot
When triggered by a webhook, queries BigQuery public ZIP-level business data to compute TAM per sales territory.
Hiring Surge Detector with Slack Alert
Detects when a target account's open-role count jumps above its recent baseline and posts a ranked Slack alert to the GTM channel so reps can act on a company that is clearly…
Tech-Stack Shift Inference from Job Descriptions
Reads new job descriptions for target accounts, uses an LLM to extract named technologies and infer stack changes.
Weekly Hiring-Intel Briefing for GTM
An agent reviews the week's accumulated hiring signals across all target accounts, writes a narrative briefing that infers each account's likely initiatives.
Run it inside a business
This workflow drops into a full company template. Import the org, and this is one of the playbooks its agents run.

Run this workflow in your colony.
14-day trial. No DevOps. No Sales call. Provisioned in under a minute.
