Paper-to-Dataset Crosswalk Brief

Finds new arXiv-style papers in your vertical via Exa, then matches each to related datasets on Hugging Face and writes a Notion brief linking each paper to the data needed…

Category: Market Research

Engine: sim

Difficulty: intermediate

Trigger: schedule

Steps: 6

Setup: ~15 min

How it runs

The automated pipeline, trigger to output.

Trigger: Weekly schedule
Action: Find recent papers in vertical via Exa (Exa)
Action: Extract task and data requirements (OpenAI)
Action: Match each paper to Hugging Face datasets (Hugging Face)
Logic: Keep papers with a strong dataset match
Output: Write crosswalk brief to Notion (Notion)
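
As a rough illustration of how these stages chain together, here is a minimal Python sketch. It is not the product's implementation: the `Paper` class, `run_pipeline`, and the four injected callables are hypothetical names chosen for this example, and the "strong match" check is reduced to "at least one dataset returned".

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class Paper:
    title: str
    url: str
    requirements: dict = field(default_factory=dict)  # filled by the extract stage
    datasets: list = field(default_factory=list)      # filled by the match stage


def run_pipeline(
    find_papers: Callable[[], Iterable[Paper]],   # stands in for the Exa search
    extract: Callable[[Paper], dict],             # stands in for the LLM step
    match: Callable[[dict], list],                # stands in for the Hugging Face search
    write_brief: Callable[[list], None],          # stands in for the Notion write
) -> list:
    """Run one scheduled pass: find, extract, match, filter, write."""
    kept = []
    for paper in find_papers():
        paper.requirements = extract(paper)
        paper.datasets = match(paper.requirements)
        if paper.datasets:  # simplified stand-in for the logic step
            kept.append(paper)
    write_brief(kept)
    return kept
```

Passing the stages in as callables keeps the sketch runnable with stubs, so the flow can be exercised before any integration is connected.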

What it does

Bridges the gap between what researchers are publishing and what data exists to act on it. The workflow surfaces recent papers in your vertical using Exa's neural search. For each notable paper, it searches Hugging Face for datasets that match the paper's domain and task, then assembles a Notion brief that crosswalks paper → candidate datasets → reproduction notes.
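
To make the crosswalk concrete, the sketch below builds one row of paper → candidate datasets → reproduction notes as a Notion-style properties payload. The property names ("Paper", "Link", "Datasets", "Reproduction notes") and the `crosswalk_row` helper are assumptions for illustration; the actual Notion database schema is whatever you configure.

```python
def crosswalk_row(paper_title: str, paper_url: str, dataset_ids: list, notes: str) -> dict:
    """Build a properties dict for one paper, shaped like Notion property values."""
    # Hugging Face dataset pages live under /datasets/<id>.
    dataset_links = "\n".join(
        f"https://huggingface.co/datasets/{dataset_id}" for dataset_id in dataset_ids
    )
    return {
        "Paper": {"title": [{"text": {"content": paper_title}}]},
        "Link": {"url": paper_url},
        "Datasets": {"rich_text": [{"text": {"content": dataset_links}}]},
        "Reproduction notes": {"rich_text": [{"text": {"content": notes}}]},
    }
```

The returned dict would be passed as the properties of a new page in the target database; the column names must match the database you create.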

When to use it

When your team reads papers and immediately asks "can we try this?" This turns that instinct into a standing artifact: every new paper arrives pre-paired with the datasets you'd need to replicate or build on it.

How it works

1. A weekly cron triggers the run.
2. Exa retrieves recent high-signal papers matching the vertical's topic queries.
3. An LLM extracts each paper's task, domain, and data requirements.
4. Hugging Face is searched for datasets matching those requirements.
5. A logic step keeps only papers with at least one strong dataset match.
6. A structured crosswalk page is written to Notion, one row per paper with linked datasets and notes.
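
The page does not define what makes a dataset match "strong" in step 5. One possible reading, sketched here purely as an assumption, is a token-overlap (Jaccard) score between the extracted requirements and each dataset's description, with a tunable threshold. The function names, the 0.3 threshold, and the dataset IDs in the usage note are all hypothetical.

```python
import re


def tokens(text: str) -> set:
    """Lowercase alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def match_score(requirements: str, dataset_text: str) -> float:
    """Jaccard overlap between requirement tokens and dataset description tokens."""
    a, b = tokens(requirements), tokens(dataset_text)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def strong_matches(requirements: str, datasets: list, threshold: float = 0.3) -> list:
    """Return (score, dataset_id) pairs at or above the threshold, best first."""
    scored = [(match_score(requirements, d["description"]), d["id"]) for d in datasets]
    return sorted([pair for pair in scored if pair[0] >= threshold], reverse=True)


def keep_paper(requirements: str, datasets: list, threshold: float = 0.3) -> bool:
    """Step 5: keep a paper only if at least one dataset is a strong match."""
    return bool(strong_matches(requirements, datasets, threshold))
```

For example, requirements of "sentiment classification reviews" against a made-up dataset described as "movie reviews sentiment classification" share three of four distinct tokens, so the score is 0.75 and the paper is kept. A production version would more likely use embeddings than raw token overlap.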

Set it up

What you configure once, before turning it on.

1. Connect Exa: neural search across the web.
2. Connect Hugging Face: models, datasets, and spaces on the open-source hub.
3. Connect OpenAI: models, embeddings, files.
4. Connect Notion: pages, databases, comments.
5. Set each agent's model: we leave models unset so you pick the tier, either fast and cheap or top quality.
6. Tune it to your data: edit the prompts, filters, and field mappings so it matches how your team works.
7. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.
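
The setup checklist above can be thought of as a readiness check before the trigger is enabled. The sketch below is a hypothetical illustration of that idea, not the product's configuration format: `WorkflowConfig`, its fields, and the example cron expression `0 9 * * 1` (Mondays at 09:00 in standard five-field cron) are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional

REQUIRED_CONNECTIONS = ("exa", "huggingface", "openai", "notion")


@dataclass
class WorkflowConfig:
    connected: set = field(default_factory=set)  # integrations already connected
    agent_model: Optional[str] = None            # left unset until you pick a tier
    cron: str = "0 9 * * 1"                      # assumed weekly schedule
    sample_run_ok: bool = False                  # set after a confirmed test run


def readiness_issues(cfg: WorkflowConfig) -> list:
    """List what still blocks enabling the trigger; empty means ready."""
    issues = [f"connect {name}" for name in REQUIRED_CONNECTIONS if name not in cfg.connected]
    if not cfg.agent_model:
        issues.append("set the agent model")
    if len(cfg.cron.split()) != 5:
        issues.append("cron must have 5 fields")
    if not cfg.sample_run_ok:
        issues.append("run once against a sample")
    return issues
```

An empty list from `readiness_issues` corresponds to having completed all seven setup steps.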
