MIT Researchers Are Using n8n to Automate Scientific Literature Reviews. Academia Has Not Caught Up Yet.

Someone linked me to the MIT preprint in March, not because they thought I’d find the research interesting, but because one of the figures in the paper was a screenshot of an n8n workflow. A proper one, with HTTP Request nodes and a Code node and a Split In Batches node handling the pagination, and I recognised it the way you recognise a neighbourhood you’ve lived in before. The specific shape of their solution to the arXiv API rate-limiting problem was the same shape I had arrived at eight months earlier for a completely different client with completely different research goals.

They were doing what I’d been doing since early last year: pulling papers from academic APIs, passing abstracts to an LLM for relevance scoring, filtering by a threshold in an IF node, writing the survivors to a structured output. The difference was they’d written four thousand words about the methodology. I’d written a three-paragraph README that nobody read.
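The threshold step is small enough to sketch in plain JavaScript. This is an illustration of what the IF node is doing, not the workflow from the preprint: the `relevanceScore` field, the 0 to 1 scale, and the default threshold of 0.7 are all assumptions made up for the example.

```javascript
// Hypothetical input: each paper carries a relevanceScore that an earlier
// LLM step attached. The field name and the 0-1 scale are assumptions.
function filterByRelevance(papers, threshold = 0.7) {
  const kept = [];
  const dropped = [];
  for (const paper of papers) {
    const raw = paper.relevanceScore;
    // Accept numeric strings such as "0.8", since LLM output is often text.
    const score = typeof raw === "string" && raw.trim() !== "" ? Number(raw) : raw;
    if (typeof score !== "number" || !Number.isFinite(score)) {
      // A missing or malformed score is recorded as a failure, not treated
      // as zero, so a bad LLM reply stays visible in the output.
      dropped.push({ ...paper, reason: "invalid score" });
    } else if (score >= threshold) {
      kept.push(paper);
    } else {
      dropped.push({ ...paper, reason: "below threshold" });
    }
  }
  return { kept, dropped };
}
```

Keeping the dropped papers with a reason attached is a design choice rather than a requirement. It makes it possible to tell a paper that scored low from a paper the model never scored properly.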

The pipeline itself is not complicated once you’ve done it. The hard part, which the paper described in two sentences before glossing over the rest, is that Semantic Scholar’s bulk API and its standard search API have different rate limits, different response structures, and different authentication requirements. The n8n documentation for the HTTP Request node doesn’t distinguish between them because it can’t: it’s a general-purpose node. What the docs show is a generic request with headers and a response body. What you actually get from Semantic Scholar at scale is a nested response where the papers are three levels deep in a field called data, inside a field that might be called elements depending on which endpoint you hit, and the first time you try to access it with an expression like {{ $json.papers }} you get undefined and spend forty minutes reading API docs before you find the correct path.

The fix is a Set node that extracts the nested field before anything else touches it. Not complicated. Not documented anywhere for this specific case. The kind of thing that lives as institutional knowledge in the automation community and doesn’t exist in academic papers, because academic papers describe what worked, not the forty minutes before it worked.
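The same extraction can be done in a Code node when the path varies between endpoints. What follows is a minimal sketch, not the exact fix described above: the candidate paths are placeholders invented for illustration, and the real ones have to come from inspecting the response of whichever endpoint is being called.

```javascript
// Walk a dot-separated path such as "data.elements" through an object,
// returning undefined instead of throwing when a level is missing.
function getPath(obj, path) {
  return path.split(".").reduce(
    (node, key) => (node !== null && typeof node === "object" ? node[key] : undefined),
    obj
  );
}

// ASSUMPTION: these paths are illustrative only. Replace them with the
// paths actually observed in the API responses you receive.
const CANDIDATE_PATHS = ["data.elements.papers", "data.papers", "data", "papers"];

// Return the first candidate path that holds an array of papers.
function extractPapers(response, paths = CANDIDATE_PATHS) {
  for (const path of paths) {
    const value = getPath(response, path);
    if (Array.isArray(value)) return value;
  }
  // Fail loudly: no match usually means the response shape has changed.
  const keys = Object.keys(response ?? {}).join(", ");
  throw new Error(`No paper array found; top-level keys: ${keys}`);
}
```

Inside an n8n Code node, the result would typically be handed on as `extractPapers($input.first().json).map((paper) => ({ json: paper }))`, so that each paper becomes its own item for the nodes downstream. Throwing when nothing matches is deliberate: an `undefined` that quietly flows onward is how the forty minutes get lost.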

This is the gap that academia hasn’t caught up to, and it is not a technical gap. The technical part, pulling papers and scoring them and filtering them, is not hard. n8n can do it with a handful of nodes, and it does it better than LangChain for this specific use case, because LangChain’s document loaders and retrieval chains are designed for a different problem and you spend more time fighting the abstraction than using it. I built a literature screening workflow using LangChain’s document loader for a client who asked for it specifically, and I spent three days getting the chunking and embedding pipeline right for something that ended up being a more complicated version of what I could have done with an HTTP Request node, a Code node with twenty lines of JavaScript, and a Supabase node writing to a vector column. The LangChain version was more impressive to explain in a meeting. The n8n version worked at three in the morning without me.


What MIT found, and what is going to be true for more research groups as this spreads, is that n8n handles the operational layer that academic software almost never handles correctly: retries on failed API calls, error branching when a paper’s abstract comes back empty or malformed, execution logging that shows exactly which paper caused a problem and why, and the ability to pause a run at two thousand papers and restart it without rebuilding state from scratch. None of that appears in the methodology section. All of it determines whether the pipeline runs reliably for six months or gets abandoned after the third production failure.
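n8n provides retry and error-branch behaviour through node settings, so none of it has to be written by hand. For readers who want to see what that operational layer amounts to, here is a rough sketch of two of the pieces in plain JavaScript. The function names, the delay values, and the `id` and `abstract` field names are assumptions chosen for the example, not details taken from the preprint.

```javascript
// Delay before retry number `attempt` (0-based): base, 2x base, 4x base,
// and so on, capped at maxMs.
function backoffDelayMs(attempt, baseMs = 500, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Call `task` until it succeeds or `retries` extra attempts are used up,
// then rethrow the last error so the failure is not swallowed.
async function withRetry(task, { retries = 3, baseMs = 500 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await task(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < retries) await sleep(backoffDelayMs(attempt, baseMs));
    }
  }
  throw lastError;
}

// Error branch: set aside papers whose abstract is missing or blank so they
// are logged with their id instead of being sent to the LLM step.
function splitByAbstract(papers) {
  const usable = [];
  const rejected = [];
  for (const paper of papers) {
    const abstract = typeof paper.abstract === "string" ? paper.abstract.trim() : "";
    if (abstract.length > 0) usable.push(paper);
    else rejected.push({ id: paper.id ?? null, reason: "empty or malformed abstract" });
  }
  return { usable, rejected };
}
```

A call such as `withRetry(() => fetchPage(offset), { retries: 3 })` would wrap a single paginated request, where `fetchPage` is a hypothetical function standing in for the HTTP call. The rejected list is what makes it possible to say which paper caused a problem and why.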

CrewAI has come up in a few conversations about this kind of pipeline, usually from researchers who saw a demo and thought the multi-agent framing mapped onto their review process. It doesn’t. Not yet. The demos show agents coordinating cleanly. The production reality is agents that get into argument loops over classification decisions, burn through API credits on meta-reasoning about the task instead of doing it, and fail in ways that are genuinely hard to debug because the failure is in the reasoning chain rather than in a node output you can inspect. I’ve tried to build client-facing pipelines with CrewAI twice. Both times I ended up replacing the agent coordination layer with a straightforward n8n IF node and a clear conditional path, and the result was faster, cheaper, and easier to explain when something went wrong.

The MIT preprint is good work. The literature review pipeline they built would have taken me a few hours to replicate in n8n. That is not a criticism of their research. It is a statement about how far practical automation tooling has come in the last two years relative to how slowly academic methodologies adopt it. The research community is still describing what LangChain can theoretically do, while the automation community is figuring out what n8n can reliably do in production.

Those two things are moving at different speeds. The gap between them is where the interesting work is going to happen.

Olaitan Oladipo

Olaitan Oladipo holds a BSc in Sociology from Olabisi Onabanjo University. He is a self-taught automation builder who has spent years inside n8n doing the work that most tutorials skip: debugging OAuth errors at 2am, migrating client automations from Make.com mid-project, fighting reverse proxy misconfigurations on AWS EC2, and figuring out through trial and error what actually holds up in production versus what only looks clean in a demo.

He is not a developer by training and not a SaaS founder. He is the person in the Discord server who actually answers the question instead of linking to the docs.

His writing on n8n Automation Tutorial covers self-hosting, AI agent workflows, tool comparisons, and the security vulnerabilities the automation industry would rather not discuss. He has built AI-assisted invoice approval flows using OpenAI function calling, connected Claude via HTTP Request nodes, and holds considered opinions about Zapier, Make.com, LangChain, and CrewAI that their marketing teams would not appreciate.

He writes for people who are technical enough to follow a tutorial but experienced enough to want the honest version.
