How to Build an AI‑Powered Literature Review Workflow with Free Open‑Source Tools

A good literature review used to feel like digging through a mountain of PDFs with a tiny flashlight. Today, a handful of open‑source tools can turn that flashlight into a searchlight, letting you see the whole landscape at once. In this post I’ll walk you through a practical, zero‑cost workflow that lets you collect, organize, and synthesize papers faster—so you can spend more time thinking, not scrolling.

Why a Structured Workflow Matters

When I was a PhD student, I spent weeks manually tagging PDFs, copying citations into a Word table, and trying to remember which article answered which research question. The result? Missed connections, duplicated effort, and a final draft that felt like a patchwork quilt. A repeatable workflow solves three problems:

Time waste – you stop re‑reading the same abstract.
Bias – you search systematically instead of relying on whatever pops up in a quick Google search.
Reproducibility – anyone can follow your steps and verify your sources.

All of this can be achieved with free, actively maintained tools, most of them open source.

The Building Blocks

Below is the toolbox I rely on at AI Scholar Hub. Feel free to swap in alternatives that suit your taste.

Tool	What It Does	Why It’s Free
Zotero	Reference manager, PDF storage, metadata extraction	Open‑source, cross‑platform, browser plug‑in
Obsidian (core)	Plain‑text knowledge base, linking notes	Free for personal use, markdown files
Semantic Scholar API	Bulk search, citation counts, abstracts	Free public API, no auth for basic queries
ChatGPT‑like LLM (local) – e.g., LLaMA‑2‑7B via Ollama	Summarize papers, extract key ideas	Runs locally, no API cost
Python + pandas	Data cleaning, CSV handling	Open source; pandas installs with pip

If you prefer a different reference manager, JabRef (open source) or Mendeley (free but proprietary) can replace Zotero, but I find Zotero’s web‑import and tag system easiest for rapid iteration.

Step 1: Harvest Papers with the Semantic Scholar API

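Start by defining a clear search query, for example “graph neural networks for drug discovery”. The paper search endpoint returns at most 100 results per request, so the script below pulls the first 200 in two pages using the offset parameter. Unauthenticated requests share a rate limit, so expect an occasional HTTP 429 and retry after a pause.

import requests
import pandas as pd

query = "graph neural networks for drug discovery"
url = "https://api.semanticscholar.org/graph/v1/paper/search"
fields = "title,authors,year,abstract,url"

records = []
for offset in (0, 100):  # two pages of 100 results each
    # Passing params lets requests URL-encode the spaces in the query
    params = {"query": query, "offset": offset, "limit": 100, "fields": fields}
    response = requests.get(url, params=params, timeout=30)
    response.raise_for_status()  # fail loudly on rate limiting or bad requests
    records.extend(response.json().get("data", []))

df = pd.DataFrame(records)
df.to_csv("search_results.csv", index=False)
print("Saved", len(df), "records")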

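The CSV contains titles, authors, years, abstracts, and a link to each paper’s Semantic Scholar page. Add openAccessPdf to the fields list if you also want a direct PDF link when one exists. This is your raw material, with no manual copy‑pasting required.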
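The raw records benefit from a little tidying before they go anywhere else: authors arrives as a list of objects, and some records have no abstract at all. Here is a minimal sketch of that cleanup, assuming the record shape returned by the search call above. The helper name clean_records is my own, not part of any library.

```python
def clean_records(records):
    """Flatten author lists and drop records without an abstract.

    `records` is assumed to be the list of dicts found under the
    "data" key of a Semantic Scholar search response.
    """
    cleaned = []
    for rec in records:
        abstract = rec.get("abstract")
        if not abstract:  # the API returns null when no abstract is available
            continue
        names = [a.get("name", "") for a in rec.get("authors") or []]
        cleaned.append({
            "title": " ".join((rec.get("title") or "").split()),
            "authors": "; ".join(n for n in names if n),
            "year": rec.get("year"),
            "abstract": " ".join(abstract.split()),  # collapse newlines and repeated spaces
            "url": rec.get("url"),
        })
    return cleaned


# Made-up sample records, for illustration only
sample = [
    {"title": "GNNs for  Molecules",
     "authors": [{"authorId": "1", "name": "A. Author"},
                 {"authorId": "2", "name": "B. Author"}],
     "year": 2023, "abstract": "Line one.\nLine two.", "url": "https://example.org/p1"},
    {"title": "No abstract here", "authors": [], "year": 2021,
     "abstract": None, "url": None},
]
print(clean_records(sample))
```

Building the DataFrame from the cleaned list instead of the raw one keeps the authors column readable in the CSV.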

Step 2: Pull PDFs into Zotero

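Zotero’s “Add Item by Identifier” feature accepts DOIs, ISBNs, PMIDs, and arXiv IDs, but it is a manual step. To automate the import, use the zotero-cli tool (a small Node script) to read the CSV and add each entry to a dedicated collection called AI‑Lit‑Review. Several unrelated packages share that name, so treat the commands below as the shape of the call and confirm the subcommand and flags against your installed version’s help output.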

npm install -g zotero-cli
zotero-cli import search_results.csv --collection "AI-Lit-Review"

Zotero will fetch metadata, download PDFs when possible, and create a tidy library. Tag each entry with the year and a short keyword (e.g., 2023, gnn). This tagging will later help you filter inside Obsidian.
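If the command-line import does not fit your setup, one fallback is to convert the CSV to RIS, a plain-text format that Zotero reads through File → Import. The sketch below assumes rows read from search_results.csv with csv.DictReader, and it assumes every item is a journal article, which will not hold for every result. The function name row_to_ris is hypothetical.

```python
import ast


def row_to_ris(row):
    """Render one search-result row (a dict of strings) as an RIS record.

    Assumption: every item is typed as a journal article (TY  - JOUR).
    """
    authors = row.get("authors") or []
    if isinstance(authors, str):
        # pandas writes the author list to CSV as its Python repr
        authors = ast.literal_eval(authors)

    lines = ["TY  - JOUR", f"TI  - {row['title']}"]
    lines += [f"AU  - {a['name']}" for a in authors]
    if row.get("year"):
        # Years may be stored as "2023.0" when the column contained gaps
        lines.append(f"PY  - {int(float(row['year']))}")
    if row.get("abstract"):
        lines.append("AB  - " + " ".join(str(row["abstract"]).split()))
    if row.get("url"):
        lines.append(f"UR  - {row['url']}")
    lines.append("ER  - ")  # end-of-record marker
    return "\n".join(lines) + "\n"


# Made-up row, for illustration only
example = {
    "title": "Graph Convolution for Molecules",
    "authors": "[{'authorId': '1', 'name': 'A. Author'}]",
    "year": "2023.0",
    "abstract": "We study\nmolecular graphs.",
    "url": "https://example.org/paper",
}
print(row_to_ris(example))
```

Writing one record per CSV row into a single .ris file gives you something to import in one step, and Zotero lets you pick the target collection during import.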

Step 3: Create a Markdown Index in Obsidian

Export the Zotero library to a CSV (File → Export Library). Then run a simple script that turns each row into a markdown note:

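import csv, pathlib, re

out_dir = pathlib.Path("obsidian/lit_review")
out_dir.mkdir(parents=True, exist_ok=True)

# utf-8-sig drops the byte-order mark if the export starts with one
with open("zotero_export.csv", newline="", encoding="utf-8-sig") as f:
    reader = csv.DictReader(f)
    for row in reader:
        # Titles often contain ":" or "/", which are not safe in file names
        slug = re.sub(r"[^\w-]+", "_", row["Title"]).strip("_")[:50]
        note_path = out_dir / f"{slug}.md"
        with open(note_path, "w", encoding="utf-8") as note:
            # Column names follow Zotero's CSV export headers;
            # check the header row of your export if you hit a KeyError
            note.write(f"# {row['Title']}\n")
            note.write(f"**Authors:** {row['Author']}\n")
            note.write(f"**Year:** {row['Publication Year']}\n")
            note.write(f"**Link:** {row['Url']}\n\n")
            note.write(f"> {row['Abstract Note']}\n")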

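Now you have a folder of plain‑text notes that Obsidian can index. The power of Obsidian lies in its backlinks: start a “Research Questions” note and link to any paper note by typing its file name in double brackets, e.g. [[Graph_Convolution_for_Molecules]]. Obsidian resolves links by file name, so use the underscore slug the script generated; writing [[slug|Paper Title]] shows a readable label instead.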
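Because the note files are named with underscore slugs rather than full titles, typing links by hand gets tedious. As a rough sketch, the helper below (build_index is a name I made up) scans a folder of notes and produces ready-to-paste links that pair each file name with the title from its first heading.

```python
import pathlib
import tempfile


def build_index(note_dir):
    """Return one '- [[file name|title]]' bullet per note in note_dir.

    The title is read from the first '# ' heading in each note; notes
    without one fall back to the bare file name.
    """
    bullets = []
    for path in sorted(pathlib.Path(note_dir).glob("*.md")):
        title = None
        for line in path.read_text(encoding="utf-8").splitlines():
            if line.startswith("# "):
                title = line[2:].strip()
                break
        link = f"[[{path.stem}|{title}]]" if title else f"[[{path.stem}]]"
        bullets.append(f"- {link}")
    return bullets


# Demo on a throwaway folder so the sketch runs anywhere
with tempfile.TemporaryDirectory() as tmp:
    note = pathlib.Path(tmp) / "Graph_Convolution_for_Molecules.md"
    note.write_text("# Graph Convolution for Molecules\n**Year:** 2023\n", encoding="utf-8")
    demo_index = build_index(tmp)
    print("\n".join(demo_index))
```

Point it at obsidian/lit_review and paste the output into a scratch note; from there you can move each bullet under the research question it belongs to.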

Step 4: Summarize with a Local LLM

Reading every abstract is still a chore. Run a local LLM to generate a one‑sentence summary for each note. With Ollama installed, the command looks like this:

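for f in obsidian/lit_review/*.md; do
  # Skip notes that already have a summary so the loop is safe to re-run
  grep -q '^\*\*Summary:\*\*' "$f" && continue
  content=$(cat "$f")
  # printf expands \n; inside plain double quotes bash would pass it literally
  prompt=$(printf 'Summarize the following abstract in one sentence:\n\n%s' "$content")
  summary=$(ollama run llama2:7b "$prompt")
  printf '\n**Summary:** %s\n' "$summary" >> "$f"
done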

Because the model runs on your own machine, there are no API fees and no data leaves your laptop—important for sensitive research topics.
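If you would rather stay in Python, Ollama also serves a local HTTP API, by default at http://localhost:11434. The sketch below sends only the abstract (the blockquote lines of a note) rather than the whole file. The helper names are my own, and the model tag is assumed to match whatever you have pulled locally.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def extract_abstract(note_text):
    """Return the text of the blockquote lines ('> ...') in a note."""
    quoted = [line[1:].strip() for line in note_text.splitlines() if line.startswith(">")]
    return " ".join(part for part in quoted if part)


def summarize(abstract, model="llama2:7b"):
    """Ask the local Ollama server for a one-sentence summary."""
    payload = {
        "model": model,
        "prompt": f"Summarize the following abstract in one sentence:\n\n{abstract}",
        "stream": False,  # return one JSON object instead of a token stream
    }
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.loads(response.read())["response"].strip()


# Made-up note, for illustration only; summarize() needs a running Ollama server
note = "# Some Paper\n**Year:** 2023\n\n> We propose a message-passing network.\n"
print(extract_abstract(note))
```

Calling summarize(extract_abstract(text)) for each note does the same job as the shell loop, and keeping the title and author lines out of the prompt gives a small model less to get distracted by.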

Step 5: Build a Concept Map

Open the “Research Questions” note in Obsidian. Write each question as a heading, then link to the papers that address it. Use tags like #method, #dataset, #result to categorize. Obsidian’s graph view will instantly draw a network of connections, letting you spot clusters or gaps.

For example:

## How do GNNs encode molecular graphs?
- [[Paper A – Graph Convolution for Molecules]]
- [[Paper B – Message Passing Networks in Chemistry]]
- [[Paper C – Attention‑Based GNNs for Drug Discovery]]

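With the Page preview core plugin enabled, hovering over a link (hold Ctrl or Cmd while editing) opens a preview of the note, so the title, year, and abstract are visible without leaving the page; scroll the preview to reach the one‑sentence summary appended at the end. This visual map replaces the endless scrolling of Google Scholar.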
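The graph view shows gaps visually, but you can also list them. As a small sketch, the function below (links_by_question is a hypothetical name) reads the text of a “Research Questions” note and maps each heading to the papers linked beneath it, so questions with no papers stand out.

```python
import re

# Capture the link target, ignoring any |label or #heading suffix
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def links_by_question(markdown_text):
    """Map each '## ' heading to the list of wikilink targets beneath it."""
    mapping = {}
    current = None
    for line in markdown_text.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            mapping[current] = []
        elif current is not None:
            mapping[current].extend(t.strip() for t in WIKILINK.findall(line))
    return mapping


# Made-up note content, for illustration only
questions = """\
## How do GNNs encode molecular graphs?
- [[Paper A]]
- [[Paper B|Message passing]]

## Which datasets are standard?
"""
coverage = links_by_question(questions)
gaps = [q for q, papers in coverage.items() if not papers]
print(gaps)
```

Any heading that lands in gaps is a question your current library does not address yet, which is a good prompt for another search.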

Step 6: Export a Bibliography for Your Manuscript

When it’s time to write, Zotero can output a citation file in BibTeX or RIS format. In the “AI‑Lit‑Review” collection, select all items, right‑click → Export Items…, and choose BibTeX. Drop the .bib file into your LaTeX project or reference manager. Metadata pulled automatically is usually cleaner than hand‑typed entries, but spot‑check author names, years, and venues before you submit.
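A quick scan of the exported file catches the most common metadata gaps before they reach your manuscript. This is a rough line-based check, not a real BibTeX parser: it assumes each entry starts with @type{key, on its own line and each field with name = at the start of a line. The function name and the citation keys in the sample are invented for illustration.

```python
import re


def entries_missing(bibtex_text, required=("title", "author", "year")):
    """Return {citation_key: [missing field names]} for incomplete entries."""
    problems = {}
    # Split just before every '@type{' that begins a line
    for chunk in re.split(r"(?m)^(?=@\w+\{)", bibtex_text):
        header = re.match(r"@\w+\{([^,\s]+),", chunk)
        if not header:
            continue
        present = {name.lower() for name in re.findall(r"(?m)^\s*(\w+)\s*=", chunk)}
        missing = [field for field in required if field not in present]
        if missing:
            problems[header.group(1)] = missing
    return problems


# Made-up entries, for illustration only
sample_bib = """\
@article{smith_gnn_2023,
  title = {Graph Convolution for Molecules},
  author = {Smith, Jane},
  year = {2023},
}

@article{doe_mpnn_nodate,
  title = {Message Passing Networks in Chemistry},
  author = {Doe, John},
}
"""
report = entries_missing(sample_bib)
print(report)
```

Fix flagged entries in Zotero itself and re-export, so the library stays the single source of truth.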

Tips for Keeping the Workflow Smooth

Batch updates – Run the API script every month to capture new papers. Zotero does not skip duplicates on import, so filter out papers you already have before importing, or merge repeats afterwards from the Duplicate Items view.
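Backup – Keep copies of your Zotero library and Obsidian vault on a cloud drive (e.g., Nextcloud) that respects open‑source values. For Zotero, upload periodic copies of the data folder rather than syncing the live folder, which can corrupt the database.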
Stay lean – If a paper’s PDF is behind a paywall, keep the abstract and citation; you can request the full text later via interlibrary loan.
Iterate – After a few weeks, you’ll notice certain tags or note structures that work better for you. Adjust the scripts; they’re just a few lines of code.
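For the monthly batch update, it helps to separate the genuinely new papers from the ones already in your library before importing anything. Here is a minimal sketch, assuming each record carries the paperId that Semantic Scholar returns with every search result. The function name merge_new is my own.

```python
def merge_new(existing, fresh, key="paperId"):
    """Return (merged, added): all records, and only those not seen before."""
    seen = {rec[key] for rec in existing}
    added = []
    for rec in fresh:
        if rec[key] not in seen:
            seen.add(rec[key])
            added.append(rec)
    return existing + added, added


# Made-up records, for illustration only
old = [{"paperId": "a1", "title": "Paper A"}]
new = [{"paperId": "a1", "title": "Paper A"},
       {"paperId": "b2", "title": "Paper B"}]
merged, added = merge_new(old, new)
print(len(merged), [r["title"] for r in added])
```

Save merged as the running CSV and import only added into Zotero, so each monthly run touches just the new papers.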

A Personal Note

I still remember the night I stayed up until 3 am trying to locate a single missing reference for a conference paper. My desk was a sea of sticky notes, and my brain felt like it was running on fumes. After I built this workflow, the same task now takes ten minutes and a few clicks. The biggest surprise? The sense of control. When you can see all the papers laid out, you stop guessing and start planning. That’s the real power of AI in research—not a magic wand, but a set of tools that let you think more clearly.

Give this workflow a try on your next literature review. The tools are free, the steps are repeatable, and the results speak for themselves: a tidy library, a living knowledge graph, and more time for the ideas that truly matter.