MCP Server · Open Source

MCP Doc Processor

A Model Context Protocol server that reads, analyzes, creates, and edits PDF, DOCX, Excel, and PowerPoint files — with vision OCR, Document DNA, and formatting-preserving edits — through 17 focused tools.

MIT licensed · Built by LeanZero · Node · JavaScript

Read & Analyze Any Document

Extract text, understand structure, and answer questions about PDF, DOCX, Excel, and PowerPoint files, including scanned documents via vision-based OCR.

PDF Processing

Native text extraction for digital PDFs. Vision-based OCR for scanned documents with AI post-processing that fixes broken words, spacing, and character errors.

DOCX, Excel & PowerPoint

Rich text extraction from Word documents preserving formatting and embedded images. Multi-sheet Excel parsing, plus per-slide text and speaker notes from PowerPoint decks.

Focused Analysis

Don't just dump text — ask specific questions. The focused mode lets agents interrogate documents for exactly the information they need.

Create & Edit Documents

Generate professional Word documents, PDFs, PowerPoint decks, Markdown, and Excel spreadsheets directly from AI conversations. Edit existing files without losing formatting.

DOCX, PDF, PPTX & Markdown

Pass the whole body as one markdown string
Titles, 3 heading levels, lists, and tables
Headers and footers with automatic page numbers
Inline code shading and fenced code blocks
PDF via headless Chromium; editable .pptx decks — one slide per '## ' heading
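
As a rough illustration of the "one slide per '## ' heading" rule, the sketch below splits a markdown body into slide records. The helper name `splitIntoSlides` and the `{ title, body }` shape are assumptions made for this example, not the server's actual code, and the sketch does not special-case a '## ' line that appears inside a fenced code block.

```javascript
// Sketch only: split a markdown body into slides at each "## " heading.
// splitIntoSlides and the { title, body } shape are hypothetical.
function splitIntoSlides(markdown) {
  const slides = [];
  let current = null;
  for (const line of markdown.split(/\r?\n/)) {
    if (line.startsWith("## ")) {
      // A level-2 heading opens a new slide; its text becomes the title.
      current = { title: line.slice(3).trim(), body: [] };
      slides.push(current);
    } else if (current && line.trim() !== "") {
      // Non-empty lines belong to the slide opened by the last heading.
      current.body.push(line);
    }
    // Lines before the first "## " heading are ignored in this sketch.
  }
  return slides;
}

const deck = splitIntoSlides(
  "# Deck\n\n## Intro\n- hello\n\n## Plan\n- step one\n- step two"
);
console.log(deck.map((s) => `${s.title}: ${s.body.length} line(s)`));
```

Level-3 headings ("### ") do not start a new slide here because the check requires the third character to be a space.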

Excel Generation

Multiple sheets with structured data
Column widths, row heights, and font styling
Formatted headers with background colors
Style presets with optimized Excel colors
Append rows, add sheets, or replace data

DOCX XML Patching

Unlike naive approaches that rebuild documents from scratch, the edit-doc tool works directly with the XML inside the DOCX ZIP structure. Edits preserve all original formatting — headers, footers, images, custom styles, and complex layouts stay intact when appending or replacing content.
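
To make the idea of XML-level patching concrete, here is a minimal sketch that rewrites only the text inside `<w:t>` elements of a `word/document.xml` string and leaves every other tag untouched. It is an illustration under stated assumptions, not the edit-doc implementation: `replaceTextInRuns` and `escapeXml` are hypothetical helpers, unzipping and re-zipping the DOCX is out of scope, and text that Word has split across several runs will not match.

```javascript
// Sketch only: patch text nodes in word/document.xml without rebuilding it.
// Both helper names are hypothetical.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

function replaceTextInRuns(documentXml, search, replacement) {
  if (!search) return documentXml; // nothing to look for
  const from = escapeXml(search);
  const to = escapeXml(replacement);
  // Match only <w:t> or <w:t attr="...">, never <w:tab/>, <w:tbl> or <w:tc>.
  return documentXml.replace(
    /(<w:t(?:\s[^>]*)?>)([^<]*)(<\/w:t>)/g,
    // split/join avoids the special "$" patterns of String.replace.
    (_, open, content, close) => open + content.split(from).join(to) + close
  );
}

const xml =
  '<w:p><w:pPr><w:pStyle w:val="Heading1"/></w:pPr>' +
  "<w:r><w:rPr><w:b/></w:rPr><w:t>Draft report</w:t></w:r></w:p>";
console.log(replaceTextInRuns(xml, "Draft", "Final"));
```

Because the paragraph style and run properties are never touched, the heading style and bold formatting in the sample survive the edit.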

17 MCP Tools

Every capability is a Model Context Protocol tool any MCP-compatible client can call.

read-doc

Read & analyze a PDF, DOCX, Excel, or PowerPoint file — summary, in-depth, or focused (query-based) modes. For a .pptx it returns per-slide text and speaker notes. Local path or a remote HTTPS URL.
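
For readers wiring this up by hand, the sketch below builds the JSON-RPC message an MCP client sends to invoke a tool. The `tools/call` method and the `name` / `arguments` fields come from the MCP specification; the argument names `filePath`, `mode`, and `query` are assumptions for illustration, so check the tool's published input schema for the real ones.

```javascript
// Sketch only: build an MCP "tools/call" request for read-doc.
// buildToolCall is a hypothetical helper; filePath, mode and query are
// assumed argument names, not confirmed by the project.
function buildToolCall(id, toolName, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

const request = buildToolCall(1, "read-doc", {
  filePath: "/path/to/contract.pdf",
  mode: "focused",
  query: "What is the termination notice period?",
});
console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client library sends this message for you; the raw shape is mainly useful when debugging traffic.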

detect-format

Recommend the right format (Markdown, DOCX, Excel, PDF, or PowerPoint) and tone before you create a document.

create-doc

Create a styled Word DOCX with headings, tables, headers, footers, page numbers, and 8 presets.

create-markdown

Create a Markdown file for technical docs, READMEs, and code-heavy content.

create-excel

Create an XLSX workbook with multiple sheets, styled headers, and column/row formatting.

create-pdf

Create a styled PDF from markdown — headings, lists, tables, code, headers/footers with page numbers, and the same 8 presets. Rendered via headless Chromium.

create-pptx

Create an editable PowerPoint (.pptx) deck from markdown — one slide per '## ' heading, with bullets, native slide tables, native charts, speaker notes, and the same 8 style presets. Opens in PowerPoint, Keynote, or Google Slides.

edit-doc

Append / replace / restyle DOCX via XML patching — preserves original formatting, headers, and images.

edit-excel

Append rows, add sheets, or replace data in an existing workbook while keeping styles.

edit-pptx

Append or replace slides in an existing .pptx — preview, append-slides, or replace-slide (rebuilt from the deck's text + notes).

fact-check

Verify a document's claims against the LIVE web — a cross-MCP tool that calls the web-search MCP per claim, gathers cited sources, and can write a verification report.

list-documents

Search the document registry by category, tags, and title to find or de-duplicate documents.

list-templates

List built-in templates and learned blueprints you can apply to new documents.

dna

Manage Document DNA — project-wide header/footer/style defaults, memories, and usage-driven evolution.

blueprint

Learn, list, and delete structural blueprints extracted from real documents.

drift-monitor

Watch a document's fingerprint and check for structural drift over time.
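
One plausible way to think about a structural fingerprint is a hash over a document's heading outline, so that body edits leave it unchanged while added, removed, or renamed sections alter it. The sketch below follows that idea; it is an assumption about how drift could be detected, not a description of what drift-monitor computes, and `outlineOf`, `fingerprint`, and `hasDrifted` are hypothetical names.

```javascript
// Sketch only: fingerprint a markdown document by its heading outline.
const { createHash } = require("node:crypto");

// Reduce a markdown body to entries like "1:Title" or "2:Scope".
function outlineOf(markdown) {
  return markdown
    .split(/\r?\n/)
    .map((line) => /^(#{1,6})\s+(.*)$/.exec(line))
    .filter(Boolean)
    .map((m) => `${m[1].length}:${m[2].trim()}`);
}

function fingerprint(markdown) {
  return createHash("sha256")
    .update(outlineOf(markdown).join("\n"))
    .digest("hex");
}

// True when the current outline no longer matches the stored baseline.
function hasDrifted(baselineFingerprint, markdown) {
  return fingerprint(markdown) !== baselineFingerprint;
}

const baseline = fingerprint("# Policy\n\nOld wording.\n\n## Scope");
console.log(hasDrifted(baseline, "# Policy\n\nNew wording.\n\n## Scope"));
console.log(hasDrifted(baseline, "# Policy\n\n## Scope\n\n## Exceptions"));
```

The first check compares documents that differ only in body text, the second adds a section, which changes the outline and therefore the hash.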

get-lineage

Trace a document's provenance — which sources informed it and what derived from it.

Where You Can Use It

It speaks standard MCP, so it drops into virtually any AI client — and into Atlassian Forge apps through the Anthropic MCP connector.

Claude Code

Add the hosted server in one command with --transport http, or point it at your self-hosted stdio build.

LM Studio

Drop it into mcp.json — remote (url + headers) for the demo, or stdio (command + args + env) when self-hosting.

Claude Desktop

Add it to claude_desktop_config.json. Bridge the hosted URL with mcp-remote if you're on a stdio-only build.

Cursor / Cline / Roo

Any MCP-compatible client works — same mcp.json shape, remote or local.

Atlassian Forge apps

CogniRunner reaches it through the Anthropic MCP connector to read and write Jira attachments.

Get Set Up

Most tools need no API key at all. Only OCR of scanned PDFs needs a Z.AI vision key — and you bring your own.

Hosted demo (fastest)

Demo only

Grab a demo key, then add the remote server to your client:

Claude Code

claude mcp add --transport http doc-processor https://worksmacstudio.tailfc4700.ts.net:10000/mcp \
  --header "Authorization: Bearer YOUR_DEMO_KEY"

mcp.json (LM Studio / Cursor / Claude Desktop)

{
  "mcpServers": {
    "doc-processor": {
      "url": "https://worksmacstudio.tailfc4700.ts.net:10000/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_DEMO_KEY",
        "X-ZAI-Key": "YOUR_ZAI_KEY (optional, scanned-PDF OCR)",
        "X-Output-Dir": "my-folder (optional — a folder ON THE SERVER; for files on THIS machine, self-host below)"
      }
    }
  }
}

Self-host (recommended)

Clone, install — then point any client at your local stdio build. Your machine, your files, your limits.

terminal

git clone https://github.com/leanzero-srl/leanzero-mcp-doc-processor
cd leanzero-mcp-doc-processor
npm install

mcp.json (local / stdio)

{
  "mcpServers": {
    "doc-processor": {
      "command": "node",
      "args": ["/ABSOLUTE/PATH/TO/leanzero-mcp-doc-processor/src/index.js"],
      "env": {
        "DOC_OUTPUT_DIR": "/ABSOLUTE/PATH/where/files/should/be/created",
        "Z_AI_API_KEY": "YOUR_ZAI_KEY (optional, for OCR)"
      }
    }
  }
}

Getting the file onto your machine. On the hosted server every create-doc / create-pdf / create-excel response comes back with a signed download link (valid ~24h) and an MCP resource_link — click it, or have your agent fetch it, and the real file lands on your machine. No filesystem access needed.

Prefer files written directly to disk? Self-host over stdio and they're written straight into your agent's workspace (or wherever you set DOC_OUTPUT_DIR). On the hosted server you can also pass an X-Output-Dir header to organize output in a folder on the server. (A remote server can't write to your disk directly — that's why the download link exists.)

You bring the keys, never us. The server holds no global vision key. Most tools (create, edit, read digital PDF/DOCX/Excel) need no key at all. To OCR an image-based (scanned) PDF, pass your own Z.AI key over HTTP via X-ZAI-Key (or ?zai_key), or set Z_AI_API_KEY in the env block when self-hosting on stdio.

Hosted demo key

Try it in seconds

Enter your email and we'll mint a demo bearer token for the hosted server and email you the setup steps. It's a convenience for evaluation; self-host for anything real.

8 Professional Style Presets

Each preset defines a complete typographic system — fonts, sizes, heading levels, spacing, justification, and table styling. Documents can also be auto-styled by category.

Claude-like

Calibri

Modern blue-accented default

11pt

Professional

Garamond

Executive summaries, formal reports

11pt

Technical

Arial

API docs, specs, user manuals

11pt

Legal

Times New Roman

Contracts, agreements, briefs

12pt

Business

Calibri

Proposals, go-to-market plans

11pt

Minimal

Arial

Everyday, clean documents

11pt

Casual

Verdana

Internal comms, team updates

12pt

Colorful

Arial

Presentations, marketing

12pt

Document DNA & Intelligence

A built-in guidance system that learns your preferences, prevents duplicates, and automatically styles documents based on project identity.

Project DNA

A .document-dna.json config stores your company name, default style, headers, and footers. Every document inherits your brand automatically via three-level inheritance: system, project, and user.
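
Three-level inheritance can be pictured as a layered merge in which later layers override earlier ones. The sketch below assumes the precedence is system, then project, then user (the page does not state the order), and the field names `companyName`, `style`, and `header` are illustrative rather than the real .document-dna.json schema.

```javascript
// Sketch only: resolve settings from system, project and user layers.
// mergeLayers is a hypothetical helper; precedence order is an assumption.
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

function mergeLayers(...layers) {
  const result = {};
  for (const layer of layers) {
    for (const [key, value] of Object.entries(layer ?? {})) {
      // Nested objects merge key by key; everything else is overridden.
      result[key] =
        isPlainObject(value) && isPlainObject(result[key])
          ? mergeLayers(result[key], value)
          : value;
    }
  }
  return result;
}

const system = { style: "minimal", header: { text: "", alignment: "left" } };
const project = {
  companyName: "Acme",
  style: "professional",
  header: { text: "Acme Confidential" },
};
const user = { style: "technical" };

console.log(mergeLayers(system, project, user));
```

With these sample layers the user's style wins, the project's header text is kept, and the system's header alignment still applies because no later layer overrides it.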

Category Classification

Documents are auto-categorized by keywords into contracts, technical, business, legal, meeting, or research — each mapping to the right preset and folder.

Document Registry

A persistent registry tracks every document for de-duplication and discovery. Thread-safe with atomic file locking for concurrent access.
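
A common way to get this kind of safety in Node is an exclusive lock file plus a write-then-rename update. The sketch below shows that pattern under stated assumptions: the registry is a JSON array on disk, `withLock` and `addToRegistry` are hypothetical helpers, and there is no retry loop, so a caller that finds the lock taken gets an EEXIST error and must try again. It is not the project's actual registry code.

```javascript
// Sketch only: guard a JSON registry with an exclusive lock file.
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

function withLock(lockPath, fn) {
  // The "wx" flag fails with EEXIST when the file already exists,
  // so only one process at a time can create the lock.
  const fd = fs.openSync(lockPath, "wx");
  try {
    return fn();
  } finally {
    fs.closeSync(fd);
    fs.unlinkSync(lockPath);
  }
}

function addToRegistry(registryPath, entry) {
  return withLock(`${registryPath}.lock`, () => {
    const entries = fs.existsSync(registryPath)
      ? JSON.parse(fs.readFileSync(registryPath, "utf8"))
      : [];
    entries.push(entry);
    // Write a temp file, then rename it over the registry so readers
    // never observe a half-written file.
    const tmpPath = `${registryPath}.tmp`;
    fs.writeFileSync(tmpPath, JSON.stringify(entries, null, 2));
    fs.renameSync(tmpPath, registryPath);
    return entries.length;
  });
}

const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "registry-"));
const demoRegistry = path.join(demoDir, "registry.json");
console.log(addToRegistry(demoRegistry, { title: "Q3 report", category: "business" }));
```

A production version would also need stale-lock handling, for example when a process crashes while holding the lock.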

Under the Hood

Built with the official MCP SDK in pure JavaScript. Hosted over Tailscale Funnel with per-tenant auth and bring-your-own keys.

MCP Protocol

Built on @modelcontextprotocol/sdk. Works with Claude Code, LM Studio, Cline, Roo, and any MCP-compatible client — stdio or Streamable HTTP.

Tenant auth + provisioning

Per-tenant argon2 bearer tokens, rate limiting, and a secret-gated provisioning endpoint behind a public Tailscale Funnel.

Vision OCR

Z.AI GLM vision for scanned-PDF OCR with AI post-processing — keyless by default, each caller brings their own key per request.

XML Patching

DOCX edits work at the XML level inside the ZIP structure, preserving all original formatting during modifications.

115-140ms

Text-based PDF processing

17 tools

MCP server interface

BYO key

Per-request vision OCR

How it works

From a plain request to a properly formatted file — the agent does four things, and most of it is automatic.

Plan the format

detect-format reads the user's intent and returns the best format — Markdown, DOCX, Excel, PDF, or PowerPoint — with a ready plan (which tool, which style preset, which category). 'Send to the client & print' → PDF; 'edit later in Word' → DOCX; 'budget tracker' → Excel; 'README' → Markdown; 'pitch deck' → PowerPoint.
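
The routing examples above can be approximated with a small keyword table. This is a deliberately naive stand-in for whatever intent analysis detect-format performs: the keyword lists, the `planFormat` helper, the returned `{ format, tool }` shape, and the DOCX fallback are all assumptions made for illustration.

```javascript
// Sketch only: keyword-based routing from a user intent to a format.
// Keywords, return shape and the DOCX fallback are assumptions.
const ROUTES = [
  { format: "pptx", tool: "create-pptx", keywords: ["pitch deck", "slides"] },
  { format: "excel", tool: "create-excel", keywords: ["budget", "tracker", "spreadsheet"] },
  { format: "markdown", tool: "create-markdown", keywords: ["readme"] },
  { format: "pdf", tool: "create-pdf", keywords: ["print", "send to the client"] },
  { format: "docx", tool: "create-doc", keywords: ["edit later", "word"] },
];

function planFormat(intent) {
  const text = intent.toLowerCase();
  // First route with any matching keyword wins; order encodes priority.
  const hit = ROUTES.find((route) =>
    route.keywords.some((keyword) => text.includes(keyword))
  );
  const { format, tool } = hit ?? ROUTES[ROUTES.length - 1];
  return { format, tool };
}

for (const intent of [
  "Send to the client & print",
  "edit later in Word",
  "budget tracker",
  "README",
  "pitch deck",
]) {
  console.log(intent, "->", planFormat(intent).format);
}
```

Substring matching is crude (for example, "word" also matches "password"), which is one reason a real router would weigh more context than a keyword list.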

Author in plain markdown

You pass the whole body as a single markdown string. The server renders real structure: headings, lists, tables, code blocks, live Excel formulas (=SUM…), and clickable Tables of Contents — never a wall of plain text.

Style & place it

Document DNA and 8 presets style it consistently; the file is written to your own workspace when self-hosted (or, when hosted, delivered through a signed download link). One content string in, a polished document out.

Self-correct & learn

The response flags formatting gaps and format mismatches (a table-heavy 'doc' → use Excel), and for memory-capable agents suggests a reusable memory. Every call is logged so the tool keeps improving.

How it's different from other document MCPs

Most "write a file" MCP servers hand back raw text. This one behaves like a document team that just works.

Real documents, not text dumps

Most MCPs: Most file MCPs emit raw text or markdown and call it a .docx.

This one: Genuinely formatted DOCX, PDF, PPTX, XLSX & MD — 8 style presets, vision OCR, live Excel formulas, editable slide decks, and clickable PDF/Markdown tables of contents.

Semantic format routing

Most MCPs: Other MCPs make you pick the right tool yourself.

This one: It infers DOCX vs PDF vs Excel vs Markdown from the user's intent — and if the content fits another format better, it says so (formatSuggestion).

Your machine, your files

Most MCPs: Hosted MCPs strand created files on their own server.

This one: Output is rooted at the caller's workspace (DOC_OUTPUT_DIR / the folder your agent runs in) when self-hosted — files land where you work.

Bring-your-own keys

Most MCPs: Many bake in a vendor key and meter you.

This one: The server holds no global vision key — OCR uses your key, per request. No lock-in, no surprise bills.

It learns in your hands

Most MCPs: Static MCPs never improve from how you use them.

This one: Every call is logged for the maintainer, and memory-capable coding agents get nudged to remember the tool+preset that worked — so it gets sharper over time.

Document DNA

Most MCPs: Re-specify headers, footers, and styling on every doc.

This one: Project-wide defaults (header/footer/preset) apply automatically and even evolve from your usage patterns.

Fact-check against the live web

Most MCPs: Document MCPs can't verify what they read.

This one: fact-check is a cross-MCP function — it calls our web-search MCP per claim to gather cited sources, then can write a verification report. Two MCPs, one workflow.

FAQ

Is it really free and open source?

Yes — MIT licensed, full source on GitHub. Built and maintained by LeanZero.

Do I need an API key?

For most tools, no. Creating, editing, and reading digital PDF/DOCX/Excel needs no key. Only OCR of image-based (scanned) PDFs needs a Z.AI vision key, which you bring yourself — the server holds no global key.

What's the difference between the demo and self-hosting?

The hosted demo is a shared, rate-limited convenience to try it fast. Self-hosting gives you full control, your own limits, and local file access — the right choice for real work.

Which clients work with it?

Any MCP client: Claude Code, Claude Desktop, LM Studio, Cursor, Cline, Roo, and more. Use the remote (url + headers) shape for the hosted server, or stdio (command + args + env) when self-hosting.

Can I use it from an Atlassian Forge app?

Yes. CogniRunner registers it with the Anthropic MCP connector (authorization_token), so Jira agents can read and generate attachments through the hosted endpoint.

Open Source & Free

MIT licensed. Self-host it or grab a demo key, wire it into your AI client, and start reading, creating, and editing documents in minutes.

