A Model Context Protocol server that reads, analyzes, creates, and edits PDF, DOCX, Excel, and PowerPoint files — with vision OCR, Document DNA, and formatting-preserving edits — through 17 focused tools.
Extract text, understand structure, and answer questions about PDF, DOCX, and Excel files — including scanned documents via vision-based OCR.
Native text extraction for digital PDFs. Vision-based OCR for scanned documents with AI post-processing that fixes broken words, spacing, and character errors.
Rich text extraction from Word documents preserving formatting and embedded images. Multi-sheet Excel parsing, plus per-slide text and speaker notes from PowerPoint decks.
Don't just dump text — ask specific questions. The focused mode lets agents interrogate documents for exactly the information they need.
Generate professional Word documents, PDFs, PowerPoint decks, Markdown, and Excel spreadsheets directly from AI conversations. Edit existing files without losing formatting.
Unlike naive approaches that rebuild documents from scratch, the edit-doc tool works directly with the XML inside the DOCX ZIP structure. Edits preserve all original formatting — headers, footers, images, custom styles, and complex layouts stay intact when appending or replacing content.
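The XML-level idea can be sketched as follows. This is a minimal illustration, not the server's actual code: it assumes word/document.xml has already been read out of the DOCX ZIP as a string (unzipping and re-zipping are omitted), and `appendParagraph` and `escapeXml` are hypothetical helper names. Because only one new paragraph is spliced in, every existing element stays byte-for-byte untouched.

```javascript
// Escape the characters that are not allowed in XML text nodes.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Splice one plain paragraph into a WordprocessingML body without
// rebuilding the document. Hypothetical helper, for illustration only.
function appendParagraph(documentXml, text) {
  const bodyEnd = documentXml.lastIndexOf("</w:body>");
  if (bodyEnd === -1) throw new Error("no </w:body> element found");

  const paragraph =
    `<w:p><w:r><w:t xml:space="preserve">${escapeXml(text)}</w:t></w:r></w:p>`;

  // The body-level <w:sectPr> (page size, margins, header/footer refs)
  // must remain the last child of the body, so insert before it.
  // A sectPr that sits inside a paragraph is followed by </w:p>; skip those.
  let insertAt = bodyEnd;
  const sectPr = documentXml.lastIndexOf("<w:sectPr", bodyEnd);
  if (sectPr !== -1 && !documentXml.slice(sectPr, bodyEnd).includes("</w:p>")) {
    insertAt = sectPr;
  }
  return documentXml.slice(0, insertAt) + paragraph + documentXml.slice(insertAt);
}
```

A real edit would write the patched string back into the same ZIP entry and leave styles.xml, headers, footers, and media parts alone.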
Every capability is a Model Context Protocol tool any MCP-compatible client can call.
read-doc: Read & analyze a PDF, DOCX, Excel, or PowerPoint file in summary, in-depth, or focused (query-based) modes. For a .pptx it returns per-slide text and speaker notes. Local path or a remote HTTPS URL.
detect-format: Recommend the right format (markdown / docx / excel) and tone before you create a document.
create-doc: Create a styled Word DOCX with headings, tables, headers, footers, page numbers, and 8 presets.
create-markdown: Create a Markdown file for technical docs, READMEs, and code-heavy content.
create-excel: Create an XLSX workbook with multiple sheets, styled headers, and column/row formatting.
create-pdf: Create a styled PDF from markdown with headings, lists, tables, code, headers/footers with page numbers, and the same 8 presets. Rendered via headless Chromium.
create-pptx: Create an editable PowerPoint (.pptx) deck from markdown, one slide per '## ' heading, with bullets, native slide tables, native charts, speaker notes, and the same 8 style presets. Opens in PowerPoint, Keynote, or Google Slides.
edit-doc: Append / replace / restyle DOCX via XML patching. Preserves original formatting, headers, and images.
edit-excel: Append rows, add sheets, or replace data in an existing workbook while keeping styles.
edit-pptx: Append or replace slides in an existing .pptx: preview, append-slides, or replace-slide (rebuilt from the deck's text + notes).
fact-check: Verify a document's claims against the LIVE web. A cross-MCP tool that calls the web-search MCP per claim, gathers cited sources, and can write a verification report.
list-documents: Search the document registry by category, tags, and title to find or de-duplicate documents.
list-templates: List built-in templates and learned blueprints you can apply to new documents.
dna: Manage Document DNA: project-wide header/footer/style defaults, memories, and usage-driven evolution.
blueprint: Learn, list, and delete structural blueprints extracted from real documents.
drift-monitor: Watch a document's fingerprint and check for structural drift over time.
get-lineage: Trace a document's provenance: which sources informed it and what derived from it.
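Since every capability above is an ordinary MCP tool, a client invokes it with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for read-doc. The envelope follows the MCP specification, but the argument names (`filePath`, `mode`, `query`) are assumptions made for illustration; the server's real input schema comes from its `tools/list` response.

```javascript
// Build a JSON-RPC 2.0 "tools/call" request in the shape MCP defines.
let nextId = 1;
function toolCall(name, args) {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Argument names here are hypothetical placeholders, not the documented schema.
const request = toolCall("read-doc", {
  filePath: "/path/to/report.pdf",
  mode: "focused",
  query: "What is the termination notice period?",
});

console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client library sends this message for you; the point is only that each of the 17 tools is reached through the same envelope with a different `name`.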
It speaks standard MCP, so it drops into virtually any AI client — and into Atlassian Forge apps through the Anthropic MCP connector.
Add the hosted server in one command with --transport http, or point it at your self-hosted stdio build.
Drop it into mcp.json — remote (url + headers) for the demo, or stdio (command + args + env) when self-hosting.
Add it to claude_desktop_config.json. Bridge the hosted URL with mcp-remote if you're on a stdio-only build.
Any MCP-compatible client works — same mcp.json shape, remote or local.
CogniRunner reaches it through the Anthropic MCP connector to read and write Jira attachments.
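For the Claude Desktop case, the mcp-remote bridge mentioned above can be sketched as a generated config entry. This is an assumption-laden example: it relies on mcp-remote's `--header` option and on launching it through `npx`, and `bridgeConfig` is a hypothetical helper. Check the mcp-remote README for the options your version supports.

```javascript
// Build a claude_desktop_config.json fragment that runs mcp-remote as a
// local stdio process and forwards the bearer token to the hosted URL.
function bridgeConfig(serverUrl, bearer) {
  return {
    mcpServers: {
      "doc-processor": {
        command: "npx",
        args: [
          "-y",
          "mcp-remote",
          serverUrl,
          "--header",
          `Authorization: Bearer ${bearer}`,
        ],
      },
    },
  };
}

console.log(
  JSON.stringify(
    bridgeConfig("https://worksmacstudio.tailfc4700.ts.net:10000/mcp", "YOUR_DEMO_KEY"),
    null,
    2
  )
);
```

The printed JSON has the same `mcpServers` shape as the stdio configuration used for self-hosting, with the bridge standing in for the local server process.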
Most tools need no API key at all. Only OCR of scanned PDFs needs a Z.AI vision key — and you bring your own.
Grab a demo key, then add the remote server to your client:
claude mcp add --transport http doc-processor https://worksmacstudio.tailfc4700.ts.net:10000/mcp \
  --header "Authorization: Bearer YOUR_DEMO_KEY"
{
  "mcpServers": {
    "doc-processor": {
      "url": "https://worksmacstudio.tailfc4700.ts.net:10000/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_DEMO_KEY",
        "X-ZAI-Key": "YOUR_ZAI_KEY (optional, scanned-PDF OCR)",
        "X-Output-Dir": "my-folder (optional, a folder ON THE SERVER; for files on THIS machine, self-host below)"
      }
    }
  }
}
Clone, install, then point any client at your local stdio build. Your machine, your files, your limits.
git clone https://github.com/leanzero-srl/leanzero-mcp-doc-processor
cd leanzero-mcp-doc-processor
npm install
{
  "mcpServers": {
    "doc-processor": {
      "command": "node",
      "args": ["/ABSOLUTE/PATH/TO/leanzero-mcp-doc-processor/src/index.js"],
      "env": {
        "DOC_OUTPUT_DIR": "/ABSOLUTE/PATH/where/files/should/be/created",
        "Z_AI_API_KEY": "YOUR_ZAI_KEY (optional, for OCR)"
      }
    }
  }
}
Each create-doc / create-pdf / create-excel response comes back with a signed download link (valid ~24h) and an MCP resource_link. Click it, or have your agent fetch it, and the real file lands on your machine. No filesystem access needed.
When self-hosting, files are written to the directory you configure (DOC_OUTPUT_DIR). On the hosted server you can also pass an X-Output-Dir header to organize output in a folder on the server. (A remote server can't write to your disk directly; that's why the download link exists.)
For scanned-PDF OCR, pass your Z.AI key as X-ZAI-Key (or ?zai_key), or set Z_AI_API_KEY in the env block when self-hosting on stdio.
Enter your email and we'll mint a demo bearer for the hosted server and email you the setup steps. It's a convenience for evaluation; self-host for anything real.
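An agent can pick the download link out of a tool result roughly as follows. This is a sketch under assumptions: it relies on the MCP `resource_link` content type (an item with `type`, `uri`, and optional `name` and `mimeType`), and the sample result is invented for illustration rather than captured from the server. `downloadLinks` is a hypothetical helper.

```javascript
// Collect every resource_link item from an MCP tool result.
function downloadLinks(toolResult) {
  const content = Array.isArray(toolResult?.content) ? toolResult.content : [];
  return content
    .filter((item) => item.type === "resource_link" && typeof item.uri === "string")
    .map((item) => ({
      uri: item.uri,
      name: item.name ?? null,
      mimeType: item.mimeType ?? null,
    }));
}

// Invented example of what a create-pdf result might look like.
const exampleResult = {
  content: [
    { type: "text", text: "Created report.pdf" },
    {
      type: "resource_link",
      uri: "https://example.invalid/download/abc123",
      name: "report.pdf",
      mimeType: "application/pdf",
    },
  ],
};

console.log(downloadLinks(exampleResult));
```

Fetching the returned `uri` while the link is still valid is what puts the file on the caller's machine.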
Each preset defines a complete typographic system — fonts, sizes, heading levels, spacing, justification, and table styling. Documents can also be auto-styled by category.
Modern blue-accented default (11pt)
Executive summaries, formal reports (11pt)
API docs, specs, user manuals (11pt)
Contracts, agreements, briefs (12pt)
Proposals, go-to-market plans (11pt)
Everyday, clean documents (11pt)
Internal comms, team updates (12pt)
Presentations, marketing (12pt)
A built-in guidance system that learns your preferences, prevents duplicates, and automatically styles documents based on project identity.
A .document-dna.json config stores your company name, default style, headers, and footers. Every document inherits your brand automatically via three-level inheritance: system, project, and user.
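Three-level inheritance can be pictured as a layered merge. The sketch below assumes that later levels override earlier ones (system, then project, then user) and that nested objects merge key by key. Both the precedence order and the key names (`company`, `defaultStyle`, `header`) are assumptions for illustration, not the documented .document-dna.json schema.

```javascript
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Merge config levels left to right; later levels win, nested objects merge.
function mergeLevels(...levels) {
  const merged = {};
  for (const level of levels) {
    for (const [key, value] of Object.entries(level ?? {})) {
      merged[key] =
        isPlainObject(value) && isPlainObject(merged[key])
          ? mergeLevels(merged[key], value)
          : value;
    }
  }
  return merged;
}

// Hypothetical values for each level.
const systemLevel = { defaultStyle: "example-default", header: { text: "", align: "left" } };
const projectLevel = { company: "Acme Ltd", header: { text: "Acme Ltd" } };
const userLevel = { header: { align: "right" } };

const effective = mergeLevels(systemLevel, projectLevel, userLevel);
console.log(effective);
```

Here the project supplies the header text, the user overrides only its alignment, and the system default style survives because no later level sets it.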
Documents are auto-categorized by keywords into contracts, technical, business, legal, meeting, or research — each mapping to the right preset and folder.
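Keyword-based categorization of this kind can be sketched in a few lines. The six category names come from the text above; the keyword lists, the scoring rule (most keyword hits wins), and the `categorize` helper are illustrative assumptions, not the server's actual rules.

```javascript
// Hypothetical keyword lists; only the category names come from the docs.
const CATEGORY_KEYWORDS = {
  contracts: ["agreement", "contract", "nda"],
  technical: ["api", "specification", "architecture"],
  business: ["proposal", "revenue", "strategy"],
  legal: ["compliance", "liability", "statute"],
  meeting: ["agenda", "minutes", "attendees"],
  research: ["study", "findings", "methodology"],
};

// Return the category with the most keyword hits, or null if nothing matches.
function categorize(text) {
  const words = new Set(text.toLowerCase().match(/[a-z0-9]+/g) ?? []);
  let best = null;
  let bestScore = 0;
  for (const [category, keywords] of Object.entries(CATEGORY_KEYWORDS)) {
    const score = keywords.filter((keyword) => words.has(keyword)).length;
    if (score > bestScore) {
      best = category;
      bestScore = score;
    }
  }
  return best;
}

console.log(categorize("Agenda and minutes for the Q3 planning session"));
```

The returned category would then select a preset and an output folder, as described above.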
A persistent registry tracks every document for de-duplication and discovery. Thread-safe with atomic file locking for concurrent access.
Built with the official MCP SDK in pure JavaScript. Hosted over Tailscale Funnel with per-tenant auth and bring-your-own keys.
Built on @modelcontextprotocol/sdk. Works with Claude Code, LM Studio, Cline, Roo, and any MCP-compatible client — stdio or Streamable HTTP.
Per-tenant argon2 bearer tokens, rate limiting, and a secret-gated provisioning endpoint behind a public Tailscale Funnel.
Z.AI GLM vision for scanned-PDF OCR with AI post-processing — keyless by default, each caller brings their own key per request.
DOCX edits work at the XML level inside the ZIP structure, preserving all original formatting during modifications.
From a plain request to a properly formatted file — the agent does four things, and most of it is automatic.
detect-format reads the user's intent and returns the best format — Markdown, DOCX, Excel, PDF, or PowerPoint — with a ready plan (which tool, which style preset, which category). 'Send to the client & print' → PDF; 'edit later in Word' → DOCX; 'budget tracker' → Excel; 'README' → Markdown; 'pitch deck' → PowerPoint.
You pass the whole body as a single markdown string. The server renders real structure: headings, lists, tables, code blocks, live Excel formulas (=SUM…), and clickable Tables of Contents — never a wall of plain text.
Document DNA and 8 presets style it consistently; the file is written to your own workspace when self-hosted (or pushed to your endpoint when hosted). One content string in, a polished document out.
The response flags formatting gaps and format mismatches (a table-heavy 'doc' → use Excel), and for memory-capable agents suggests a reusable memory. Every call is logged so the tool keeps improving.
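To make the "one markdown string in, real structure out" step concrete, here is a sketch of the slide rule stated for create-pptx: one slide per '## ' heading. It is an illustration under assumptions, not the server's parser. `splitSlides` is a hypothetical helper, content before the first '## ' heading is simply dropped, and headings inside fenced code blocks are ignored.

```javascript
// Split a markdown string into slides at each line that starts with "## ".
function splitSlides(markdown) {
  const fence = "`".repeat(3); // fenced code block marker
  const slides = [];
  let current = null;
  let inFence = false;

  for (const line of markdown.split(/\r?\n/)) {
    if (line.trimStart().startsWith(fence)) inFence = !inFence;

    if (!inFence && line.startsWith("## ")) {
      current = { title: line.slice(3).trim(), body: [] };
      slides.push(current);
    } else if (current) {
      current.body.push(line);
    }
  }
  return slides.map((slide) => ({
    title: slide.title,
    body: slide.body.join("\n").trim(),
  }));
}

const deck = splitSlides("# Q3 Review\n\n## Highlights\n- Revenue up\n- Churn down\n\n## Next steps\n- Hire two engineers\n");
console.log(deck);
```

Each slide's body would then be rendered further (bullets, tables, charts, notes); that part is outside this sketch.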
Most "write a file" MCP servers hand back raw text. This one behaves like a document team that just works.
Most MCPs: Most file MCPs emit raw text or markdown and call it a .docx.
This one: Genuinely formatted DOCX, PDF, PPTX, XLSX & MD — 8 style presets, vision OCR, live Excel formulas, editable slide decks, and clickable PDF/Markdown tables of contents.
Most MCPs: Other MCPs make you pick the right tool yourself.
This one: It infers DOCX vs PDF vs Excel vs Markdown from the user's intent — and if the content fits another format better, it says so (formatSuggestion).
Most MCPs: Hosted MCPs strand created files on their own server.
This one: Output is rooted at the caller's workspace (DOC_OUTPUT_DIR / the folder your agent runs in) when self-hosted — files land where you work.
Most MCPs: Many bake in a vendor key and meter you.
This one: The server holds no global vision key — OCR uses your key, per request. No lock-in, no surprise bills.
Most MCPs: Static MCPs never improve from how you use them.
This one: Every call is logged for the maintainer, and memory-capable coding agents get nudged to remember the tool+preset that worked — so it gets sharper over time.
Most MCPs: Re-specify headers, footers, and styling on every doc.
This one: Project-wide defaults (header/footer/preset) apply automatically and even evolve from your usage patterns.
Most MCPs: Document MCPs can't verify what they read.
This one: fact-check is a cross-MCP function — it calls our web-search MCP per claim to gather cited sources, then can write a verification report. Two MCPs, one workflow.
Yes — MIT licensed, full source on GitHub. Built and maintained by LeanZero.
For most tools, no. Creating, editing, and reading digital PDF/DOCX/Excel needs no key. Only OCR of image-based (scanned) PDFs needs a Z.AI vision key, which you bring yourself — the server holds no global key.
The hosted demo is a shared, rate-limited convenience to try it fast. Self-hosting gives you full control, your own limits, and local file access — the right choice for real work.
Any MCP client: Claude Code, Claude Desktop, LM Studio, Cursor, Cline, Roo, and more. Use the remote (url + headers) shape for the hosted server, or stdio (command + args + env) when self-hosting.
Yes. CogniRunner registers it with the Anthropic MCP connector (authorization_token), so Jira agents can read and generate attachments through the hosted endpoint.
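A rough sketch of that registration: with the Anthropic MCP connector, a remote server is declared as an entry in the request's `mcp_servers` array, and `authorization_token` carries the bearer. The helper name `connectorEntry` is hypothetical, and the rest of the request (model, messages, the beta header the connector requires) is deliberately left out; consult Anthropic's documentation for the current values.

```javascript
// Build one mcp_servers entry for the Anthropic MCP connector.
function connectorEntry(name, url, token) {
  return { type: "url", url, name, authorization_token: token };
}

// Fragment of a Messages API request body; other required fields are omitted.
const requestFragment = {
  mcp_servers: [
    connectorEntry(
      "doc-processor",
      "https://worksmacstudio.tailfc4700.ts.net:10000/mcp",
      "YOUR_DEMO_KEY"
    ),
  ],
};

console.log(JSON.stringify(requestFragment, null, 2));
```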
MIT licensed. Self-host it or grab a demo key, wire it into your AI client, and start reading, creating, and editing documents in minutes.