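2026 PDF Skills Report: A Deployment and Selection Guide for Codex, Claude Code, OpenClaw, and Trae

Key takeaways

As of April 15, 2026, the question that matters is not which PDF app is the most powerful, but which PDF skills can be deployed safely on a real agent platform and carried into production.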

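Key findings

As of 2026-04-15, OpenClaw and Trae Agent have clear open-source implementations, and Claude Code also publishes an official GitHub repo. Codex CLI is open source, but the Codex app and cloud remain managed product layers.
For PDF workflows, CLI / Python / Java libraries are the deployment form that works most broadly across all four agents. MCP is especially strong in Claude Code and Trae; in Codex and OpenClaw it is more stable when used as a wrapper layer.
GUI-based PDF tools can also be driven from an agent, but they tend to be weaker than CLI / API tools on stability and auditability.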

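Agent setup

How to deploy PDF skills on OpenClaw, Claude Code, Codex, and Trae

The following is based on information that could be officially confirmed as of 2026-04-15, and it separates native support, wrapper-based integration, and GUI-centered use.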

Codex

Install the CLI or IDE first, then package PDF tooling with AGENTS.md and skills

Hybrid

Best for: PDF automation inside codebases, batch scripts, document understanding, and translation-delivery workflows

Deployment form

Install Codex CLI with `npm install -g @openai/codex` or `brew install --cask codex`.
Add `AGENTS.md` at the repo root and document the PDF workflow, test commands, and permission boundaries.
Package PDF tooling as repo scripts such as `tools/pdf/run_ocr.sh` or `tools/pdf/parse_docling.py`.
If you use Codex skills in the app, CLI, or IDE, bundle instructions, resources, and scripts as reusable skills.
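
A repo script of the kind listed above can stay very small. The following is a minimal sketch of what a Python counterpart to `tools/pdf/run_ocr.sh` might look like, assuming OCRmyPDF is the OCR tool and is installed on PATH. The function names `build_ocr_command` and `run_ocr` are illustrative and do not come from any Codex documentation.

```python
import shutil
import subprocess
from pathlib import Path


def build_ocr_command(src: Path, dst: Path, language: str = "eng") -> list[str]:
    """Build an OCRmyPDF invocation.

    --skip-text leaves pages that already contain text untouched,
    so the script is safe to run on mixed scanned / digital PDFs.
    """
    return ["ocrmypdf", "--language", language, "--skip-text", str(src), str(dst)]


def run_ocr(src: Path, dst: Path, language: str = "eng") -> int:
    """Run OCR and return the exit code (127 if ocrmypdf is not installed)."""
    if shutil.which("ocrmypdf") is None:
        print("ocrmypdf not found on PATH; install it before running this script")
        return 127
    completed = subprocess.run(build_ocr_command(src, dst, language), check=False)
    return completed.returncode


if __name__ == "__main__":
    # Show the command that would be executed for a sample pair of paths.
    print(" ".join(build_ocr_command(Path("in.pdf"), Path("out.pdf"))))
```

Keeping the command construction separate from execution makes it straightforward to document the exact command in `AGENTS.md` and to test the script without an OCR engine installed.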

Capabilities and limits

Public documentation now confirms both Skills and AGENTS.md.
The CLI is open-source, while the app and cloud remain managed product surfaces.
The safest integration path for PDF skills is still repo-level scripts plus AGENTS.md.

Sources: OpenAI Codex product page; OpenAI Codex CLI getting started; openai/codex; Introducing Codex; Introducing the Codex app

Claude Code

Four official extension paths: skills, plugins, MCP, and CLAUDE.md

Hybrid

Best for: Teams that want the clearest official extension model and reusable PDF skills as durable team assets

Deployment form

Install Claude Code with `curl -fsSL https://claude.ai/install.sh | bash`, or use Homebrew / WinGet.
Native skills live in `~/.claude/skills/<skill>/SKILL.md` or project-level `.claude/skills/<skill>/SKILL.md`.
Use `claude mcp add ...` to connect local stdio servers, remote HTTP servers, or OAuth-backed tools.
Bundle skills, agents, hooks, and MCP servers into plugins when you need a sharable team distribution format.
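
As an illustration of the project-level skill path above, here is a minimal sketch that scaffolds a `SKILL.md` under `.claude/skills/<skill>/`. It assumes the file opens with YAML frontmatter carrying `name` and `description`; verify that against the current Claude Code skills documentation. The skill body and the script path `tools/pdf/run_ocr.py` are hypothetical.

```python
from pathlib import Path

# Assumed layout: YAML frontmatter (name, description) followed by instructions.
SKILL_TEMPLATE = """---
name: {name}
description: {description}
---

# {name}

1. Run `python tools/pdf/run_ocr.py <input.pdf> <output.pdf>` (hypothetical repo script).
2. Report the output path and any pages that failed.
"""


def scaffold_skill(project_root: Path, name: str, description: str) -> Path:
    """Create .claude/skills/<name>/SKILL.md under project_root and return its path."""
    skill_dir = project_root / ".claude" / "skills" / name
    skill_dir.mkdir(parents=True, exist_ok=True)
    skill_file = skill_dir / "SKILL.md"
    skill_file.write_text(
        SKILL_TEMPLATE.format(name=name, description=description),
        encoding="utf-8",
    )
    return skill_file
```

Calling `scaffold_skill(Path("."), "pdf-ocr", "Make scanned PDFs searchable.")` would write `.claude/skills/pdf-ocr/SKILL.md`, which can then be committed so the skill travels with the repo.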

Capabilities and limits

Its public documentation is the most explicit across skills, plugins, and MCP.
It works well for both native skills and MCP wrappers around OCR, parsing, translation, and RAG services.
The core CLI has an official GitHub repo, while the model and service layer remain proprietary.

Sources: Claude Code overview; Claude Code skills; Claude Code MCP; Claude Code plugins reference; anthropics/claude-code

OpenClaw

Workspace skills, plugins, ClawHub, and a gateway make it the closest thing to a personal agent OS

Open

Best for: Personal AI assistants, multi-channel automation, and mixed browser / desktop / shell PDF workflows

Deployment form

Install OpenClaw from the official repo or install script; the main runtime entry is the Gateway.
Workspace skills live in `~/.openclaw/workspace/skills/<skill>/SKILL.md`.
Use plugins when a PDF workflow also needs channels, host integrations, or system capabilities.
If ClawHub is enabled, agents can search and fetch skills, but production setups should still whitelist and review them.
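
The allowlist-and-review step can be sketched as a hash check over the workspace skills directory. This is not an OpenClaw feature or API, only one possible review gate. It assumes each skill is a directory containing a `SKILL.md` and that the team keeps its own allowlist mapping skill names to approved SHA-256 digests.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def review_skills(skills_root: Path, allowlist: dict[str, str]) -> dict[str, str]:
    """Return a verdict per skill directory.

    Verdicts: 'approved' (name and digest match), 'changed' (name listed,
    digest differs), 'unlisted' (not reviewed yet), 'missing SKILL.md'.
    """
    verdicts: dict[str, str] = {}
    for skill_dir in sorted(p for p in skills_root.iterdir() if p.is_dir()):
        skill_file = skill_dir / "SKILL.md"
        if not skill_file.is_file():
            verdicts[skill_dir.name] = "missing SKILL.md"
        elif skill_dir.name not in allowlist:
            verdicts[skill_dir.name] = "unlisted"
        elif allowlist[skill_dir.name] != sha256_of(skill_file):
            verdicts[skill_dir.name] = "changed"
        else:
            verdicts[skill_dir.name] = "approved"
    return verdicts
```

A production gate would likely hash every file in the skill directory, not only `SKILL.md`, since scripts bundled with a skill are part of the supply chain too.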

Capabilities and limits

The official README documents workspace roots, skills paths, and ClawHub behavior clearly.
It is stronger than pure IDE agents when browser, desktop, or host-command automation matters.
It is also the most open, which means permission control and supply-chain review matter more.

Sources: openclaw/openclaw; OpenClaw docs; ClawHub

Trae

Trae IDE uses Agent Skills, @Agent, and MCP; the OSS Trae Agent uses YAML plus MCP

Hybrid

Best for: IDE collaboration, multi-agent orchestration, and research-oriented or extensible coding-agent setups

Deployment form

Install Trae IDE or SOLO from the official download page when you want the desktop product surface.
The official Trae blog already documents Agent Skills creation, import, usage, and MCP support through `@Agent`.
For the open-source agent path, use `git clone https://github.com/bytedance/trae-agent.git && uv sync --all-extras`.
Add `mcp_servers` in the `trae-agent` config to attach external PDF skills and document tools.
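
To make the `mcp_servers` step concrete, the sketch below renders a config fragment from a Python dictionary. Only the top-level `mcp_servers` key comes from the text above. The per-server `command` and `args` fields, the server name `pdf_tools`, and the script path are assumptions for illustration and should be checked against the example config shipped in the `trae-agent` repo.

```python
import json


def render_mcp_servers(servers: dict[str, dict]) -> str:
    """Render a YAML fragment for an `mcp_servers` section.

    Strings are emitted via json.dumps, which produces double-quoted
    scalars that are also valid YAML.
    """
    lines = ["mcp_servers:"]
    for name, spec in servers.items():
        lines.append(f"  {name}:")
        lines.append(f"    command: {json.dumps(spec['command'])}")
        args = spec.get("args", [])
        if args:
            lines.append("    args:")
            lines.extend(f"      - {json.dumps(arg)}" for arg in args)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical local MCP server wrapping the team's PDF scripts.
    print(render_mcp_servers({
        "pdf_tools": {"command": "python", "args": ["tools/pdf/mcp_server.py"]},
    }))
```

Generating the fragment from code keeps the server list reviewable in one place when several PDF tools are attached.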

Capabilities and limits

Trae Agent has an official MIT-licensed GitHub repo.
The Trae IDE and SOLO product surfaces publicly point to Agent Skills and MCP usage.
Use the open agent when you need tighter control; add the IDE when visual workflows matter.

Sources: Trae download; Trae blog; bytedance/trae-agent

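Compatibility

Which agents can use which PDF skill packaging forms

"Can handle PDFs" is not the same as "can handle every PDF skill." In practice, the form in which a skill is deployed determines compatibility.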

Deployment form	Codex	Claude Code	OpenClaw	Trae	Assessment
Native skills / commands	Direct	Native	Native	Direct	Claude Code is the clearest today; OpenClaw has workspace skills; Trae has Agent Skills; Codex publicly confirms Skills but exposes fewer file-system details.
Repo rules files (AGENTS.md / CLAUDE.md / Rules)	Native	Direct	Direct	Direct	All four agent families can consume this layer; it is the most portable and least coupled way to inject team knowledge.
CLI / Python / Java libraries	Direct	Direct	Direct	Direct	This is the most reusable packaging form across agent families and the best first layer to deploy.
MCP server	Via wrapper	Native	Via wrapper	Direct	Claude Code is strongest natively; Trae also points clearly to MCP; Codex and OpenClaw usually benefit from MCP through wrappers, plugins, or gateways.
SaaS API / cloud service	Direct	Direct	Direct	Direct	All four agent families can use this layer reliably when API keys are governed and packaged as tools or scripts.
Desktop GUI / RPA	Limited	Limited	Direct	Via wrapper	OpenClaw is friendlier to browser and desktop control; Codex and Claude Code should not treat GUI automation as the primary path.
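
The matrix above can also be kept as data, so that a team can query it when deciding how to package a skill. The sketch below transcribes the six rows, with cell values rendered in English as Native, Direct, Via wrapper, and Limited. The helper names `portable_forms` and `support_for` are illustrative.

```python
AGENTS = ("Codex", "Claude Code", "OpenClaw", "Trae")

# One tuple per deployment form, in the same column order as AGENTS.
MATRIX = {
    "Native skills / commands": ("Direct", "Native", "Native", "Direct"),
    "Repo rules files": ("Native", "Direct", "Direct", "Direct"),
    "CLI / Python / Java libraries": ("Direct", "Direct", "Direct", "Direct"),
    "MCP server": ("Via wrapper", "Native", "Via wrapper", "Direct"),
    "SaaS API / cloud service": ("Direct", "Direct", "Direct", "Direct"),
    "Desktop GUI / RPA": ("Limited", "Limited", "Direct", "Via wrapper"),
}


def portable_forms(matrix: dict = MATRIX) -> list[str]:
    """Forms that every agent supports natively or directly (no wrapper, not limited)."""
    return [
        form
        for form, cells in matrix.items()
        if all(cell in ("Native", "Direct") for cell in cells)
    ]


def support_for(agent: str, matrix: dict = MATRIX) -> dict[str, str]:
    """Support level per deployment form for a single agent."""
    column = AGENTS.index(agent)
    return {form: cells[column] for form, cells in matrix.items()}


if __name__ == "__main__":
    print(portable_forms())
    print(support_for("Codex")["MCP server"])
```

Under this encoding, MCP servers and desktop GUI / RPA are the two forms that need per-agent handling; the other four can be shared across all four agents.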

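Catalog

36 PDF skills / tools: open source, closed, GitHub, and deployment form

Skills are organized here as components that can actually be deployed. Open-source entries link to GitHub; closed products link to their official entry point.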

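Skill / Tool	Category	Source model	Deployment form	GitHub / Official	Best for	Notes
Tesseract OCR	OCR	Open	CLI / library	GitHub, Official	General-purpose OCR, multilingual OCR	Fits a local OSS stack; sensitive to preprocessing quality
OCRmyPDF	OCR	Open	CLI / library	GitHub, Official	Searchable PDF output, agent preprocessing	Fits a local OSS stack; common in production
PaddleOCR	OCR	Open	CLI / library	GitHub, Official	Multilingual OCR, enterprise forms and contracts	Strong for Chinese-language workflows; common in production
docTR	OCR	Open	CLI / library	GitHub, Official	General-purpose OCR, enterprise forms and contracts	Suited to research use; sensitive to preprocessing quality
Docling	PDF parsing	Open	CLI / library	GitHub, Official	Structuring for LLMs, complex layouts	Fits as a pipeline foundation; pairs especially well with MCP
docling-mcp	PDF parsing	Open	MCP	GitHub, Official	MCP integration, structuring for LLMs	Pairs especially well with MCP; fits as a pipeline foundation
GROBID	PDF parsing	Open	CLI / library	GitHub, Official	Academic papers, research and technical PDFs	Suited to research use; common in production
Nougat	PDF parsing	Open	CLI / library	GitHub, Official	Academic papers, formula-heavy documents	Suited to research use; not a general-purpose OCR
MinerU	PDF parsing	Open	CLI / library	GitHub, Official	Complex layouts, formula-heavy documents	Strong on complex layouts; common in production
PyMuPDF	PDF manipulation	Open	CLI / library	GitHub, Official	High-performance runtime, lightweight PDF manipulation	Common in production; fits as a pipeline foundation
PyMuPDF4LLM	PDF manipulation	Open	CLI / library	GitHub, Official	Agent preprocessing, structuring for LLMs	Fits as a pipeline foundation; common in production
pypdf	PDF manipulation	Open	CLI / library	GitHub, Official	Lightweight PDF manipulation, PDF structure manipulation	Suited to pure-Python setups; fits as a pipeline foundation
pdfplumber	Table extraction	Open	CLI / library	GitHub, Official	Table debugging, text-based tables	Easy to debug; fits as a pipeline foundation
Unstructured	Document ETL	Open	CLI / library	GitHub, Official	Document chunking, agent preprocessing	Fits as a pipeline foundation; suited to team use
unstructured-api	Document ETL	Open	SaaS API	GitHub, Official	Internal API layer, document chunking	API-first; suited to team use
Tabula	Table extraction	Open	CLI / library	GitHub, Official	Text-based tables, batch table extraction	Weak on noisy scans; common in production
tabula-java	Table extraction	Open	CLI / library	GitHub, Official	Batch table extraction, Java enterprise	Common in production; fits as a pipeline foundation
qpdf	PDF manipulation	Open	CLI / library	GitHub, Official	PDF structure manipulation, batch post-processing	Common in production; fits as a pipeline foundation
pdfcpu	PDF manipulation	Open	CLI / library	GitHub, Official	Batch post-processing, PDF structure manipulation	Common in production; fits as a pipeline foundation
Apache PDFBox	PDF manipulation	Open	CLI / library	GitHub, Official	Java enterprise, PDF structure manipulation	Common in production; suited to team use
OpenAI PDF Files	RAG / reasoning	Closed	SaaS API	Official	PDF reasoning, cross-document search	API-first; better for understanding than layout fidelity
OpenAI File Search	RAG / reasoning	Closed	SaaS API	Official	Cross-document search, team knowledge search	API-first; suited to team use
Claude PDF Support	RAG / reasoning	Closed	SaaS API	Official	PDF reasoning, research and technical PDFs	API-first; better for understanding than layout fidelity
Claude Citations	Knowledge Q&A	Closed	SaaS API	Official	Answers with citations, team knowledge search	API-first; suited to team use
Mistral OCR	Enterprise document AI	Closed	SaaS API	Official	Cloud OCR API, complex layouts	API-first; adds cost and vendor dependency
Mathpix PDF to Markdown	PDF parsing	Closed	SaaS API	Official	Formula-heavy documents, academic papers	Suited to research use; adds cost and vendor dependency
Google Document AI	Enterprise document AI	Closed	SaaS API	Official	Enterprise forms and contracts, internal API layer	Enterprise-leaning; API-first
Azure Document Intelligence	Enterprise document AI	Closed	SaaS API	Official	Enterprise forms and contracts, cloud OCR API	Enterprise-leaning; API-first
Amazon Textract	Enterprise document AI	Closed	SaaS API	Official	Enterprise forms and contracts, cloud OCR API	Enterprise-leaning; API-first
Adobe Acrobat AI Assistant	Desktop PDF	Closed	Desktop GUI / RPA	Official	Desktop review, team knowledge search	GUI-first; often needs wrapper automation
Adobe Translate PDF	Translation	Closed	Desktop GUI / RPA	Official	Desktop translation flow, multilingual delivery	GUI-first; high-value translation layer
ABBYY FineReader PDF	Desktop PDF	Closed	Desktop GUI / RPA	Official	Desktop OCR and proofreading, searchable PDF output	GUI-first; enterprise-leaning
Nanonets	Invoice automation	Closed	SaaS API	Official	Invoices and receipts, internal API layer	API-first; enterprise-leaning
Rossum	Invoice automation	Closed	SaaS API	Official	Invoices and receipts, enterprise forms and contracts	Enterprise-leaning; API-first
Parseur	Template extraction	Closed	SaaS API	Official	Template extraction, internal API layer	API-first; common in production
Reflo	Translation	Closed	SaaS API	Official	Multilingual delivery, desktop translation flow	High-value translation layer; strong on complex layouts
DeepL Files + Glossary	Translation	Closed	SaaS API	Official	Glossary-driven translation, multilingual delivery	High-value translation layer; suited to team use
Smallpdf Translate PDF	Translation	Closed	Desktop GUI / RPA	Official	Lightweight quick translation, desktop translation flow	GUI-first; often needs wrapper automation
iLovePDF Translate PDF	Translation	Closed	Desktop GUI / RPA	Official	Lightweight quick translation, desktop translation flow	GUI-first; often needs wrapper automation
PDFgear ChatPDF	Knowledge Q&A	Closed	Desktop GUI / RPA	Official	Desktop PDF chat, PDF reasoning	GUI-first; often needs wrapper automation
UPDF Chat with PDF	Knowledge Q&A	Closed	Desktop GUI / RPA	Official	Desktop PDF chat, PDF reasoning	GUI-first; often needs wrapper automation
AskYourPDF	Knowledge Q&A	Closed	SaaS API	Official	PDF reasoning, team knowledge search	API-first; better for understanding than layout fidelity
Humata	Knowledge Q&A	Closed	SaaS API	Official	Team knowledge search, cross-document search	API-first; suited to team use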

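Deployment design

Think of a production PDF agent as a stack, not a single product

A PDF agent that holds up in production combines the agent itself, PDF skills, a wrapper layer, permission controls, and regression tests against sample documents.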

Blueprint A: Local-first open-source PDF agent baseline

Suited to: Privacy-sensitive teams that want control, lower vendor dependency, and are willing to operate their own stack

Recommended stack

Agent: Claude Code or OpenClaw, with Trae Agent OSS as a strong alternative
OCR: Tesseract + OCRmyPDF + PaddleOCR
Parsing: Docling / MinerU / GROBID / Nougat
Operations: PyMuPDF + pypdf + qpdf + pdfcpu
Tables: pdfplumber + Tabula / tabula-java

Deployment steps

Install PDF capabilities first as CLI tools and Python scripts instead of starting with GUI products.
Package those scripts as reusable skills for each agent family: `.claude/skills`, OpenClaw workspace skills, Trae Agent Skills or YAML, and Codex repo scripts plus AGENTS.md.
Prepare 5 to 10 sample documents per document type and run regression checks for OCR, tables, formulas, and reading order.
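
The regression step can be sketched as a golden-file comparison. The example assumes a directory of `<sample>.txt` files holding the expected text for each `<sample>.pdf`, plus an `extract` callable that wraps whichever OCR or parsing tool is under test. The directory layout, the callable, and the 0.95 threshold are all illustrative choices.

```python
import difflib
from pathlib import Path
from typing import Callable


def similarity(expected: str, actual: str) -> float:
    """Word-level similarity in [0, 1]; whitespace differences are ignored."""
    return difflib.SequenceMatcher(None, expected.split(), actual.split()).ratio()


def run_regression(
    golden_dir: Path,
    extract: Callable[[Path], str],
    threshold: float = 0.95,
) -> list[tuple[str, float]]:
    """Return (sample name, score) for every sample scoring below the threshold."""
    failures = []
    for golden in sorted(golden_dir.glob("*.txt")):
        pdf = golden.with_suffix(".pdf")
        score = similarity(golden.read_text(encoding="utf-8"), extract(pdf))
        if score < threshold:
            failures.append((pdf.name, round(score, 3)))
    return failures
```

Text similarity covers OCR quality and reading order reasonably well. Tables and formulas usually need structure-aware checks, such as comparing cell grids or LaTeX strings, layered on top of the same loop.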

Main risks

Self-hosted stacks cost more to maintain than SaaS layers.
Accuracy can drop on complex layouts and low-resource languages.
Permissions, logging, and regression governance remain your responsibility.

Blueprint B: Enterprise API-centered PDF agent platform

Suited to: Enterprises that already run cloud infrastructure and care about SLA, auditability, identity, and compliance

Recommended stack

Agent: Claude Code or Trae, with Codex covering the code and automation layer
OCR / extraction: Google Document AI / Azure Document Intelligence / Amazon Textract
Knowledge layer: OpenAI PDF Files + File Search or Claude PDF + citations
Business flow tools: Nanonets / Rossum / Parseur
Post-processing: qpdf / pypdf / PyMuPDF

Deployment steps

Wrap closed cloud services behind internal APIs or MCP wrappers instead of wiring every vendor directly into the agent.
Route contracts, invoices, research PDFs, and branded collateral through different queues rather than sharing a single prompt chain.
Put permissions and audit controls in the orchestration layer, not inside prompts.
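
A minimal sketch of the routing and permission steps follows, with the checks living in orchestration code and every decision written to an audit log. The queue names, agent identities, and document classes are hypothetical; a real deployment would back them with its identity provider and message broker.

```python
from dataclasses import dataclass, field

# Hypothetical queues: one per document class, never a shared prompt chain.
QUEUES = {
    "contract": "contracts-queue",
    "invoice": "invoices-queue",
    "research": "research-queue",
    "collateral": "collateral-queue",
}

# Hypothetical permissions: which agent identity may submit which class.
PERMISSIONS = {
    "legal-agent": {"contract"},
    "finance-agent": {"invoice"},
    "docs-agent": {"research", "collateral"},
}


@dataclass
class Orchestrator:
    audit_log: list = field(default_factory=list)

    def route(self, agent: str, doc_class: str, doc_id: str) -> str:
        """Return the queue for a document, recording allowed and denied attempts."""
        allowed = doc_class in QUEUES and doc_class in PERMISSIONS.get(agent, set())
        self.audit_log.append(
            {"agent": agent, "doc_class": doc_class, "doc_id": doc_id, "allowed": allowed}
        )
        if not allowed:
            raise PermissionError(f"{agent} may not submit {doc_class} documents")
        return QUEUES[doc_class]
```

Because the decision is made before any prompt is built, a denied request leaves an audit entry and never reaches a model or vendor API.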

Main risks

Vendor lock-in and cost growth remain real risks.
API output structures may drift after model or service upgrades.
Cross-border data flow and compliance boundaries must be reviewed in advance.

Blueprint C: Multilingual PDF delivery stack

Suited to: Teams handling papers, manuals, contracts, overseas sales assets, and multilingual branded materials

Recommended stack

Agent: Codex or Claude Code for orchestration, batching, review, and download flows
Delivery translation layer: Reflo
Terminology layer: DeepL Glossary or an internal termbase
Post-processing ecosystem: Adobe Acrobat / Adobe Translate PDF
Quality control: PyMuPDF / qpdf / pdfcpu

Deployment steps

Define termbases, language pairs, and document classes before letting the agent run batch orchestration.
Route high-value files through the Reflo / DeepL / Adobe combination and reserve lighter products for lower-risk content.
Keep a human side-by-side review step before any customer-facing delivery.
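
Before the human review step, an agent can run a cheap automated termbase check to flag translations that drop a required term. The sketch below assumes the termbase is a simple mapping from source term to required target term and uses plain case-insensitive substring matching. It does not call any translation vendor's API, and the function name is illustrative.

```python
def missing_terms(source: str, translation: str, termbase: dict[str, str]) -> list[str]:
    """Source terms present in the source text whose required target term is absent.

    Matching is case-insensitive substring matching, which is crude:
    inflected forms in the target language will be reported as missing.
    """
    src = source.casefold()
    tgt = translation.casefold()
    return [
        term
        for term, required in termbase.items()
        if term.casefold() in src and required.casefold() not in tgt
    ]


if __name__ == "__main__":
    termbase = {"purchase order": "Bestellung", "invoice": "Rechnung"}
    flagged = missing_terms(
        "Attach the invoice to the purchase order.",
        "Fügen Sie die Rechnung dem Auftrag bei.",
        termbase,
    )
    print(flagged)
```

Flagged segments can be placed at the top of the reviewer's side-by-side queue; the check narrows the review and does not replace it.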

Main risks

The closed translation layer costs more than a purely open-source stack.
Complex PDFs still require sampled human QA.
Errors in branded materials and contracts are expensive, so review gates remain mandatory.

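Methodology

Method and evidence criteria

Verified on: 2026-04-15
Source types: official product pages, official GitHub repos, help centers, developer documentation, deployment documentation
Scope: 4 agent platforms, 36 PDF skills / tools, 6 deployment forms, 3 implementation blueprints

Only official product pages, official GitHub repos, official help centers, and official developer documentation were accepted as primary evidence.
Deployability was evaluated by breaking it down into six forms: native skills, repo rules, CLI / libraries, MCP, SaaS API, and GUI / RPA.
Agent compatibility was judged not on marketing copy but on whether skills, commands, plugins, MCP, workspace files, CLI, and API are officially published.
For Codex, the existence of Skills and AGENTS.md is confirmed, but the public specification for native skills is less detailed than Claude Code's, so some items are treated as implementation guidance.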

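Sources

Official source list

To support E-E-A-T, citations prioritize official domains, official GitHub repos, official help centers, and official developer documentation. Inferences are explicitly marked as such.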

Codex

Claude Code

OpenClaw

Trae

Open-source PDF stack

Closed / cloud PDF stack

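FAQ

Frequently asked questions

Is Claude Code open source or closed?

As of 2026-04-15 there is an official GitHub repo and the CLI is public. The model and service layer, however, remain proprietary.

Can Codex deploy PDF skills directly, as Claude Code does?

Yes, but the most stable publicly documented pattern is the combination of AGENTS.md, repo scripts, and PDF CLI / API tools.

Does OpenClaw work well with GUI-based PDF tools?

Yes. It is particularly strong when browser or desktop integration is required, although such flows can be less stable than CLI / API.

Is Trae open source?

The Trae Agent GitHub repo is published under the MIT license, whereas Trae IDE / SOLO are primarily commercial product surfaces.

What should go in first when building a minimal PDF agent?

A safe starting point is to install OCRmyPDF, Docling or MinerU, PyMuPDF / pypdf, and qpdf first, then layer OpenAI / Claude / Reflo / DeepL on top as needed.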
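
A preflight check for part of that minimal stack might look like the sketch below. It only tests whether the `ocrmypdf` and `qpdf` executables are on PATH and whether the Python packages are importable. The import names are assumptions based on the usual package layout: `pymupdf` with `fitz` as the older alias for PyMuPDF, `pypdf`, and `docling`. MinerU is left out because its import name is not assumed here.

```python
import importlib.util
import shutil

# Executables expected on PATH.
CLI_TOOLS = ("ocrmypdf", "qpdf")

# Display name -> candidate import names (any one being importable counts).
PY_MODULES = {
    "PyMuPDF": ("pymupdf", "fitz"),
    "pypdf": ("pypdf",),
    "Docling": ("docling",),
}


def preflight() -> dict[str, bool]:
    """Map each tool name to True if it appears to be installed."""
    report = {tool: shutil.which(tool) is not None for tool in CLI_TOOLS}
    for label, candidates in PY_MODULES.items():
        report[label] = any(
            importlib.util.find_spec(name) is not None for name in candidates
        )
    return report


if __name__ == "__main__":
    for tool, present in preflight().items():
        print(f"{tool}: {'ok' if present else 'missing'}")
```

Running this at the start of an agent session gives the agent a factual list of what it can call, instead of letting it discover missing tools through failed commands.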

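Final recommendation

Choose the deployment form first, then the PDF skill, and the model last

In 2026 PDF agent design, being deployable as CLI / API / MCP, auditability, and permission control matter more than the model name itself. For multilingual PDF delivery, Reflo + DeepL / Adobe is a strong option; as a local OSS foundation, OCRmyPDF, Docling, MinerU, PyMuPDF, and qpdf are practical.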

