Ghostwriter
No mainstream tool offers the full range of features provided by Machai Ghostwriter out of the box.
― © OpenAI
Introduction
Ghostwriter is Machai’s guidance-driven, repository-scale documentation and transformation engine.
It scans your repository (source code, docs, project-site Markdown, build metadata, and other artifacts), extracts embedded @guidance: directives, and uses a configured GenAI provider to apply consistent improvements across many files in a repeatable way. This makes it practical to keep documentation, conventions, and refactors aligned across an entire project—especially when changes must be deterministic, reviewable, and CI-friendly.
Ghostwriter is built on Guided File Processing: guidance lives next to the content it controls, and the processor composes those local directives—plus any configured defaults—into a structured prompt per file. The result is automation that remains explicit and version-controlled inside the repository.
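To make the idea of guidance living next to the content concrete, here is a minimal sketch of how embedded directives could be collected from a file's text. It is not Ghostwriter's reviewer implementation: the class name `GuidanceExtractorSketch`, the rule that a directive runs to the end of its line, and the handling of comment closers are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: collects the text that follows each "@guidance:" marker. */
class GuidanceExtractorSketch {

    // Assumption: a directive runs from "@guidance:" to the end of its line.
    private static final Pattern DIRECTIVE = Pattern.compile("@guidance:\\s*(.*)");

    static List<String> extract(String fileContent) {
        List<String> directives = new ArrayList<>();
        Matcher matcher = DIRECTIVE.matcher(fileContent);
        while (matcher.find()) {
            String text = stripCommentCloser(matcher.group(1).trim());
            if (!text.isEmpty()) {
                directives.add(text);
            }
        }
        return directives;
    }

    // Assumption: drop a trailing "-->" or "*/" so that directives placed in
    // Markdown/HTML comments and in Java block comments yield clean text.
    static String stripCommentCloser(String text) {
        return text.replaceAll("\\s*(-->|\\*/)\\s*$", "").trim();
    }

    public static void main(String[] args) {
        String markdown = "<!-- @guidance: Keep this page in sync with the CLI options. -->\n# Title\n";
        System.out.println(extract(markdown));
    }
}
```

A real reviewer would likely be specific to each file type; the single regular expression here only illustrates the extraction step.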
Overview
At a high level, Ghostwriter runs as a CLI that:
- Resolves the project root and scan target (directory, glob:..., or regex:...).
- Traverses the project (Maven multi-module aware).
- For each supported file type, extracts @guidance: directives using pluggable reviewers.
- Composes an LLM request input that can include:
  - environment constraints (OS, project layout, etc.),
  - per-file guidance (or fallback default guidance),
  - optional global instruction blocks.
- Sends the request to the configured provider/model and applies the resulting updates.
The core value proposition is documentation and refactoring at repository scale, while keeping intent explicit via embedded guidance and preserving auditability through version control (and optional input logging).
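The composition step described above can be sketched as follows. This is an illustration under stated assumptions, not the actual request format: the class `PromptComposerSketch`, the section headings, the ordering of the parts, and the choice to skip a file that has neither embedded nor default guidance are all hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch of composing one request input per file. */
class PromptComposerSketch {

    static Optional<String> compose(String environment, List<String> fileGuidance,
                                    String defaultGuidance, String globalInstructions) {
        boolean hasLocal = !fileGuidance.isEmpty();
        boolean hasDefault = defaultGuidance != null && !defaultGuidance.trim().isEmpty();
        if (!hasLocal && !hasDefault) {
            // Assumption: with no guidance at all there is nothing to ask for.
            return Optional.empty();
        }
        StringBuilder prompt = new StringBuilder();
        prompt.append("# Environment\n").append(environment.trim()).append("\n\n");
        prompt.append("# Guidance\n");
        if (hasLocal) {
            // Embedded per-file guidance takes precedence over the fallback.
            for (String directive : fileGuidance) {
                prompt.append("- ").append(directive).append('\n');
            }
        } else {
            prompt.append(defaultGuidance.trim()).append('\n');
        }
        if (globalInstructions != null && !globalInstructions.trim().isEmpty()) {
            prompt.append("\n# Instructions\n").append(globalInstructions.trim()).append('\n');
        }
        return Optional.of(prompt.toString());
    }

    public static void main(String[] args) {
        Optional<String> prompt = compose("OS: Windows; layout: Maven",
                Arrays.asList("Keep this page in sync with the CLI options."),
                "Improve clarity without changing meaning.",
                "Write in English.");
        System.out.println(prompt.orElse("(skipped: no guidance)"));

        Optional<String> skipped = compose("OS: Windows", Collections.emptyList(), null, null);
        System.out.println(skipped.orElse("(skipped: no guidance)"));
    }
}
```

Keeping the composed text as a plain string is what makes optional input logging straightforward: the same value that is sent to the provider can be written to a per-file log.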
Machai Ghostwriter vs. Other Tools
The closest mainstream tool conceptually is Claude Code: it can operate across multiple files and can be used in automated workflows. Ghostwriter, however, is purpose-built for repeatable, guidance-driven batch processing as a CLI (and Maven-friendly artifact), rather than an interactive agent primarily optimized for ad-hoc developer sessions.
Key similarities
- Multi-file changes: both can apply edits across multiple files in a repository.
- Automation potential: both can be used in scripted or CI workflows (Ghostwriter directly; Claude Code via your integration).
- Repository context: both can use broader project context to produce coherent changes.
Key differences
- Guidance-first operation: Ghostwriter extracts embedded @guidance: directives from many file types (source, docs, site content, build files) and composes them into a prompt per file.
- Deterministic batch processing: Ghostwriter’s primary workflow is scanning a target (directory/glob:/regex:) and applying updates systematically, including Maven multi-module traversal.
- Extensibility via reviewers and tools: file-type handling is pluggable via reviewer implementations; function tools can be attached via a loader mechanism.
- Auditability: Ghostwriter can persist composed request inputs per file when input logging is enabled.
Brief comparison to other popular tools
- GitHub Copilot / Tabnine / Cursor: primarily IDE/editor copilots designed for interactive completion and chat; they do not center on repository-wide, @guidance:-driven enforcement across documentation and project-site content.
- Claude Code: closer to Ghostwriter in multi-file capabilities, but typically driven by interactive sessions rather than guidance embedded directly in the files being processed.
Summary table
| Tool | Project-wide automation | Custom guidance embedded in files | CI/CD integration | Documentation generation |
|---|---|---|---|---|
| Machai Ghostwriter | Yes | Yes (@guidance:) | Yes | Yes |
| Claude Code | Yes | Partial (prompting/conventions) | Possible | Possible |
| GitHub Copilot | Limited | No | Limited | Partial |
| Cursor | Limited | Partial (workspace rules) | Limited | Partial |
| Tabnine | Limited | No | Limited | Limited |
Machai Ghostwriter is unique because it makes version-controlled, per-file guidance the primary interface for reliable, repeatable repository-wide improvements.
Key Features
- Processes many project file types (not just Java), including documentation and project-site Markdown.
- Extracts embedded @guidance: directives via pluggable, file-type-aware reviewers.
- Supports scan targets as a directory, glob: matcher, or regex: matcher.
- Maven multi-module traversal (child modules first).
- Optional multi-threaded module processing (when the provider is thread-safe).
- Optional logging of composed LLM request inputs.
- Supports global instructions and default guidance loaded from plain text, URLs, or local files.
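The "child modules first" traversal listed above corresponds to a post-order walk of the module tree. The sketch below illustrates only that ordering; the `Module` type and the module names are hypothetical, and a real implementation would read the module list from each pom.xml.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of child-first (post-order) module traversal. */
class ModuleTraversalSketch {

    /** Minimal stand-in for a Maven module and its declared child modules. */
    static final class Module {
        final String name;
        final List<Module> children = new ArrayList<>();

        Module(String name) {
            this.name = name;
        }

        Module add(Module child) {
            children.add(child);
            return this;
        }
    }

    /** Returns module names in processing order: every child before its parent. */
    static List<String> childFirstOrder(Module root) {
        List<String> order = new ArrayList<>();
        visit(root, order);
        return order;
    }

    private static void visit(Module module, List<String> order) {
        for (Module child : module.children) {
            visit(child, order);
        }
        order.add(module.name);
    }

    public static void main(String[] args) {
        Module parent = new Module("parent")
                .add(new Module("core"))
                .add(new Module("cli"));
        System.out.println(childFirstOrder(parent));
    }
}
```

One plausible reason for this order is that a parent's documentation can then summarize modules that have already been updated.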
Getting Started
Prerequisites
- Java
  - Build target: Java 8 (from pom.xml: maven.compiler.release=8).
  - Runtime: depends on your selected GenAI provider/client; you can run with a newer JRE if required by the provider SDK while still building at the configured release level.
- GenAI provider access and credentials as required by your provider (for example via GW_HOME\gw.properties, environment variables, or provider-specific configuration).
- Network access to the provider endpoint (if applicable).
Download
Basic Usage
java -jar gw.jar <scanTarget> [options]
Example (scan a folder on Windows):
java -jar gw.jar src\main\java
Typical Workflow
- Add @guidance: directives to the files you want Ghostwriter to update (Markdown under src\site, Java sources, templates, etc.).
- Choose a scan target:
  - directory path (relative to the project), or
  - glob: matcher (example: glob:**/*.java), or
  - regex: matcher.
- Configure your GenAI provider/model and credentials.
- Optionally add global instructions and/or default guidance.
- Run Ghostwriter, then review and commit the results.
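The glob: and regex: prefixes used for scan targets are the same syntax prefixes that the standard Java `PathMatcher` API accepts. Whether Ghostwriter delegates to that API is an assumption here, as are the class and method names; the sketch only shows how the three kinds of scan target could select files.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

/** Hypothetical sketch: decides whether a project-relative path is selected by a scan target. */
class ScanTargetSketch {

    static boolean matches(String scanTarget, Path relativePath) {
        if (scanTarget.startsWith("glob:") || scanTarget.startsWith("regex:")) {
            // java.nio's PathMatcher understands both "glob:" and "regex:" syntaxes.
            PathMatcher matcher = FileSystems.getDefault().getPathMatcher(scanTarget);
            return matcher.matches(relativePath);
        }
        // Plain directory target: select files located under that directory.
        return relativePath.normalize().startsWith(Paths.get(scanTarget).normalize());
    }

    public static void main(String[] args) {
        Path source = Paths.get("src", "main", "java", "App.java");
        System.out.println(matches("glob:**/*.java", source));
        System.out.println(matches("regex:.*\\.md", source));
        System.out.println(matches("src", source));
    }
}
```

Note that with this API the pattern glob:**/*.java requires at least one directory separator, so it would not match a Java file sitting directly in the project root.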
Configuration
Command-Line Options
The CLI options are defined in org.machanism.machai.gw.processor.Ghostwriter:
- -h, --help: Show help and exit.
- -t, --threads[=<true|false>]: Enable multi-threaded processing. Specifying the option without a value enables it.
- -a, --genai <provider:model>: Set the GenAI provider and model (example: OpenAI:gpt-5.1).
- -i, --instructions[=<text|url|file:...>]: Provide global system instructions. When used without a value, you are prompted to enter multi-line text via stdin.
- -g, --guidance[=<text|url|file:...>]: Provide default guidance (fallback). When used without a value, you are prompted to enter multi-line text via stdin.
- -e, --excludes <csv>: Comma-separated list of directories to exclude from processing.
- -l, --logInputs: Log composed LLM request inputs to dedicated log files.
Notes on --instructions and --guidance values:
- blank lines are preserved,
- lines beginning with http:// or https:// are fetched and included,
- lines beginning with file: are read from the referenced file and included,
- other lines are included as-is.
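The line-by-line rules above can be sketched as a small expansion routine. This is an illustration, not the tool's code: the class name, the pluggable `urlFetcher` (used so the sketch runs without network access), the UTF-8 encoding, and the treatment of everything after file: as a file-system path are assumptions.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch of expanding an --instructions or --guidance value. */
class IncludeExpansionSketch {

    static String expand(String value, Function<String, String> urlFetcher) {
        List<String> parts = new ArrayList<>();
        // The -1 limit keeps trailing empty strings, so blank lines are preserved.
        for (String line : value.split("\\R", -1)) {
            String trimmed = line.trim();
            if (trimmed.startsWith("http://") || trimmed.startsWith("https://")) {
                parts.add(urlFetcher.apply(trimmed));
            } else if (trimmed.startsWith("file:")) {
                // Assumption: the remainder of the line is a path, as in file:default-guidance.txt.
                parts.add(readFile(trimmed.substring("file:".length())));
            } else {
                parts.add(line);
            }
        }
        return String.join("\n", parts);
    }

    private static String readFile(String path) {
        try {
            return new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A stub fetcher stands in for a real HTTP client.
        String expanded = expand("Write in English.\n\nhttps://example.org/style.txt",
                url -> "(content of " + url + ")");
        System.out.println(expanded);
    }
}
```

Because each line is resolved independently, a single value can mix inline text with shared instruction files and remote style guides.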
Options Table
| Option | Argument | Description | Default |
|---|---|---|---|
| -h, --help | none | Show help message and exit. | n/a |
| -t, --threads | true/false (optional) | Enable multi-threaded module processing; if used without a value, it enables it. | From config key threads (default false). |
| -a, --genai | provider:model | GenAI provider/model identifier. | From config key genai; otherwise OpenAI:gpt-5-mini. |
| -i, --instructions | text/url/file (optional) | Global instructions appended to every prompt; supports http(s)://... and file:...; prompts via stdin if no value. | From config key instructions; otherwise none. |
| -g, --guidance | text/url/file (optional) | Fallback guidance used when files have no embedded @guidance:; supports http(s)://... and file:...; prompts via stdin if no value. | From config key guidance; otherwise none. |
| -e, --excludes | csv | Comma-separated exclude list (also configurable via config key excludes). | From config key excludes; otherwise none. |
| -l, --logInputs | none | Log composed LLM inputs to per-file log files. | From config key logInputs (default false). |
Example
The built-in help text documents supported scan targets:
- raw directory names,
- glob: patterns (for example glob:**/*.java),
- regex: patterns.
Example (Windows): scan Java sources via glob, enable threads, set provider/model, add instructions and default guidance, exclude common folders, and log inputs:
java -jar gw.jar "glob:**/*.java" -t -a OpenAI:gpt-5.1 -i file:project-instructions.txt -g file:default-guidance.txt -e target,.git -l
Default Guidance
defaultGuidance is a fallback instruction block used when a file does not contain embedded @guidance: directives.
It can be set via:
- CLI: -g/--guidance (plain text, http(s)://..., or file:...; supports stdin when provided without a value), or
- API: FileProcessor#setDefaultGuidance(String).
The value is treated as plain text, but each line may also act as an include directive:
- blank lines are preserved,
- lines beginning with http:// or https:// are fetched and included,
- lines beginning with file: are read from the referenced file and included,
- other lines are included as-is.
This allows Ghostwriter to still process supported files meaningfully even when no per-file guidance is present.
Resources
- Official platform: https://machai.machanism.org/ghostwriter/
- GitHub (SCM): https://github.com/machanism-org/machai
- Maven Central: https://central.sonatype.com/artifact/org.machanism.machai/ghostwriter

