Ghostwriter

No mainstream tool offers the full range of features provided by Machai Ghostwriter out of the box.

© OpenAI

Introduction

Ghostwriter is Machai’s guidance-driven, repository-scale documentation and transformation engine.

It scans your repository (source code, docs, project-site Markdown, build metadata, and other artifacts), extracts embedded @guidance: directives, and uses a configured GenAI provider to apply consistent improvements across many files in a repeatable way. This makes it practical to keep documentation, conventions, and refactors aligned across an entire project—especially when changes must be deterministic, reviewable, and CI-friendly.

Ghostwriter is built on Guided File Processing: guidance lives next to the content it controls, and the processor composes those local directives—plus any configured defaults—into a structured prompt per file. The result is automation that remains explicit and version-controlled inside the repository.
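
To make the idea of guidance living next to content concrete, here is a minimal sketch of how directives might be pulled out of a file. It is not Ghostwriter's reviewer implementation: the class name `GuidanceExtractorSketch`, the single-line scope of a directive, and the comment styles it strips are all assumptions made for illustration. Only the `@guidance:` tag itself comes from the text above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: collects the text that follows each "@guidance:" tag. */
final class GuidanceExtractorSketch {

    // Captures the rest of the line after the tag. Treating a directive as
    // single-line is an assumption of this sketch, not documented behavior.
    private static final Pattern GUIDANCE = Pattern.compile("@guidance:\\s*(.*)");

    static List<String> extract(String fileContent) {
        List<String> directives = new ArrayList<>();
        Matcher m = GUIDANCE.matcher(fileContent);
        while (m.find()) {
            String text = m.group(1).trim();
            // Drop a trailing comment terminator such as "-->" or "*/".
            text = text.replaceAll("\\s*(-->|\\*/)$", "").trim();
            if (!text.isEmpty()) {
                directives.add(text);
            }
        }
        return directives;
    }
}
```

Real reviewers are described as file-type aware, so an actual implementation would likely parse each format's comment syntax rather than rely on one regular expression.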

Overview

At a high level, Ghostwriter runs as a CLI that:

  1. Resolves the project root and scan target (directory, glob:..., or regex:...).
  2. Traverses the project (Maven multi-module aware).
  3. For each supported file type, extracts @guidance: directives using pluggable reviewers.
  4. Composes an LLM request input that can include:
    • environment constraints (OS, project layout, etc.),
    • per-file guidance (or fallback default guidance),
    • optional global instruction blocks.
  5. Sends the request to the configured provider/model and applies the resulting updates.
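
Step 1 can be illustrated with the standard `java.nio.file` API, which already understands the `glob:` and `regex:` prefixes. The sketch below is one plausible way to turn a scan target into a path matcher; the class `ScanTargetSketch` and the rule that any other value is treated as a directory prefix are assumptions, not a description of Ghostwriter's internals.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

/** Hypothetical sketch: maps a scan target string to a PathMatcher. */
final class ScanTargetSketch {

    /** Returns a matcher intended for project-relative paths. */
    static PathMatcher matcherFor(String scanTarget) {
        if (scanTarget.startsWith("glob:") || scanTarget.startsWith("regex:")) {
            // FileSystem.getPathMatcher accepts "syntax:pattern" for both syntaxes.
            return FileSystems.getDefault().getPathMatcher(scanTarget);
        }
        // Assumption: anything else is a directory, so match every path beneath it.
        Path dir = Paths.get(scanTarget).normalize();
        return path -> path.normalize().startsWith(dir);
    }
}
```

With this sketch, `matcherFor("glob:**/*.java")` accepts `src/main/java/App.java` and rejects `docs/index.md`. Note that in Java's glob syntax `**/*.java` requires at least one directory separator, so a file in the project root would not match.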

The core value proposition is documentation and refactoring at repository scale, while keeping intent explicit via embedded guidance and preserving auditability through version control (and optional input logging).

Machai Ghostwriter vs. Other Tools

The closest mainstream tool conceptually is Claude Code: it can operate across multiple files and can be used in automated workflows. Ghostwriter, however, is purpose-built for repeatable, guidance-driven batch processing as a CLI (and Maven-friendly artifact), rather than an interactive agent primarily optimized for ad-hoc developer sessions.

Key similarities

  • Multi-file changes: both can apply edits across multiple files in a repository.
  • Automation potential: both can be used in scripted or CI workflows (Ghostwriter directly; Claude Code via your integration).
  • Repository context: both can use broader project context to produce coherent changes.

Key differences

  • Guidance-first operation: Ghostwriter extracts embedded @guidance: directives from many file types (source, docs, site content, build files) and composes them into a prompt per file.
  • Deterministic batch processing: Ghostwriter’s primary workflow is scanning a target (directory/glob:/regex:) and applying updates systematically, including Maven multi-module traversal.
  • Extensibility via reviewers and tools: file-type handling is pluggable via reviewer implementations; function tools can be attached via a loader mechanism.
  • Auditability: Ghostwriter can persist composed request inputs per file when input logging is enabled.

Brief comparison to other popular tools

  • GitHub Copilot / Tabnine / Cursor: primarily IDE/editor copilots designed for interactive completion and chat; they do not center on repository-wide, @guidance:-driven enforcement across documentation and project-site content.
  • Claude Code: closer to Ghostwriter in multi-file capabilities, but typically driven by interactive sessions rather than guidance embedded directly in the files being processed.

Summary table

| Tool | Project-wide automation | Custom guidance embedded in files | CI/CD integration | Documentation generation |
| --- | --- | --- | --- | --- |
| Machai Ghostwriter | Yes | Yes (@guidance:) | Yes | Yes |
| Claude Code | Yes | Partial (prompting/conventions) | Possible | Possible |
| GitHub Copilot | Limited | No | Limited | Partial |
| Cursor | Limited | Partial (workspace rules) | Limited | Partial |
| Tabnine | Limited | No | Limited | Limited |

What sets Machai Ghostwriter apart from the tools above is that version-controlled, per-file guidance is its primary interface for repeatable, repository-wide improvements.

Key Features

  • Processes many project file types (not just Java), including documentation and project-site Markdown.
  • Extracts embedded @guidance: directives via pluggable, file-type-aware reviewers.
  • Supports scan targets as a directory, glob: matcher, or regex: matcher.
  • Maven multi-module traversal (child modules first).
  • Optional multi-threaded module processing (when the provider is thread-safe).
  • Optional logging of composed LLM request inputs.
  • Supports global instructions and default guidance loaded from plain text, URLs, or local files.
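
The "child modules first" ordering in the list above corresponds to a post-order walk of the module tree. The following sketch shows that ordering only; `ModuleOrderSketch` and its nested `Module` type are hypothetical stand-ins, and how Ghostwriter actually reads modules from a POM is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of child-first (post-order) module traversal. */
final class ModuleOrderSketch {

    /** Illustrative stand-in for a Maven module and its child modules. */
    static final class Module {
        final String name;
        final List<Module> children;

        Module(String name, Module... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }
    }

    /** Returns module names in the order they would be processed. */
    static List<String> processingOrder(Module root) {
        List<String> order = new ArrayList<>();
        visit(root, order);
        return order;
    }

    private static void visit(Module module, List<String> order) {
        // Recurse into every child before recording the module itself.
        for (Module child : module.children) {
            visit(child, order);
        }
        order.add(module.name);
    }
}
```

For a parent with modules `core` and `cli`, where `cli` has its own child `cli-docs`, this yields `core`, `cli-docs`, `cli`, `parent`.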

Getting Started

Prerequisites

  • Java
    • Build target: Java 8 (from pom.xml: maven.compiler.release=8).
    • Runtime: depends on your selected GenAI provider/client; you can run with a newer JRE if required by the provider SDK while still building at the configured release level.
  • GenAI provider access and credentials as required by your provider (for example via GW_HOME\gw.properties, environment variables, or provider-specific configuration).
  • Network access to the provider endpoint (if applicable).

Download

Basic Usage

java -jar gw.jar <scanTarget> [options]

Example (scan a folder on Windows):

java -jar gw.jar src\main\java

Typical Workflow

  1. Add @guidance: directives to the files you want Ghostwriter to update (Markdown under src\site, Java sources, templates, etc.).
  2. Choose a scan target:
    • directory path (relative to the project), or
    • glob: matcher (example: glob:**/*.java), or
    • regex: matcher.
  3. Configure your GenAI provider/model and credentials.
  4. Optionally add global instructions and/or default guidance.
  5. Run Ghostwriter, then review and commit the results.

Configuration

Command-Line Options

The CLI options are defined in org.machanism.machai.gw.processor.Ghostwriter:

  • -h, --help — Show help and exit.
  • -t, --threads[=<true|false>] — Enable multi-threaded processing. If specified without a value, it enables multi-threading.
  • -a, --genai <provider:model> — Set the GenAI provider and model (example: OpenAI:gpt-5.1).
  • -i, --instructions[=<text|url|file:...>] — Provide global system instructions. When used without a value, you are prompted to enter multi-line text via stdin.
  • -g, --guidance[=<text|url|file:...>] — Provide default guidance (fallback). When used without a value, you are prompted to enter multi-line text via stdin.
  • -e, --excludes <csv> — Comma-separated list of directories to exclude from processing.
  • -l, --logInputs — Log composed LLM request inputs to dedicated log files.

Notes on --instructions and --guidance values:

  • blank lines are preserved,
  • lines beginning with http:// or https:// are fetched and included,
  • lines beginning with file: are read from the referenced file and included,
  • other lines are included as-is.
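
The four rules above amount to a line-by-line expansion. Here is a rough sketch of that logic under stated assumptions: the class `IncludeExpanderSketch`, the `urlFetcher` hook (used so the example stays offline), UTF-8 file decoding, and the newline appended after each expanded line are all illustrative choices rather than documented behavior.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.function.Function;

/** Hypothetical sketch of expanding an --instructions or --guidance value. */
final class IncludeExpanderSketch {

    /** urlFetcher is an assumed hook: the caller decides how URLs are downloaded. */
    static String expand(String value, Function<String, String> urlFetcher) {
        StringBuilder out = new StringBuilder();
        // The -1 limit keeps trailing empty strings, so blank lines survive the split.
        for (String line : value.split("\\r?\\n", -1)) {
            String trimmed = line.trim();
            if (trimmed.startsWith("http://") || trimmed.startsWith("https://")) {
                out.append(urlFetcher.apply(trimmed));
            } else if (trimmed.startsWith("file:")) {
                out.append(readFile(trimmed.substring("file:".length())));
            } else {
                out.append(line); // plain text and blank lines pass through unchanged
            }
            out.append('\n');
        }
        return out.toString();
    }

    private static String readFile(String path) {
        try {
            return new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Passing the fetcher as a function keeps the expansion logic testable without network access; a real caller could supply one backed by `java.net.URL#openStream()`.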

Options Table

| Option | Argument | Description | Default |
| --- | --- | --- | --- |
| -h, --help | none | Show help message and exit. | n/a |
| -t, --threads | true/false (optional) | Enable multi-threaded module processing; if used without a value, it enables it. | From config key threads (default false). |
| -a, --genai | provider:model | GenAI provider/model identifier. | From config key genai; otherwise OpenAI:gpt-5-mini. |
| -i, --instructions | text/url/file (optional) | Global instructions appended to every prompt; supports http(s)://... and file:...; prompts via stdin if no value. | From config key instructions; otherwise none. |
| -g, --guidance | text/url/file (optional) | Fallback guidance used when files have no embedded @guidance:; supports http(s)://... and file:...; prompts via stdin if no value. | From config key guidance; otherwise none. |
| -e, --excludes | csv | Comma-separated exclude list (also configurable via config key excludes). | From config key excludes; otherwise none. |
| -l, --logInputs | none | Log composed LLM inputs to per-file log files. | From config key logInputs (default false). |
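
The Default column describes a three-level lookup: an explicit CLI value, then a config key, then a built-in fallback. The sketch below shows that precedence with `java.util.Properties`. The class `OptionDefaultsSketch` and its method names are hypothetical; the key names and fallback values are the ones listed in the table.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Properties;
import java.util.stream.Collectors;

/** Hypothetical sketch of resolving option values against configuration. */
final class OptionDefaultsSketch {

    /** CLI value wins; otherwise the config key; otherwise the given fallback. */
    static String resolve(String cliValue, Properties config, String key, String fallback) {
        if (cliValue != null) {
            return cliValue;
        }
        return config.getProperty(key, fallback);
    }

    /** Splits the comma-separated excludes value into trimmed directory names. */
    static List<String> excludes(String csv) {
        if (csv == null || csv.trim().isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.stream(csv.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }
}
```

For example, `resolve(null, config, "genai", "OpenAI:gpt-5-mini")` returns the configured `genai` value when the key is present and `OpenAI:gpt-5-mini` otherwise.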

Example

The built-in help text documents supported scan targets:

  • raw directory names,
  • glob: patterns (for example glob:**/*.java),
  • regex: patterns.

Example (Windows): scan Java sources via glob, enable threads, set provider/model, add instructions and default guidance, exclude common folders, and log inputs:

java -jar gw.jar "glob:**/*.java" -t -a OpenAI:gpt-5.1 -i file:project-instructions.txt -g file:default-guidance.txt -e target,.git -l

Default Guidance

defaultGuidance is a fallback instruction block used when a file does not contain embedded @guidance: directives.

It can be set via:

  • CLI: -g / --guidance (plain text, http(s)://..., or file:...; supports stdin when provided without a value), or
  • API: FileProcessor#setDefaultGuidance(String).

The value is treated as plain text, but each line may also act as an include directive:

  • blank lines are preserved,
  • lines beginning with http:// or https:// are fetched and included,
  • lines beginning with file: are read from the referenced file and included,
  • other lines are included as-is.

This allows Ghostwriter to still process supported files meaningfully even when no per-file guidance is present.
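
The fallback rule can be sketched as a small selection function. This is an illustration only: `GuidanceFallbackSketch` is a hypothetical name, joining multiple embedded directives with newlines is an assumption, and what Ghostwriter does when neither source provides guidance is not stated in this document, so the sketch simply returns an empty result in that case.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch: chooses between embedded and default guidance. */
final class GuidanceFallbackSketch {

    static Optional<String> effectiveGuidance(List<String> embedded, String defaultGuidance) {
        // Embedded @guidance: directives take priority when any were found.
        if (!embedded.isEmpty()) {
            return Optional.of(String.join("\n", embedded));
        }
        // Otherwise fall back to the default guidance, if one is configured.
        if (defaultGuidance != null && !defaultGuidance.trim().isEmpty()) {
            return Optional.of(defaultGuidance);
        }
        return Optional.empty();
    }
}
```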

Resources