第3章 CLI 启动与性能优化

"The cheapest, fastest, and most reliable components are those that aren't there." -- Gordon Bell

本章要点

三阶段启动模型 -- Claude Code 将启动过程拆分为副作用阶段、导入阶段和 CLI 分发阶段，每个阶段都有针对性的优化策略
并行预加载模式 -- 利用 JavaScript 模块求值的同步特性，在 import 语句执行期间并行运行子进程，实现"零成本"预加载
编译期特性标志 -- 通过 Bun bundler 的 feature() 机制实现死代码消除，使外部构建产物不包含内部功能的任何字节
快速路径设计 -- 对 --version、--dump-system-prompt、daemon worker 等场景设计零依赖或最小依赖的快速退出路径
端到端性能度量 -- 基于 profileCheckpoint 的全链路性能分析体系，支持采样上报和详细本地报告两种模式

引言：毫秒之战

一个 CLI 工具的启动时间，直接决定了用户的第一印象。当开发者在终端输入 claude 并按下回车，到看到交互式提示符，中间经历的每一毫秒都在消耗用户的耐心。对于 Claude Code 这样一个需要加载大量模块、读取多处配置、建立网络连接的复杂工程来说，启动性能优化是一场精密的毫秒之战。

Claude Code 的启动优化并非事后补救，而是从架构层面就融入了设计。它的核心思想可以概括为：让一切可以并行的操作并行执行，让一切不必要的代码不出现在产物中，让一切可以延迟的初始化推迟到真正需要时。

本章将沿着一次完整的 CLI 启动流程，从用户敲下 claude 命令开始，逐步剖析每个阶段的设计决策和优化手段。

3.1 启动的三个阶段

Claude Code 的启动过程被精心划分为三个阶段，每个阶段有不同的目标和约束。理解这三个阶段，是理解后续所有优化策略的基础。

以下流程图展示了 CLI 启动的三阶段模型及其内部并行关系：

3.1.1 入口分层架构

在深入三个阶段之前，有必要先了解 Claude Code 的入口分层架构。CLI 的执行并不是从一个单一的巨型文件开始的，而是经过两层分发：

cli.tsx (薄入口层)
  |-- 快速路径: --version, --dump-system-prompt, --daemon-worker, bridge, daemon ...
  |-- 主路径: 加载 main.tsx
        |-- 副作用阶段 (Side-Effect Phase)
        |-- 导入阶段 (Import Phase)
        |-- CLI 分发阶段 (CLI Dispatch Phase)

cli.tsx 是最外层的入口（位于 src/entrypoints/cli.tsx），它的设计哲学是：尽可能少地加载模块，尽可能快地判断是否可以短路返回。只有当请求确实需要完整的交互式 CLI 时，才会加载体量庞大的 main.tsx。

3.1.2 副作用阶段（Side-Effect Phase）

副作用阶段是 main.tsx 文件的前 20 行，也是整个启动过程中最精妙的部分。我们直接看源码：

typescript

// 文件: src/main.tsx (第1-20行)

// These side-effects must run before all other imports:
// 1. profileCheckpoint marks entry before heavy module evaluation begins
// 2. startMdmRawRead fires MDM subprocesses (plutil/reg query) so they run in
//    parallel with the remaining ~135ms of imports below
// 3. startKeychainPrefetch fires both macOS keychain reads (OAuth + legacy API
//    key) in parallel — isRemoteManagedSettingsEligible() otherwise reads them
//    sequentially via sync spawn inside applySafeConfigEnvironmentVariables()
//    (~65ms on every macOS startup)
import { profileCheckpoint, profileReport } from './utils/startupProfiler.js';

// eslint-disable-next-line custom-rules/no-top-level-side-effects
profileCheckpoint('main_tsx_entry');
import { startMdmRawRead } from './utils/settings/mdm/rawRead.js';

// eslint-disable-next-line custom-rules/no-top-level-side-effects
startMdmRawRead();
import { ensureKeychainPrefetchCompleted, startKeychainPrefetch }
  from './utils/secureStorage/keychainPrefetch.js';

// eslint-disable-next-line custom-rules/no-top-level-side-effects
startKeychainPrefetch();

这段代码的结构乍看古怪 -- 为什么函数调用和 import 语句交替出现？这正是它的精髓所在。

在 JavaScript 中，import 语句是同步的、顺序求值的。当运行时遇到 import { startMdmRawRead } from './utils/settings/mdm/rawRead.js' 时，它会立即加载并执行 rawRead.js 模块的全部代码，然后才继续执行下一行。利用这个特性，Claude Code 在每个 import 之后立即调用刚导入的函数，让异步子进程在后台启动，然后继续加载下一批模块。

这三个操作按精确的顺序排列：

profileCheckpoint('main_tsx_entry') -- 记录一个时间戳锚点，后续所有计时都以此为参照
startMdmRawRead() -- 立即启动 MDM（Mobile Device Management）配置读取子进程
startKeychainPrefetch() -- 立即启动 macOS Keychain 读取子进程

之所以它们必须在"所有其他 import 之前"执行（代码注释中的 "must run before all other imports"），是因为后续的 import 语句链（第 22-206 行）将触发大量模块求值，耗时约 135ms。这 135ms 就是一个天然的"空闲窗口" -- 子进程可以在这个窗口中完成它们的 I/O 操作，实现真正的零成本预加载。

3.1.3 导入阶段（Import Phase）

副作用阶段结束后，紧接着是一长串的 import 语句，从第 21 行一直延伸到第 206 行：

typescript

// 文件: src/main.tsx (第21-209行，节选)

import { feature } from 'bun:bundle';
import { Command as CommanderCommand, InvalidArgumentError, Option }
  from '@commander-js/extra-typings';
import chalk from 'chalk';
import { readFileSync } from 'fs';
import mapValues from 'lodash-es/mapValues.js';
// ... (约185行 import 语句)
import { shouldEnableThinkingByDefault, type ThinkingConfig }
  from './utils/thinking.js';
import { initUser, resetUserCache } from './utils/user.js';

// eslint-disable-next-line custom-rules/no-top-level-side-effects
profileCheckpoint('main_tsx_imports_loaded');

注意最后一行 profileCheckpoint('main_tsx_imports_loaded') -- 这标记了导入阶段的结束。结合前面的 main_tsx_entry 检查点，我们就能精确测量出所有 import 语句的总耗时。

在启动性能分析器（startupProfiler.ts）中，这个阶段被定义为：

typescript

// 文件: src/utils/startupProfiler.ts (第49-54行)

const PHASE_DEFINITIONS = {
  import_time: ['cli_entry', 'main_tsx_imports_loaded'],
  init_time: ['init_function_start', 'init_function_end'],
  settings_time: ['eagerLoadSettings_start', 'eagerLoadSettings_end'],
  total_time: ['cli_entry', 'main_after_run'],
} as const;

这些 import 语句加载了 Claude Code 运行所需的核心基础设施：Commander.js 命令行框架、React/Ink 终端 UI 框架、各种工具模块、权限系统、分析服务等。每条 import 都触发对应模块文件的同步求值，包括该模块自身的所有 import 依赖 -- 这就是为什么总耗时会达到约 135ms。

在导入阶段，代码还利用了 feature() 函数做条件加载（详见 3.3 节）：

typescript

// 文件: src/main.tsx (第74-81行)

// Dead code elimination: conditional import for COORDINATOR_MODE
const coordinatorModeModule = feature('COORDINATOR_MODE')
  ? require('./coordinator/coordinatorMode.js') : null;

// Dead code elimination: conditional import for KAIROS (assistant mode)
const assistantModule = feature('KAIROS')
  ? require('./assistant/index.js') : null;
const kairosGate = feature('KAIROS')
  ? require('./assistant/gate.js') : null;

这些条件 require() 在外部构建中会被编译器完全消除，从而避免加载内部功能模块的开销。

3.1.4 CLI 分发阶段（CLI Dispatch Phase）

导入完成后，main.tsx 导出的 main() 函数开始执行，进入 CLI 分发阶段。这个阶段的核心任务是：解析命令行参数、路由到正确的命令处理器、执行初始化。

typescript

// 文件: src/main.tsx (第585-606行，节选)

export async function main() {
  profileCheckpoint('main_function_start');

  // SECURITY: Prevent Windows from executing commands from current directory
  process.env.NoDefaultCurrentDirectoryInExePath = '1';

  // Initialize warning handler early to catch warnings
  initializeWarningHandler();
  process.on('exit', () => { resetCursor(); });
  process.on('SIGINT', () => {
    if (process.argv.includes('-p') || process.argv.includes('--print')) {
      return;
    }
    process.exit(0);
  });
  profileCheckpoint('main_warning_handler_initialized');
  // ...
}

main() 函数首先做几件紧急的事情：设置 Windows 安全环境变量、安装警告处理器、注册进程信号处理器。然后进入一系列的早期参数检测，最后通过 Commander.js 进行正式的命令路由。

Commander.js 的核心配置在 run() 函数中（第 884 行开始）。特别值得关注的是 preAction 钩子的设计：

typescript

// 文件: src/main.tsx (第905-967行，节选)

// Use preAction hook to run initialization only when executing a command,
// not when displaying help. This avoids the need for env variable signaling.
program.hook('preAction', async thisCommand => {
  profileCheckpoint('preAction_start');
  // Await async subprocess loads started at module evaluation (lines 12-20).
  // Nearly free — subprocesses complete during the ~135ms of imports above.
  await Promise.all([
    ensureMdmSettingsLoaded(),
    ensureKeychainPrefetchCompleted()
  ]);
  profileCheckpoint('preAction_after_mdm');
  await init();
  profileCheckpoint('preAction_after_init');
  // ...
  runMigrations();
  profileCheckpoint('preAction_after_migrations');

  void loadRemoteManagedSettings();
  void loadPolicyLimits();
  profileCheckpoint('preAction_after_remote_settings');
});

这里的 preAction 钩子是一个关键的设计决策。它确保初始化代码只在真正执行命令时运行，而不在显示帮助文本（--help）时运行。代码注释写得很清楚："This avoids the need for env variable signaling" -- 在 Commander.js 之前，很多 CLI 工具需要通过环境变量来判断是否应该跳过初始化。

preAction 钩子做的第一件事就是 await Promise.all([ensureMdmSettingsLoaded(), ensureKeychainPrefetchCompleted()]) -- 等待那些在副作用阶段启动的异步子进程完成。由于子进程在 import 期间已经并行运行了约 135ms，这个 await 通常是近乎即时完成的（代码注释："Nearly free -- subprocesses complete during the ~135ms of imports above"）。

下面用一张架构图来展示三个阶段的全貌：

时间线 ──────────────────────────────────────────────────────────────────────>

cli.tsx 入口
  |
  |  profileCheckpoint('cli_entry')
  |
  v
[快速路径检测] ──(--version)──> 直接输出版本号, return (零模块加载)
  |              ──(--daemon-worker)──> 加载 workerRegistry.js, return
  |              ──(bridge)──> 加载 bridgeMain.js, return
  |              ──(daemon)──> 加载 daemon/main.js, return
  |
  v
加载 main.tsx
  |
  |=== 副作用阶段 (Side-Effect Phase) ~2ms ===|
  | profileCheckpoint('main_tsx_entry')        |
  | startMdmRawRead()  ──> [plutil 子进程]  ───────────────┐ (并行运行)
  | startKeychainPrefetch() ──> [security 子进程 x2] ──────┤
  |                                                         |
  |=== 导入阶段 (Import Phase) ~135ms =========|            |
  | import { feature } from 'bun:bundle'       |            |
  | import Commander, chalk, React ...          |            |
  | import 业务模块 (tools, services, utils)    |            |
  | feature('COORDINATOR_MODE') ? require : null|            |
  | profileCheckpoint('main_tsx_imports_loaded')|            |
  |                                             |            |
  |=== CLI 分发阶段 (CLI Dispatch Phase) =======|            |
  | main() -> main_function_start              |            |
  |   initializeWarningHandler()               |            |
  |   检测 -p/--print, 确定交互模式            |            |
  |   eagerLoadSettings()                      |            |
  |   run() -> Commander 路由                  |            |
  |     preAction hook:                        |            |
  |       await Promise.all([MDM, Keychain]) <─────────────┘ (收割结果)
  |       await init()                          |
  |       runMigrations()                       |
  |       loadRemoteManagedSettings()           |
  |     action handler:                         |
  |       setup() -> showSetupScreens()         |
  |       renderAndRun() -> 交互式提示符        |
  |                                             |
  | profileCheckpoint('main_after_run')         |
  +=============================================+

3.2 并行预加载模式

并行预加载是 Claude Code 启动优化中最具创造性的设计之一。它利用了 JavaScript 模块系统的一个根本特性来获得"免费"的并行计算时间。

3.2.1 核心洞察：模块求值是同步的

JavaScript（无论是 Node.js 还是 Bun）的 import 语句在求值时是同步阻塞的。当引擎遇到一条 import 语句，它会：

解析模块路径
加载模块文件
同步执行模块中的所有顶层代码（包括该模块的 import 依赖）
返回导出的绑定

这意味着在 main.tsx 的 import 链执行期间（约 135ms），JavaScript 引擎的主线程完全被占用，无法执行任何用户代码。但操作系统的进程调度器不受此限制 -- 我们启动的子进程可以在另一个 CPU 核心上运行。

这就是 Claude Code 的核心洞察：在同步 import 开始之前启动异步子进程，让它们与 import 并行执行，import 结束后再收割结果。

3.2.2 MDM 配置读取

MDM（Mobile Device Management）是企业设备管理的标准协议。在 macOS 上，管理员通过配置描述文件（Profile）下发策略，这些策略存储在 plist 文件中。Claude Code 需要读取这些策略来执行企业级的安全和功能限制。

typescript

// 文件: src/utils/settings/mdm/rawRead.ts (第55-88行)

/**
 * Fire fresh subprocess reads for MDM settings and return raw stdout.
 * On macOS: spawns plutil for each plist path in parallel, picks first winner.
 * On Windows: spawns reg query for HKLM and HKCU in parallel.
 * On Linux: returns empty (no MDM equivalent).
 */
export function fireRawRead(): Promise<RawReadResult> {
  return (async (): Promise<RawReadResult> => {
    if (process.platform === 'darwin') {
      const plistPaths = getMacOSPlistPaths();

      const allResults = await Promise.all(
        plistPaths.map(async ({ path, label }) => {
          // Fast-path: skip the plutil subprocess if the plist file does not
          // exist. Spawning plutil takes ~5ms even for an immediate ENOENT,
          // and non-MDM machines never have these files.
          if (!existsSync(path)) {
            return { stdout: '', label, ok: false };
          }
          const { stdout, code } = await execFilePromise(PLUTIL_PATH, [
            ...PLUTIL_ARGS_PREFIX, path,
          ]);
          return { stdout, label, ok: code === 0 && !!stdout };
        }),
      );

      // First source wins (array is in priority order)
      const winner = allResults.find(r => r.ok);
      return {
        plistStdouts: winner
          ? [{ stdout: winner.stdout, label: winner.label }]
          : [],
        hklmStdout: null, hkcuStdout: null,
      };
    }
    // Windows: reg query ...
    // Linux: no-op
  })();
}

第3章 CLI 启动与性能优化 ​

引言：毫秒之战 ​

3.1 启动的三个阶段 ​

3.1.1 入口分层架构 ​

3.1.2 副作用阶段（Side-Effect Phase） ​

3.1.3 导入阶段（Import Phase） ​

3.1.4 CLI 分发阶段（CLI Dispatch Phase） ​

3.2 并行预加载模式 ​

3.2.1 核心洞察：模块求值是同步的 ​

3.2.2 MDM 配置读取 ​

第3章 CLI 启动与性能优化

引言：毫秒之战

3.1 启动的三个阶段

3.1.1 入口分层架构

3.1.2 副作用阶段（Side-Effect Phase）

3.1.3 导入阶段（Import Phase）

3.1.4 CLI 分发阶段（CLI Dispatch Phase）

3.2 并行预加载模式

3.2.1 核心洞察：模块求值是同步的

3.2.2 MDM 配置读取