
CLI Reference

Complete reference for the ProTest command-line interface.

Synopsis

protest <command> [options] <target>

Commands

Command      Description
run          Run tests
eval         Run evaluations
live         Start live reporter server
tags list    List tags in a session

protest run

Run tests from a session.

Syntax

protest run <target> [options]

Target Format

<module>:<session>[::SuiteName[::NestedSuite]]

Part          Required   Description
module        Yes        Python module path
session       Yes        Name of the ProTestSession variable
::SuiteName   No         Filter to specific suite

Examples:

protest run tests:session              # Run all tests
protest run myapp.tests:session        # Module in package
protest run tests:session::API         # Only API suite
protest run tests:session::API::Users  # Nested suite
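
The target grammar above can be illustrated with a small parser. This is only a sketch of how such a string might be split. It is not ProTest's own parsing code, and the names `Target` and `parse_target` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Target:
    """Hypothetical container for the parts of a target string."""
    module: str
    session: str
    suite_path: list[str] = field(default_factory=list)


def parse_target(target: str) -> Target:
    """Split '<module>:<session>[::SuiteName[::NestedSuite]]' into its parts."""
    # Split on '::' first so the suite separator is never confused with
    # the single ':' between module and session.
    head, *suites = target.split("::")
    module, sep, session = head.partition(":")
    if not sep or not module or not session:
        raise ValueError(f"expected <module>:<session>, got {target!r}")
    if any(not name for name in suites):
        raise ValueError(f"empty suite name in {target!r}")
    return Target(module, session, suites)


print(parse_target("tests:session::API::Users"))
```

Splitting on `::` before `:` keeps the two separators from interfering with each other, which is the main subtlety in this format.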

Options

Filtering Options

Option           Short   Description
::SuiteName      -       Run only tests in specified suite (part of target)
--keyword        -k      Run tests matching keyword (substring match)
--tag            -t      Run tests with specified tag
--no-tag         -       Exclude tests with specified tag
--last-failed    --lf    Run only tests that failed last time
--cache-clear    -       Clear the test cache before running
--collect-only   -       List tests without running them

Execution Options

Option          Short   Description                       Default
--concurrency   -n      Number of parallel workers        1
--exitfirst     -x      Stop on first failure             false
--no-capture    -s      Show stdout/stderr during tests   false
--app-dir       -       Directory containing the module   .

Output Options

Option               Description
--no-color           Disable colors (plain ASCII output)
--no-log-file        Disable writing to .protest/last_run.log
--ctrf-output PATH   Output CTRF JSON report to PATH

Filtering in Detail

Suite Filter (::SuiteName)

The suite filter is part of the target, not a separate option. It filters tests to only those belonging to the specified suite and its children.

# Given this structure:
# session
# ├── API (suite)
# │   ├── Users (nested suite)
# │   │   ├── test_list_users
# │   │   └── test_create_user
# │   └── test_api_health
# └── test_standalone

# Run all API tests (including Users)
protest run tests:session::API
# Runs: test_api_health, test_list_users, test_create_user

# Run only Users tests
protest run tests:session::API::Users
# Runs: test_list_users, test_create_user

# Standalone tests are excluded when using suite filter

Standalone Tests

Tests registered directly on the session (not in any suite) are excluded when using a suite filter.

Keyword Filter (-k)

Match tests by substring in their name. Multiple -k flags use OR logic.

# Match test names containing "login"
protest run tests:session -k login
# Matches: test_login, test_login_failed, test_user_login

# Multiple keywords (OR logic)
protest run tests:session -k login -k logout
# Matches: test_login, test_logout, test_login_failed

# Works with parameterized tests (matches case IDs too)
protest run tests:session -k admin
# Matches: test_user[admin], test_permissions[admin-read]

Case Sensitivity

Keyword matching is case-sensitive. Use exact casing from your test names.
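
The matching rule described here (case-sensitive substring, OR across multiple -k flags) can be modelled in a few lines. This is an illustrative sketch, not ProTest's implementation. The helper name `matches_keywords` and the sample test names are made up for the example.

```python
def matches_keywords(test_name: str, keywords: list[str]) -> bool:
    """Return True if any keyword is a case-sensitive substring of the name."""
    if not keywords:
        # No -k flag given: the keyword filter lets everything through.
        return True
    return any(keyword in test_name for keyword in keywords)


names = [
    "test_login",
    "test_logout",
    "test_login_failed",
    "test_user[admin]",
    "test_Login_sso",  # capital "L": not matched by "login"
]
selected = [name for name in names if matches_keywords(name, ["login", "logout"])]
print(selected)
```

Because the comparison is a plain substring test, `-k login` does not select `test_Login_sso` in this model.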

Tag Filter (-t, --no-tag)

Filter by tags declared on tests, suites, or fixtures.

# Include tests with tag
protest run tests:session -t unit

# Multiple tags (OR logic)
protest run tests:session -t unit -t integration

# Exclude tests with tag
protest run tests:session --no-tag slow

# Combine include and exclude
protest run tests:session -t api --no-tag flaky

Tags are inherited:

  • Tests inherit tags from their parent suite
  • Tests inherit tags from fixtures they depend on (transitively)
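
The inheritance rules can be pictured as a set union over the test, its parent suite, and every fixture reachable from the test. The sketch below is a rough model under stated assumptions: the dictionary layout, the function names, and the rule that a test is dropped if it carries any excluded tag are illustrative choices, not documented ProTest internals.

```python
def collect_fixture_tags(fixture, fixtures):
    """Gather tags from a fixture and, transitively, from its dependencies."""
    tags, stack, seen = set(), [fixture], set()
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        tags.update(fixtures[name]["tags"])
        stack.extend(fixtures[name]["deps"])
    return tags


def effective_tags(own_tags, suite_tags, used_fixtures, fixtures):
    """Union of a test's own tags, its suite's tags, and its fixtures' tags."""
    tags = set(own_tags) | set(suite_tags)
    for fixture in used_fixtures:
        tags |= collect_fixture_tags(fixture, fixtures)
    return tags


def passes_tag_filter(tags, include=(), exclude=()):
    """OR across included tags; any excluded tag rejects (assumed)."""
    if include and not tags & set(include):
        return False
    return not tags & set(exclude)


# Hypothetical fixture graph: "db" depends on "network".
fixtures = {
    "db": {"tags": ["database"], "deps": ["network"]},
    "network": {"tags": ["slow"], "deps": []},
}
tags = effective_tags(["api"], ["integration"], ["db"], fixtures)
print(sorted(tags))
print(passes_tag_filter(tags, include=["api"], exclude=["flaky"]))
print(passes_tag_filter(tags, include=["api"], exclude=["slow"]))
```

In this model a test that only declares `api` is still excluded by `--no-tag slow`, because it picks up `slow` through a fixture it depends on indirectly.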

Last Failed (--lf)

Re-run only tests that failed in the previous run.

# First run - some tests fail
protest run tests:session
# Output: 8/10 passed, 2 failed

# Second run - only failed tests
protest run tests:session --lf
# Runs only the 2 failed tests

Behavior with Other Filters

When combined with other filters, --lf returns the intersection:

  • --lf -t slow → failed tests that have tag "slow"
  • If no failed tests match the filter, 0 tests run (no fallback)

# Clear cache to run all tests again
protest run tests:session --cache-clear

Combining Filters

All filters compose as intersection (AND logic between filter types).

# Suite AND keyword
protest run tests:session::API -k users
# Tests in API suite with "users" in name

# Suite AND keyword AND tag
protest run tests:session::API -k users -t slow
# Tests in API suite, with "users" in name, tagged "slow"

# Suite AND keyword AND tag AND last-failed
protest run tests:session::API -k users -t slow --lf
# Failed tests in API suite, with "users" in name, tagged "slow"

Filter evaluation order:

Collected tests
    → Suite filter (::SuiteName)
    → Keyword filter (-k)
    → Tag filter (-t, --no-tag)
    → Cache filter (--lf)
    → Final test list
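
The evaluation order above can be read as a chain of narrowing steps, each applied to the output of the previous one. The following sketch models that chain on the example structure from the suite filter section. The `Case` class, the `select` function, and the sample tags and failure set are hypothetical, and the real ProTest collector may be organized differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    """Hypothetical record for one collected test."""
    name: str
    suite_path: tuple = ()
    tags: frozenset = frozenset()


def select(tests, suite=(), keywords=(), include=(), exclude=(), last_failed=None):
    """Apply suite, keyword, tag and cache filters in that order."""
    selected = list(tests)
    if suite:
        # A suite filter keeps the suite and its children; tests registered
        # directly on the session have an empty path and never match.
        prefix = tuple(suite)
        selected = [t for t in selected if t.suite_path[: len(prefix)] == prefix]
    if keywords:
        selected = [t for t in selected if any(k in t.name for k in keywords)]
    if include:
        selected = [t for t in selected if t.tags & set(include)]
    if exclude:
        selected = [t for t in selected if not t.tags & set(exclude)]
    if last_failed is not None:
        # --lf intersects with everything above; there is no fallback.
        selected = [t for t in selected if t.name in last_failed]
    return selected


tests = [
    Case("test_list_users", ("API", "Users"), frozenset({"slow"})),
    Case("test_create_user", ("API", "Users")),
    Case("test_api_health", ("API",)),
    Case("test_standalone"),
]
failed = {"test_create_user", "test_standalone"}  # made-up previous failures
picked = select(tests, suite=("API",), keywords=["user"], last_failed=failed)
print([t.name for t in picked])
```

Since every step only removes tests, the order does not change the final set in this model; it only affects how much work each later step has to do.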

Execution Examples

Development Workflow

# Run all tests
protest run tests:session

# Quick check - stop on first failure
protest run tests:session -x

# Re-run failures
protest run tests:session --lf

# Re-run failures, stop on first
protest run tests:session --lf -x

CI/CD Workflow

# Full test suite, parallel
protest run tests:session -n 4

# Unit tests only
protest run tests:session -t unit -n 4

# Integration tests (might need sequential)
protest run tests:session -t integration

# Generate CTRF report for CI tools
protest run tests:session -n 4 --ctrf-output ctrf-report.json

Debugging

# See output from tests
protest run tests:session -s

# Run specific test
protest run tests:session -k test_specific_function

# List what would run
protest run tests:session::API -k login --collect-only

Working on a Feature

# Focus on one suite during development
protest run tests:session::API::Users -x

# Run related tests
protest run tests:session -k user -x

# Check everything still works
protest run tests:session

protest eval

Run evaluations from a session.

protest eval is the eval-suite counterpart of protest run. It shares the same target format, filters, capture flags and reporting options as run; the differences are listed below.

Syntax

protest eval <target> [options]

Options

protest eval accepts every option from protest run (see above: -n/--concurrency, --collect-only, -x/--exitfirst, -s/--no-capture, -q/--quiet, -v/--verbose, --show-logs, -t/--tag, --no-tag, -k/--keyword, --lf, --cache-clear, --no-color, --ctrf-output, --no-log-file, --app-dir), plus one eval-only flag:

Option          Description                                                                          Default
--show-output   Print inputs / output / expected for every case (failed cases always print these).   off

Examples

# Run all evals in a session
protest eval evals.session:session

# One specific suite
protest eval evals.session:session::helpdesk_struct

# One ticket by name
protest eval evals.session:session -k T001

# All cases tagged "cat:hardware"
protest eval evals.session:session --tag cat:hardware

# Re-run only the cases that failed last time
protest eval evals.session:session --lf

# Show the input/output of every case (not just failures)
protest eval evals.session:session --show-output

Output

Each case prints one line:

✓   classify_ticket_struct[T011] (2ms) category_check.allowed=✓ summary_check.recall=1.00 …

After every suite, an aggregate-stats table summarizes the Metric fields across cases (mean / p50 / p5 / p95). Verdict and Reason fields don't appear in this table - only numeric Metric fields do.
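
To make the aggregate columns concrete, here is one way such a summary could be computed for a single numeric Metric field. The interpolation method ProTest uses for percentiles is not stated in this reference, so the "inclusive" method below is an assumption, and `summarize` is a hypothetical helper name.

```python
from statistics import mean, quantiles


def summarize(values):
    """Return mean, p50, p5 and p95 for a list of numeric scores."""
    if not values:
        raise ValueError("no values to summarize")
    if len(values) == 1:
        only = float(values[0])
        return {"mean": only, "p50": only, "p5": only, "p95": only}
    # 99 cut points; index i holds the (i + 1)th percentile.
    # Assumption: linear interpolation over the observed range.
    cuts = quantiles(values, n=100, method="inclusive")
    return {
        "mean": mean(values),
        "p50": cuts[49],
        "p5": cuts[4],
        "p95": cuts[94],
    }


# Made-up recall scores from five cases.
print(summarize([0.0, 0.25, 0.5, 0.75, 1.0]))
```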

Per-case markdown artifacts are written to .protest/results/<suite>_<timestamp>/<case-id>.md, with the full input, output, expected, and per-evaluator scores.


Run history (recorded)

Every protest run or protest eval invocation appends one entry to .protest/history.jsonl (schema-versioned JSONL). History is recorded from the first run so that data accumulates over time; dedicated commands to browse and compare runs are planned for a future release.
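
Until dedicated commands exist, the history file can be inspected with any JSONL reader. The sketch below only relies on the file being one JSON object per line. The entry schema is not documented here, so the code does not assume any field names, and `load_history` is a hypothetical helper.

```python
import json
from pathlib import Path


def load_history(path=".protest/history.jsonl"):
    """Read every entry from a JSONL history file; return [] if it is missing."""
    history_path = Path(path)
    if not history_path.exists():
        return []
    entries = []
    with history_path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines
                entries.append(json.loads(line))
    return entries


entries = load_history()
print(f"{len(entries)} recorded run(s)")
```

Each element of `entries` is a plain dictionary, so its keys can be explored interactively before writing anything that depends on a particular schema version.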

Per-case eval detail (input, output, expected, evaluator scores) is written to .protest/results/<suite>_<timestamp>/<case-id>.md.


protest live

Start a persistent live reporter server for real-time test visualization.

Syntax

protest live [options]

Options

Option   Short   Description         Default
--port   -p      Port to listen on   8765

Example

# Start the live server
protest live

# Start on a custom port
protest live -p 9000

The live server stays running and displays test results in real-time as you run tests in another terminal.


protest tags list

List tags declared in a session.

Syntax

protest tags list <target> [options]

Options

Option        Short   Description
--recursive   -r      Show effective tags per test
--app-dir     -       Directory containing the module

Examples

# List all declared tags
protest tags list tests:session
# Output:
# api
# database
# integration
# slow
# unit

# Show tags per test (includes inherited)
protest tags list tests:session -r
# Output:
# Effective tags for 3 test(s):
#
#   API::test_api_call
#     tags: api, integration
#
#   API::test_db_query
#     tags: database, slow
#
#   test_simple
#     tags: unit

Exit Codes

Code   Meaning
0      All tests passed (or no tests collected)
1      One or more tests failed or errored

Environment

Cache Location

Test results are cached in .protest/cache.json relative to the current directory.

# View cache location
ls .protest/

# Clear cache
protest run tests:session --cache-clear
# Or manually: rm -rf .protest/

Module Resolution

By default, ProTest looks for modules in the current directory. Use --app-dir to specify a different location:

# Module in src/ directory
protest run myapp.tests:session --app-dir src

# Module in project root (default)
protest run tests:session
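
For intuition, resolving a target against an application directory can be approximated with the standard import machinery: put the directory on sys.path, import the module, and look up the session attribute. This is a sketch of that common pattern, not a description of ProTest's internals. `load_session` and the `demoapp` package are hypothetical, and the session is a plain string standing in for a real ProTestSession.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def load_session(module, session, app_dir="."):
    """Import `module` from `app_dir` and return its `session` attribute."""
    root = str(Path(app_dir).resolve())
    if root not in sys.path:
        sys.path.insert(0, root)
    # Make sure freshly created files are visible to the import system.
    importlib.invalidate_caches()
    return getattr(importlib.import_module(module), session)


# Build a throwaway src/demoapp/tests.py to mimic "--app-dir src".
with tempfile.TemporaryDirectory() as tmp:
    package = Path(tmp) / "src" / "demoapp"
    package.mkdir(parents=True)
    (package / "__init__.py").write_text("")
    (package / "tests.py").write_text("session = 'stand-in for a ProTestSession'\n")
    print(load_session("demoapp.tests", "session", app_dir=Path(tmp) / "src"))
```

Under this model, a dotted module path such as myapp.tests is resolved relative to the directory given by --app-dir rather than to the directory that contains the file itself.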