Testing Reference¶
Overview¶
Famlab uses a multi-layered testing approach that combines traditional test scripts with documentation-embedded tests. The testing infrastructure ensures both code correctness and documentation accuracy.
Test Suite Components:
- TAP Test Helpers - Minimal TAP framework for shell unit tests
- Console Test Runner - Executes testable code blocks from markdown documentation
- Test Suite Runner - Orchestrates all test types and produces aggregated results
Running Tests:
mise run test
All tests output results in TAP (Test Anything Protocol) format.
About TAP
Test Anything Protocol (TAP) is a simple text-based description of test results. TAP is an established standard across many languages and testing frameworks. It produces human-readable output by default, while remaining machine-parseable. This makes it ideal for both development (human-in-the-loop) and CI/CD pipelines.
TAP on TAP
Each test script can be run individually, it will produce its own TAP output. The test suite runner collects all invocations, and aggregates them all as its own TAP output. This has the additional benefit that you can use, if you want, any TAP consumer on either an individual test script or on the suite as a whole.
Console-Test Format¶
Overview¶
Console-test blocks are markdown code blocks that execute commands and verify their output. They serve dual purposes: documenting CLI behavior and ensuring examples remain accurate.
Basic Syntax¶
Use the console-test
language identifier for code blocks:
```console-test
% echo "hello world"
hello world
```
Rules:
- Command lines: Start with
%
(percent-space prefix) - Expected output: All other lines
- Execution: Commands run sequentially in a shell
- Verification: Actual output must exactly match expected output
Multiline Commands¶
Continue commands across lines with trailing backslash:
```console-test
% echo \
"multi-line command"
multi-line command
```
Indentation: The continuation line's indentation (spaces or tabs) is preserved in the command.
Multiple Commands¶
Execute multiple commands in a single test block:
```console-test
% echo "first"
first
% echo "second"
second
```
Commands execute sequentially, joined with semicolons internally.
Multiple Test Blocks¶
Place multiple console-test blocks in a single document:
## Example 1
```console-test
% echo "test 1"
test 1
```
## Example 2
```console-test
% echo "test 2"
test 2
```
Each block is treated as a separate test case.
Empty Output¶
Commands that produce no output require an empty expected output section:
```console-test
% true
```
Test Execution¶
Command: tests/run-console-tests-in-docs.py [OPTIONS] FILE...
Options:
--help
- Show usage information--verbose
- Display executed commands in TAP output
Examples:
# Test single file
tests/run-console-tests-in-docs.py docs/reference/famlab-cli-architecture.md
# Test multiple files
tests/run-console-tests-in-docs.py docs/**/*.md
# Show commands being executed
tests/run-console-tests-in-docs.py --verbose docs/tutorials/getting-started.md
TAP Output:
TAP version 14
1..2
ok 1 - docs/example.md:10
ok 2 - docs/example.md:20
1..2
ok 1 - docs/example.md
Each test reports its file location and line number for easy debugging. See the TAP specification for output format details.
Usage Guidelines¶
When to Use Console-Tests:
- Tutorial step-by-step command examples
- Reference documentation showing CLI usage
- How-to guides with reproducible procedures
- Examples that should remain accurate as code evolves
When NOT to Use Console-Tests:
- Conceptual examples that shouldn't execute
- Commands requiring specific test environments
- Output containing variable data (timestamps, IDs, random values)
- Long-running or resource-intensive operations
- Commands with side effects outside test isolation
TAP Test Helpers¶
Overview¶
The TAP (Test Anything Protocol) helpers provide a minimal framework for writing shell unit tests in POSIX-compatible shell.
Location: tests/lib/tap-test-helpers.sh
Test Structure¶
Basic test file structure:
#!/bin/sh
# Load test framework
. "$(dirname "$0")/lib/tap-test-helpers.sh"
test_something_works() {
# Arrange
input="test value"
# Act
result=$(process_input "$input")
# Assert
expected="processed: test value"
assert_equals "$expected" "$result"
}
test_another_feature() {
some_command
assert_exit_code 0 $?
}
# Execute all tests
run_all_tests
Assertion Functions¶
assert_equals¶
Compare two strings for equality.
Signature: assert_equals expected actual [message]
Parameters:
expected
- Expected valueactual
- Actual valuemessage
- Optional custom failure message
Returns: 0 if equal, 1 if not equal
Example:
test_string_equality() {
result=$(echo "hello")
expected="hello"
assert_equals "$expected" "$result" "Echo output mismatch"
}
Failure Output:
# Echo output mismatch
# Expected:
# hello
# Actual:
# goodbye
assert_exit_code¶
Verify command exit codes.
Signature: assert_exit_code expected actual [message]
Parameters:
expected
- Expected exit code (integer)actual
- Actual exit code (usually$?
)message
- Optional custom failure message
Returns: 0 if codes match, 1 if not
Example:
test_command_succeeds() {
true
assert_exit_code 0 $?
}
test_command_fails() {
false
assert_exit_code 1 $?
}
assert_output_contains¶
Check if output contains a specific pattern.
Signature: assert_output_contains output pattern [message]
Parameters:
output
- Output string to searchpattern
- Pattern to find (literal string, not regex)message
- Optional custom failure message
Returns: 0 if pattern found, 1 if not
Example:
test_output_has_keyword() {
output=$(my_command)
assert_output_contains "$output" "Success" "Missing success message"
}
Failure Output:
# Missing success message
# Output:
# Command completed
# Pattern:
# Success
Test Lifecycle¶
setup()¶
Optional - Runs before each test function.
- Creates temporary directory at
$TEST_TMPDIR
- Exports
TEST_TMPDIR
for test use
Custom Setup:
setup() {
# Create temporary directory (default)
TEST_TMPDIR="$(mktemp -d)"
export TEST_TMPDIR
# Custom setup
cd "$TEST_TMPDIR" || return
echo "test data" > input.txt
}
teardown()¶
Optional - Runs after each test function.
- Removes
$TEST_TMPDIR
directory and contents
Custom Teardown:
teardown() {
# Custom cleanup
rm -f /tmp/my-test-file
# Default cleanup
if [ -n "$TEST_TMPDIR" ] && [ -d "$TEST_TMPDIR" ]; then
rm -rf "$TEST_TMPDIR"
fi
}
run_test¶
Execute a single test with full lifecycle.
Signature: run_test test_name
Executes test_name
function after initiating the lifecycle with setup
, and eventually cleaning it with teardown
.
Outputs a single TAP result line corresponding to the assertion results.
Example:
# Run specific test
run_test test_my_feature
run_all_tests¶
Auto-discover and execute all test functions starting with test_
.
Signature: run_all_tests
Runs each test function via run_test
.
Produces a complete TAP output (TAP version and plan).
Exits with code 0 if all tests pass, 1 if any fails.
Example:
# At end of test file
run_all_tests
Test Suite Architecture¶
Test Suite Runner¶
Location: tests/run-tests.sh
Purpose: Orchestrate all test types and produce aggregated TAP output.
Execution Flow¶
- Shell Tests: Discovers and runs all
tests/test-*.sh
files - Console Tests: Executes
run-console-tests-in-docs.py docs/**/*.md
- Aggregation: Combines results with proper TAP formatting
- Exit Code: Returns 0 if all pass, 1 if any fail
See the TAP specification for complete format details.
Running Specific Tests¶
The examples below are actually console-tests themselves!
Single shell test:
% tests/test-console-test-runner.sh
TAP version 14
1..19
ok 1 - test_help_flag
ok 2 - test_no_arguments_shows_error
ok 3 - test_nonexistent_file_shows_error
ok 4 - test_simple_console_test_passes
ok 5 - test_simple_console_test_fails
ok 6 - test_multiline_command
ok 7 - test_multiline_with_space_indent_command
ok 8 - test_multiline_with_tab_indent_command
ok 9 - test_multiple_commands_in_block
ok 10 - test_multiline_output
ok 11 - test_verbose_shows_command
ok 12 - test_verbose_shows_resolved_multiline_command
ok 13 - test_verbose_shows_resolved_multiline_and_space_indented_command
ok 14 - test_verbose_shows_resolved_multiline_and_tab_indented_command
ok 15 - test_multiple_blocks_in_file
ok 16 - test_ignores_other_code_blocks
ok 17 - test_multiple_files
ok 18 - test_failure_in_second_file
ok 19 - test_empty_output
Single documentation file:
% tests/run-console-tests-in-docs.py docs/index.md
TAP version 14
1..1
1..0
ok 1 - docs/index.md
Exit Codes¶
0
- All tests passed1
- One or more tests failed
Writing Tests¶
Shell Test Workflow¶
- Create test file:
tests/test-my-feature.sh
- Make executable:
chmod +x tests/test-my-feature.sh
- Load framework:
. "$(dirname "$0")/lib/tap-test-helpers.sh"
- Write test functions: Functions starting with
test_
- Call
run_all_tests
: At end of file - Run tests:
mise run test
orsh tests/test-my-feature.sh
Console-Test Workflow¶
- Identify examples: Find command examples in documentation
- Convert to console-test: Change
```shell
to```console-test
- Add command prefixes: Prefix commands with
%
- Add expected output: Place output after commands
- Test execution:
tests/run-console-tests-in-docs.py path/to/doc.md
Best Practices¶
Shell Tests:
- One assertion per test function
- Use descriptive test names (
test_feature_handles_empty_input
) - Test both success and failure cases
- Use
$TEST_TMPDIR
for file operations - Keep tests fast and isolated
- Avoid external dependencies
Console-Tests:
- Keep test blocks focused and self-contained
- Use descriptive surrounding text to explain what's being tested
- Ensure commands are idempotent when possible
- Avoid external dependencies (network, specific files, services)
- Use relative paths sparingly (tests run from repository root)