An embodied AI assistant inspired by Jarvis from Iron Man, running on the Reachy Mini robot with OpenAI Agents as its brain.
Latency is king. First audible response within 300-600ms (filler or real). Full answer streams after. People forgive dumb; they don't forgive slow.
Presence, not request/response. A continuous 30Hz "presence loop" runs independent of the LLM — breathing, micro-nods, gaze tracking. The robot feels alive even when silent.
Embodiment as policy, not library calls. The LLM outputs an "embodiment plan" (intent, prosody, motion primitives) with each response. A renderer maps those to physical behavior. No random "play happy" uncanny valley.
Barge-in. User can interrupt at any time. TTS stops immediately, new utterance is captured. This single behavior makes it feel 10x more real.
Guardrails. Destructive smart home actions use dry-run by default. Everything is audit-logged. Permissions model for sensitive operations.
Honest state broadcasting. It's obvious when Jarvis is listening vs idle vs muted, through posture and behavior, not just an LED.
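To make "embodiment as policy" concrete, here is a minimal sketch of what an embodiment plan and its renderer could look like. The field names, the primitive vocabulary, and the `render` function are illustrative assumptions, not the project's actual schema.

```python
from dataclasses import dataclass, field

# Hypothetical primitive vocabulary; the real set is defined by the project.
KNOWN_PRIMITIVES = {"nod", "tilt_left", "tilt_right", "lean_in", "look_away"}


@dataclass
class EmbodimentPlan:
    """What the LLM emits alongside each response (assumed shape)."""
    intent: str                       # e.g. "confirm", "apologize"
    prosody: str = "neutral"          # hint for the TTS voice
    motion_primitives: list[str] = field(default_factory=list)


def render(plan: EmbodimentPlan) -> list[str]:
    """Keep only primitives the renderer knows, preserving their order.

    Unknown names are dropped rather than executed, so the model cannot
    trigger arbitrary canned animations.
    """
    return [p for p in plan.motion_primitives if p in KNOWN_PRIMITIVES]


plan = EmbodimentPlan("confirm", motion_primitives=["nod", "play_happy"])
commands = render(plan)  # "play_happy" is not a known primitive and is dropped
```

The point of the indirection is that the model describes intent and the renderer decides what the hardware actually does.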
┌──────────────────────────────────────────────────────────────────┐
│                       PRESENCE LOOP (30Hz)                       │
│  Always running. Receives lightweight signals, outputs motion.   │
│                                                                  │
│  Signals in:             States:                                 │
│  vad_energy ──┐          IDLE      → breathing, drift            │
│  doa_angle  ──┤          LISTENING → orient, micro-nods, lean    │
│  face_pos   ──┼────────► THINKING  → look away, processing anim  │
│  llm_state  ──┤          SPEAKING  → stable gaze, intent motion  │
│  embody_cmd ──┘          MUTED     → privacy posture             │
└──────────────────────────────────────────────────────────────────┘
┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│  Audio Input    │     │   Agent Brain    │     │  Audio Output   │
│                 │     │                  │     │                 │
│ Mic → VAD ──────┼────►│ Agent SDK        │────►│ Stream TTS      │
│      ↓          │     │ + MCP tools:     │     │ (ElevenLabs)    │
│ Whisper STT ────┼────►│   embody/robot   │     │                 │
│                 │     │   smart_home/*   │     │ Barge-in:       │
│ Barge-in: ◄─────┼─────│   todoist/*      │◄────│ VAD interrupts  │
│ stop TTS        │     │   memory/*       │     │ playback        │
└─────────────────┘     └──────────────────┘     └─────────────────┘
┌──────────────────┐     ┌──────────────────┐
│  Face Tracker    │     │    Audit Log     │
│  YOLOv8 → face   │────►│ ~/.jarvis/audit  │
│  position signal │     │ .jsonl           │
│                  │     │ (rotating)       │
└──────────────────┘     └──────────────────┘
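The presence loop in the diagram can be sketched as a pure state-selection function plus a fixed-rate ticker. This is a simplified illustration, not the project's `presence.py`: the `Signals` fields, the `muted` flag, the `llm_state` values, the VAD threshold, and the priority order are all assumptions.

```python
import time
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    IDLE = "idle"
    LISTENING = "listening"
    THINKING = "thinking"
    SPEAKING = "speaking"
    MUTED = "muted"


@dataclass
class Signals:
    vad_energy: float = 0.0   # assumed 0..1 speech energy from the VAD
    llm_state: str = "idle"   # assumed values: "idle", "thinking", "speaking"
    muted: bool = False       # assumed flag; not one of the diagram's signals


def select_state(s: Signals, vad_threshold: float = 0.5) -> State:
    """Pick the presence state; earlier checks win (assumed priority)."""
    if s.muted:
        return State.MUTED
    if s.vad_energy >= vad_threshold:
        return State.LISTENING  # user speech outranks SPEAKING (barge-in)
    if s.llm_state == "thinking":
        return State.THINKING
    if s.llm_state == "speaking":
        return State.SPEAKING
    return State.IDLE


def run_presence_loop(read_signals, apply_motion, hz: float = 30.0, ticks: int = 90):
    """Call apply_motion(state) at a fixed rate, independent of the LLM."""
    period = 1.0 / hz
    next_tick = time.monotonic()
    for _ in range(ticks):
        apply_motion(select_state(read_signals()))
        next_tick += period
        # Sleep until the next deadline so the rate does not drift.
        time.sleep(max(0.0, next_tick - time.monotonic()))
```

Keeping `select_state` free of I/O means the state logic can be unit-tested without a robot attached.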
| Component | Technology | Purpose |
|---|---|---|
| Presence Loop | Custom 30Hz controller | Continuous micro-behaviors, state machine |
| Brain | OpenAI Agents SDK + custom tools | Reasoning, conversation, embodiment plans |
| Speech-to-Text | Whisper (local, faster-whisper) | Transcribe user speech |
| Text-to-Speech | ElevenLabs API (streaming) | Jarvis voice, sentence-level streaming |
| VAD | Silero VAD | End-of-utterance + barge-in detection |
| Face Tracking | YOLOv8 (ultralytics) | Face position → presence loop signal |
| Robot Control | Reachy Mini SDK | Head (6DOF), body, antennas, emotions |
| Smart Home | Home Assistant REST API | Lights, climate, media — with dry-run + audit |
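Barge-in, which the VAD row above feeds, can be sketched with `asyncio` as a race between a playback task and an interrupt event. The `speak` coroutine only simulates streaming audio with `asyncio.sleep`; the function names and the `per_sentence_sec` parameter are hypothetical, and the real pipeline (ElevenLabs streaming, Silero VAD) is not shown.

```python
import asyncio


async def speak(sentences, spoken, per_sentence_sec):
    """Stand-in for sentence-level TTS streaming."""
    for sentence in sentences:
        await asyncio.sleep(per_sentence_sec)  # pretend to play audio
        spoken.append(sentence)


async def run_turn(sentences, barge_in: asyncio.Event, per_sentence_sec=0.05):
    """Play sentences until finished or until barge_in is set.

    Returns (sentences fully spoken, whether playback was interrupted).
    """
    spoken = []
    tts = asyncio.create_task(speak(sentences, spoken, per_sentence_sec))
    interrupt = asyncio.create_task(barge_in.wait())
    done, _ = await asyncio.wait({tts, interrupt}, return_when=asyncio.FIRST_COMPLETED)
    interrupted = interrupt in done and tts not in done
    # Cancel whichever task is still running, then wait for both to settle.
    for task in (tts, interrupt):
        if not task.done():
            task.cancel()
    await asyncio.gather(tts, interrupt, return_exceptions=True)
    return spoken, interrupted
```

In a real loop the VAD callback would call `barge_in.set()` when it detects user speech, and the caller would start capturing the new utterance as soon as `run_turn` returns with `interrupted` true.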
cd jarvis
uv sync
cp .env.example .env
# Fill in: OPENAI_API_KEY, ELEVENLABS_API_KEY
# Optional: HASS_URL, HASS_TOKEN for smart home
# Optional: HOME_PERMISSION_PROFILE=readonly (state only) or control (default)
# Optional: HOME_REQUIRE_CONFIRM_EXECUTE=true (require confirm=true on all executes)
# Optional: HOME_CONVERSATION_ENABLED=true (enable HA conversation intent tool)
# Optional: HOME_CONVERSATION_PERMISSION_PROFILE=readonly|control (default readonly)
# Optional: SAFE_MODE_ENABLED=true (force mutating actions into restricted/dry-run behavior)
# Optional: TODOIST_API_TOKEN / TODOIST_PROJECT_ID / TODOIST_PERMISSION_PROFILE
# Optional: NOTION_API_TOKEN / NOTION_DATABASE_ID (integration_hub notion notes backend)
# Optional: TODOIST_TIMEOUT_SEC=10.0 / PUSHOVER_TIMEOUT_SEC=10.0
# Optional: PUSHOVER_API_TOKEN / PUSHOVER_USER_KEY / NOTIFICATION_PERMISSION_PROFILE
# Optional: NUDGE_POLICY=interrupt|defer|adaptive / NUDGE_QUIET_HOURS_START / NUDGE_QUIET_HOURS_END
# Optional: EMAIL_SMTP_HOST / EMAIL_FROM / EMAIL_DEFAULT_TO / EMAIL_PERMISSION_PROFILE / EMAIL_TIMEOUT_SEC
# Optional: WEATHER_UNITS=metric|imperial / WEATHER_TIMEOUT_SEC
# Optional: WEBHOOK_ALLOWLIST=example.com,api.example.com / WEBHOOK_AUTH_TOKEN / WEBHOOK_TIMEOUT_SEC
# Optional: SLACK_WEBHOOK_URL / DISCORD_WEBHOOK_URL
# Optional: PERSONA_STYLE=terse|composed|friendly|jarvis / BACKCHANNEL_STYLE=quiet|balanced|expressive
# Optional: IDENTITY_ENFORCEMENT_ENABLED / IDENTITY_DEFAULT_USER / IDENTITY_DEFAULT_PROFILE
# Optional: IDENTITY_USER_PROFILES / IDENTITY_TRUSTED_USERS
# Optional: IDENTITY_REQUIRE_APPROVAL / IDENTITY_APPROVAL_CODE
# Optional: PLAN_PREVIEW_REQUIRE_ACK=true (require preview_token before risky execute tools)
# Optional: MEMORY_RETENTION_DAYS / AUDIT_RETENTION_DAYS (0 disables pruning)
# Optional: MEMORY_PII_GUARDRAILS_ENABLED=true|false
# Optional: MEMORY_ENCRYPTION_ENABLED / AUDIT_ENCRYPTION_ENABLED / JARVIS_DATA_KEY
# Optional: WAKE_MODE / WAKE_CALIBRATION_PROFILE / WAKE_WORDS / WAKE_WORD_SENSITIVITY / VOICE_TIMEOUT_PROFILE
# Optional: STT_FALLBACK_ENABLED / WHISPER_MODEL_FALLBACK / TTS_FALLBACK_TEXT_ONLY
# Optional: OPENAI_ROUTER_MODEL / ROUTER_TIMEOUT_SEC / POLICY_ROUTER_MIN_CONFIDENCE
# Optional: INTERRUPTION_ROUTER_TIMEOUT_SEC / INTERRUPTION_RESUME_MIN_CONFIDENCE
# Optional: SEMANTIC_TURN_ENABLED / SEMANTIC_TURN_ROUTER_TIMEOUT_SEC / SEMANTIC_TURN_MIN_CONFIDENCE
# Optional: SEMANTIC_TURN_EXTENSION_SEC / SEMANTIC_TURN_MAX_TRANSCRIPT_CHARS
# Optional: MODEL_FAILOVER_ENABLED / MODEL_SECONDARY_MODE / WATCHDOG_* / TURN_TIMEOUT_ACT_SEC / STARTUP_STRICT
# Optional: OPERATOR_SERVER_ENABLED / OPERATOR_SERVER_HOST / OPERATOR_SERVER_PORT / OPERATOR_AUTH_MODE / OPERATOR_AUTH_TOKEN
# Optional: WEBHOOK_INBOUND_ENABLED / WEBHOOK_INBOUND_TOKEN
# Optional: RECOVERY_JOURNAL_PATH / DEAD_LETTER_QUEUE_PATH (interrupted-action + failed-outbound journals)
# Optional: EXPANSION_STATE_PATH / RELEASE_CHANNEL_CONFIG_PATH (roadmap + release-channel persistence/check config)
# Optional: NOTES_CAPTURE_DIR / QUALITY_REPORT_DIR (integration capture + report artifact locations)
# Optional: OBSERVABILITY_* (DB/state/event paths, burst threshold, snapshot interval)
# Optional: SKILLS_ENABLED / SKILLS_DIR / SKILLS_ALLOWLIST / SKILLS_REQUIRE_SIGNATURE / SKILLS_SIGNATURE_KEY
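As a rough illustration of how the `HOME_PERMISSION_PROFILE`, `HOME_REQUIRE_CONFIRM_EXECUTE`, and `SAFE_MODE_ENABLED` settings above could combine, the sketch below decides what to do with a mutating smart home request and builds the Home Assistant REST call it would make. The function names, the truthy-string parsing, and the decision labels are assumptions; the sketch is simplified and never sends a request.

```python
TRUTHY = {"1", "true", "yes", "on"}  # assumed parsing; the project may differ


def flag(env, name):
    """Read a boolean-style setting from an env mapping."""
    return env.get(name, "").strip().lower() in TRUTHY


def decide(env, *, dry_run=True, confirm=False):
    """Return 'deny', 'dry_run', 'needs_confirm', or 'execute'."""
    profile = env.get("HOME_PERMISSION_PROFILE", "control").strip().lower()
    if profile == "readonly":
        return "deny"               # readonly blocks mutating actions
    if dry_run or flag(env, "SAFE_MODE_ENABLED"):
        return "dry_run"            # safe mode forces dry-run
    if flag(env, "HOME_REQUIRE_CONFIRM_EXECUTE") and not confirm:
        return "needs_confirm"
    return "execute"


def plan_service_call(env, domain, service, data, **kwargs):
    """Describe the Home Assistant service call without sending it."""
    base = env["HASS_URL"].rstrip("/")
    return {
        "decision": decide(env, **kwargs),
        "method": "POST",
        "url": f"{base}/api/services/{domain}/{service}",
        "json": data,
    }


env = {"HASS_URL": "http://ha.local:8123/"}  # placeholder host
plan = plan_service_call(env, "light", "turn_on", {"entity_id": "light.kitchen"})
```

Passing `os.environ` as `env` would read the real settings; a plain dict keeps the example testable.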
Smart home safety defaults:
- lock, alarm_control_panel, cover, climate require confirm=true when dry_run=false.
- HOME_PERMISSION_PROFILE=readonly disables mutating smart_home actions but keeps smart_home_state.
- HOME_REQUIRE_CONFIRM_EXECUTE=true enforces confirm=true for all non-dry-run smart_home actions.
- SAFE_MODE_ENABLED=true keeps mutating actions in restricted mode (dry-run where supported, blocked otherwise).
- PLAN_PREVIEW_REQUIRE_ACK=true enforces a two-step preview+ack flow (preview_token) before mutating medium/high-risk actions.
  - preview_only=true to get a plan preview token.
  - preview_token=<token> before token expiry.
- NUDGE_POLICY controls due-reminder interrupts: interrupt, defer, or adaptive (quiet-window aware).
- docs/operations/home-control-policy.md.
- docs/operations/integration-policy.md.
- docs/operations/trust-policy.md.

- brief for urgent/short-answer requests,
- deep for explicit detailed walkthrough requests,
- normal otherwise.

- answer for direct questions,
- act for explicit action requests,
- clarify when an action request is ambiguous (it/that/this targets).

- cautious for volatile/time-sensitive prompts (latest, today, right now),
- calibrated for estimate/prediction prompts,
- direct for stable factual prompts.

- default, quiet_room, noisy_room, tv_room, far_field:
- "the bedroom" or "and in the office" inherit prior unresolved action targets.
- "turn on the kitchen lights" remains a new request.
- voice_attention.
- stt_diagnostics reports confidence score/band, model source, fallback usage, and transcript quality signals.
- "I may have misheard you as ..." and accepts either confirm or an immediate corrected phrase.
- commit vs wait, then applies a short extension window before finalizing the turn.
- replace, resume, or clarify with fail-closed fallback to replace.
- set_voice_profile / clear_voice_profile / list_voice_profiles manage per-user verbosity, confirmations, pace, and tone.

- social: allows one brief dry-wit line where appropriate,
- task: stays precise and execution-focused,
- safety: disables humor and favors explicit confirmation language.

- HOME_CONVERSATION_ENABLED=true
- HOME_CONVERSATION_PERMISSION_PROFILE=control
- confirm=true

- home_assistant_todo (list|add|remove) for native HA to-do entities
- home_assistant_timer (state|start|pause|cancel|finish) for HA timer entities
- home_assistant_area_entities for area-aware entity resolution
- media_control for simplified media_player actions (play, pause, volume_set, etc.)

- system_status (includes schema_version)
- system_status.scorecard (unified latency/reliability/initiative/trust scoring)
- system_status.observability.latency_dashboards (p50/p95/p99 total-turn latency with intent/tool-mix/wake-mode breakdowns)
- system_status.observability.policy_decision_analytics (allow/deny reason counts by tool/user/status)
- system_status.turn_timeouts (listen/think/speak/act timeout budgets)
- system_status.integrations.*.circuit_breaker (open/remaining/failure state per integration)
- system_status.recovery_journal (interrupted-action reconciliation summary)
- system_status.dead_letter_queue (failed outbound delivery queue with replay status)
- system_status.expansion (proactive, trust, orchestration, planner, quality, embodiment, integration roadmap feature snapshot)
- jarvis_scorecard (standalone scorecard payload for dashboards and alerts)
- system_status_contract (stable required-field contract)

- preferences, people, projects, household_rules (tagged as scope:<name>).
- memory_search and memory_recent apply explicit scope policy (scopes=...) and expose scope=..., confidence=..., source=..., and trail=id/source/created_at.
- memory_status includes confidence_model and scope_policy metadata for retrieval transparency.
- decision_outcome, decision_reason, and decision_explanation.
- /api/audit.

- OPERATOR_AUTH_MODE=off|token|session controls operator auth strategy:
  - off: no auth challenge (highest risk)
  - token: per-request bearer/header token
  - session: login endpoint creates short-lived browser session cookie
- token when OPERATOR_AUTH_TOKEN is set, otherwise off
- OPERATOR_AUTH_TOKEN when binding OPERATOR_SERVER_HOST to a non-loopback interface.
- token mode protects /api/*, /metrics, and /events via X-Operator-Token or Authorization: Bearer <token>.
- session mode protects the same endpoints via POST /api/session/login + jarvis_operator_session cookie.
- / remains reachable and supports token entry for browser-based API calls.
- GET /api/control-schema returns action/payload requirements for automation clients.
- GET /api/conversation-trace returns live turn flow/tool/policy/latency trace rows used by the dashboard panel.
- /api/status now includes episodic_timeline snapshots for recent important turns/actions.
- /api/status now includes operator_controls with active_control_preset, available presets, and the current runtime profile snapshot.
- /api/status now includes runtime_invariants (last check, total violations, auto-heals, recent entries).

- apply_control_preset (quiet_hours, demo_mode, maintenance_mode)
- export_runtime_profile / import_runtime_profile
- set_sleeping (sleeping=true|false).
- preview_personality, commit_personality_preview, rollback_personality_preview.
- /api/operator-actions now records tamper-evident chained signatures (previous_signature, signature, signature_alg).

- docs/operations/release-checklist.md.
- docs/operations/security-maintenance.md.
- docs/operations/error-taxonomy.md.
- docs/operations/observability-runbook.md.
- docs/operations/personality-research.md.
- docs/operations/proactive-preference-loop.md.
- make test-fault-profiles (runs quick, network, storage, contract)
- Fault Profiles workflow runs weekly with per-profile artifacts
- docs/operations/skills-development.md.
- docs/operations/provenance-verification.md.
- docs/operations/incident-response.md.

- ./scripts/release_acceptance.sh fast|full.
- ./scripts/check_release_channel.py --channel dev|beta|stable.
- ./scripts/generate_quality_report.py --output-dir .artifacts/quality --markdown --compare-with .artifacts/quality/weekly-quality-<previous>.json.
- ./scripts/run_eval_dataset.py docs/evals/assistant-contract.json --strict --min-pass-rate 1.0 --max-failed 0.
- ./scripts/run_router_policy_eval.py docs/evals/router-policy-contract.json --strict --min-pass-rate 1.0 --max-failed 0.
- ./scripts/run_interruption_route_eval.py docs/evals/interruption-route-contract.json --strict --min-pass-rate 1.0 --max-failed 0.
  - replace|resume|clarify routing, fallback behavior, and continuation metadata integrity.
- ./scripts/run_trace_trajectory_eval.py docs/evals/trajectory-trace-contract.json --strict --min-pass-rate 1.0 --max-failed 0.
- ./scripts/run_autonomy_cycle_eval.py docs/evals/autonomy-cycle-contract.json --strict --min-pass-rate 1.0 --max-failed 0.
- ./scripts/jarvis_readiness.sh fast|full (or make readiness).
- ./scripts/bootstrap.sh.
- docker compose up --build (simulation/no-vision default).
- deploy/home-assistant-addon.

- TODOIST_PERMISSION_PROFILE=readonly|control
  - readonly allows todoist_list_tasks and denies todoist_add_task
  - control allows both tools
- TODOIST_TIMEOUT_SEC controls request timeout (default 10.0)
- todoist_list_tasks supports format=short|verbose (default short)
- NOTIFICATION_PERMISSION_PROFILE=off|allow
  - off denies pushover_notify, slack_notify, and discord_notify
  - allow enables all channel notification tools
- PUSHOVER_TIMEOUT_SEC controls request timeout (default 10.0)
- email_send requires confirm=true and EMAIL_PERMISSION_PROFILE=control
- email_summary shows recent outbound email metadata
- EMAIL_SMTP_HOST, EMAIL_FROM, EMAIL_DEFAULT_TO
- slack_notify uses SLACK_WEBHOOK_URL
- discord_notify uses DISCORD_WEBHOOK_URL

- timer_create, timer_list, timer_cancel
- reminder_create, reminder_list, reminder_complete
- reminder_notify_due
- calendar_events, calendar_next_event
- weather_lookup (Open-Meteo backend; WEATHER_UNITS=metric|imperial)
- webhook_trigger enforces https + WEBHOOK_ALLOWLIST domain policy
- WEBHOOK_AUTH_TOKEN
- approval_code or a trusted requester with approved=true
- dead_letter_list to inspect queue state
- dead_letter_replay to retry specific or filtered entries

- proactive_assistant:
  - briefing, anomaly_scan, routine_suggestions, follow_through, event_digest
- memory_governance:
- identity_trust:
- home_orchestrator:
  - automation_create, automation_apply, automation_rollback, automation_status (supports dry-run diff previews)
- skills_governance:
- planner_engine:
  - autonomy_schedule, autonomy_checkpoint, autonomy_replan, autonomy_cycle, autonomy_status
  - autonomy_status
- quality_evaluator:
- embodiment_presence:
- integration_hub:
  - release_channel_get, release_channel_set, release_channel_check

- .env.example to .env, then set required keys: OPENAI_API_KEY and ELEVENLABS_API_KEY.
- HASS_URL and HASS_TOKEN
- PUSHOVER_API_TOKEN and PUSHOVER_USER_KEY
- HOME_PERMISSION_PROFILE=readonly (recommended first boot)
- TODOIST_PERMISSION_PROFILE=readonly
- NOTIFICATION_PERMISSION_PROFILE=off
- make check
- make test-faults
- uv run python -m jarvis --sim --no-vision
- dry_run=true smart-home request first before any live execute.

# Full Jarvis experience
uv run python -m jarvis
# Without face tracking (audio only)
uv run python -m jarvis --no-vision
# Text output instead of TTS (debugging)
uv run python -m jarvis --no-tts
# Simulation mode (no robot connected)
uv run python -m jarvis --sim
# Verbose logging
uv run python -m jarvis --debug
# Create a backup bundle (memory, audit logs, runtime state, operator settings)
uv run python -m jarvis --backup ~/.jarvis/backups/jarvis-$(date +%Y%m%d-%H%M%S).tar.gz
# Restore from a backup bundle (overwrite existing files)
uv run python -m jarvis --restore ~/.jarvis/backups/jarvis-20260227-120000.tar.gz --force
# Open operator console
open http://127.0.0.1:8765
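For scripted access to the operator console instead of a browser, a request could be built as in the sketch below. It assumes token auth mode; the `/api/status` path, the default address, and the `Authorization: Bearer` header come from this README, while the helper name is hypothetical and the response shape is not shown.

```python
import urllib.request


def build_status_request(token: str, base_url: str = "http://127.0.0.1:8765"):
    """Build (but do not send) an authenticated GET for the operator status API."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/status",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


request = build_status_request("example-token")  # placeholder token
# To send it against a running instance:
#   with urllib.request.urlopen(request, timeout=5) as resp:
#       body = resp.read()
```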
# Full lint + full test suite
make check
# Fast local regression pass
make test-fast
# Simulation-focused validation pass
make test-sim
# Fault-injection oriented subset (network, HTTP, summary, and storage taxonomy)
make test-faults
# Soak/stability subset
make test-soak
# Extended soak profile (simulation + fault profiles + checkpoint/retry validation)
make test-soak-extended
# Personality A/B drift checks (brevity + confirmation friction)
make test-personality
# Deployment/security gate (lint + tests + fault subset + workflow pin checks)
make security-gate
# Combined release-readiness gate (lint + acceptance + release checks + strict eval)
make readiness
# Marker-based subsets
uv run pytest -q -m fast
uv run pytest -q -m fault
uv run pytest -q -m slow
Equivalent scripts are available under scripts/:
- scripts/check.sh
- scripts/test_fast.sh
- scripts/test_sim.sh
- scripts/test_faults.sh
- scripts/test_soak.sh
- scripts/test_soak_extended.sh
- scripts/run_soak_profile.py
- scripts/test_personality.sh
- scripts/personality_ab_eval.py
- scripts/security_gate.sh
- scripts/jarvis_readiness.sh

CI runs the same lint + test gates on every push and pull request via ci.yml.
Workflow linting and YAML hygiene run via workflow-sanity.yml.
Nightly soak coverage is scheduled in nightly-soak.yml.
Readiness-gate automation is scheduled/on-demand in jarvis-readiness.yml.
| Workflow | Intent | Failure routing (first stop) |
|---|---|---|
| ci.yml / lint | Static checks (ruff) | src/, tests/, and Python style issues in the failing path |
| ci.yml / tests | Full regression (pytest) | Failing test module and corresponding implementation area |
| ci.yml / faults | Fault-injection taxonomy + error-path contract | tests/test_tools_services.py fault tests and src/jarvis/tools/services.py normalization paths |
| workflow-sanity.yml | Workflow hygiene (actionlint, tabs, script executability/shebang) | .github/workflows/* and scripts/*.sh |
| shellcheck.yml | Shell script linting | scripts/*.sh syntax/quoting/safety |
| security.yml | Scheduled/PR CodeQL scan | Security findings in SARIF report; route by file ownership |
| nightly-soak.yml | Long-run stability signal | tests/test_main_audio.py -k soak, audio/runtime regressions |
jarvis/
├── pyproject.toml
├── .env.example
├── ~/.jarvis/audit.jsonl # Auto-created audit log (runtime path)
├── src/
│   └── jarvis/
│       ├── __main__.py  # Entry point + conversation loop
│       ├── config.py  # Settings & env vars
│       ├── brain.py  # OpenAI Agents SDK orchestrator
│       ├── observability.py  # Telemetry store + metrics export
│       ├── operator_server.py  # Local operator dashboard/API
│       ├── skills.py  # Local skill discovery + lifecycle
│       ├── presence.py  # 30Hz presence loop (the soul)
│       ├── tools/
│       │   ├── robot.py  # embody, play_emotion, play_dance
│       │   ├── services.py  # shared runtime/helpers + MCP tool registry
│       │   └── services_domains/
│       │       ├── home.py  # home_orchestrator domain handler
│       │       ├── planner.py  # planner_engine domain handler
│       │       ├── integrations.py  # integration_hub domain handler
│       │       ├── comms.py  # channel/email/todoist/pushover handlers
│       │       ├── governance.py  # skills_governance + quality_evaluator + embodiment_presence
│       │       └── trust.py  # proactive_assistant + memory_governance + identity_trust
│       ├── audio/
│       │   ├── vad.py  # Silero voice activity detection
│       │   ├── stt.py  # faster-whisper transcription
│       │   └── tts.py  # ElevenLabs synthesis
│       ├── vision/
│       │   └── face_tracker.py  # YOLOv8 detection → presence signals
│       └── robot/
│           └── controller.py  # Reachy Mini SDK wrapper
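The audit trail listed in the tree above is described elsewhere in this README as carrying tamper-evident chained signatures (previous_signature, signature, signature_alg). One common way to build such a chain is an HMAC over the previous signature plus the entry payload. The sketch below illustrates that general technique only; the choice of HMAC-SHA256, the canonical JSON encoding, and both function names are assumptions rather than the project's actual scheme.

```python
import hashlib
import hmac
import json


def sign_entry(key: bytes, previous_signature: str, payload: dict) -> dict:
    """Return an entry whose signature covers the payload and its predecessor."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    message = (previous_signature + "\n" + body).encode("utf-8")
    signature = hmac.new(key, message, hashlib.sha256).hexdigest()
    return {
        "payload": payload,
        "previous_signature": previous_signature,
        "signature": signature,
        "signature_alg": "hmac-sha256",  # assumed label
    }


def verify_chain(key: bytes, entries: list[dict]) -> bool:
    """Recompute every signature in order; any edit or reorder breaks the chain."""
    previous = ""
    for entry in entries:
        if entry["previous_signature"] != previous:
            return False
        expected = sign_entry(key, previous, entry["payload"])["signature"]
        if not hmac.compare_digest(expected, entry["signature"]):
            return False
        previous = entry["signature"]
    return True


key = b"demo-key"  # placeholder; a real key would come from configuration
first = sign_entry(key, "", {"action": "set_sleeping", "sleeping": True})
second = sign_entry(key, first["signature"], {"action": "apply_control_preset"})
chain_ok = verify_chain(key, [first, second])
```

Because each signature depends on the one before it, changing or removing an earlier entry invalidates every later one.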