Commit c79951d

docs: document system health checks on run submission (#342)
1 parent e3d8851 commit c79951d

6 files changed

Lines changed: 268 additions & 5 deletions

README.md

Lines changed: 65 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -327,6 +327,21 @@ QuPath integration provides the most powerful way to visualize and interact with
327327

328328
**Congratulations!** You have successfully downloaded a public dataset, submitted an Atlas H&E-TME analysis run, and learned how to access and inspect your results.
329329

330+
### System Health Checks
331+
332+
The Launchpad automatically monitors system health before allowing run submissions. If the system is unhealthy (e.g., network connectivity issues, authentication problems, or platform unavailability), the submission workflow is blocked:
333+
334+
- A tooltip displays "System is unhealthy, you cannot prepare a run at this time."
335+
- The "Next" button in the application workflow is disabled.
336+
- The health status is shown in the footer bar at the bottom of the Launchpad.
337+
338+
To resolve health issues:
339+
340+
1. Check the health status indicator in the footer bar
341+
2. Click "Info and Settings" in the menu to see detailed health information
342+
3. Verify your network connection and authentication status
343+
4. Check the [Aignostics Platform Status](https://status.aignostics.com) page
344+
330345
### Advanced Setup: Extensions
331346

332347
> 💡 The Launchpad features a growing ecosystem of extensions that seamlessly integrate with standard digital pathology tools. To use the Launchpad with all available extensions, run `uvx --from "aignostics[qupath,marimo]" aignostics launchpad`. Currently available extensions are:
@@ -400,6 +415,28 @@ Check out our
400415
[CLI reference documentation](https://aignostics.readthedocs.io/en/latest/cli_reference.html)
401416
to learn about all commands and options available.
402417

418+
### System Health Checks
419+
420+
The CLI automatically checks system health before uploading slides or submitting runs. If the system is unhealthy, the operation is blocked and an error message is displayed:
421+
422+
```
423+
Error: Platform is not healthy: <reason>. Aborting.
424+
```
425+
426+
To override this behavior (not recommended for production use), add the `--force` flag:
427+
428+
```shell
429+
uvx aignostics application run upload he-tme metadata.csv --force
430+
uvx aignostics application run submit he-tme metadata.csv --force
431+
uvx aignostics application run execute he-tme metadata.csv data/ --force
432+
```
433+
434+
To manually check system health before running commands:
435+
436+
```shell
437+
uvx aignostics system health
438+
```
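
For scripted pipelines, the manual check can gate later commands. The following is only a sketch, not part of the SDK: it assumes that `uvx aignostics system health` exits with a non-zero status when the system is unhealthy (this is not confirmed here, so verify it against the CLI reference), and `run_if_healthy` is a hypothetical helper name.

```python
import subprocess
import sys
from collections.abc import Sequence

# Health command taken from the README; its exit-code behaviour is an assumption.
HEALTH_COMMAND = ("uvx", "aignostics", "system", "health")


def run_if_healthy(
    command: Sequence[str],
    health_command: Sequence[str] = HEALTH_COMMAND,
) -> int:
    """Run `command` only if `health_command` exits with status 0.

    Returns the exit code of whichever command ran last.
    """
    health = subprocess.run(list(health_command), check=False)
    if health.returncode != 0:
        print("Health check failed; skipping command.", file=sys.stderr)
        return health.returncode
    return subprocess.run(list(command), check=False).returncode
```

A call such as `run_if_healthy(["uvx", "aignostics", "application", "run", "submit", "he-tme", "metadata.csv"])` would then submit only after a passing check, without resorting to `--force`.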
439+
403440
## Python Library: Call the Aignostics Platform API from your Python scripts
404441

405442
The Python SDK includes the *Aignostics Python Library* for integration with your Python codebase.
@@ -465,6 +502,30 @@ and read the
465502
[client reference documentation](https://aignostics.readthedocs.io/en/latest/lib_reference.html)
466503
to learn about all classes and methods.
467504

505+
### System Health Checks
506+
507+
The low-level Python SDK does **not** perform automated health checks before operations. If health verification is required for your use case, you should implement checks in your application logic:
508+
509+
```python
510+
from aignostics import platform
511+
from aignostics.system import Service as SystemService
512+
513+
# Check system health before submitting runs
514+
health = SystemService().health()
515+
if not health:
516+
    raise RuntimeError(f"System is unhealthy: {health.reason}")
517+
518+
# Proceed with run submission
519+
client = platform.Client()
520+
run = client.runs.submit(...)
521+
```
522+
523+
This design gives you full control over health check behavior, allowing you to:
524+
525+
- Implement custom retry logic for transient failures
526+
- Log health status for monitoring and debugging
527+
- Gracefully handle unhealthy states in your application
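
As an illustration of the retry and logging points above, here is a minimal sketch. The helper name `wait_until_healthy` and its parameters are hypothetical, not SDK API; the only thing it relies on is that the health result is truthy when healthy and may carry a `reason` attribute, as in the snippet above.

```python
import time
from collections.abc import Callable


def wait_until_healthy(
    check: Callable[[], object],
    attempts: int = 3,
    delay_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `check` up to `attempts` times; return True as soon as it is truthy."""
    for attempt in range(1, attempts + 1):
        result = check()
        if result:
            return True
        # Log the reason if the result exposes one, else the raw result.
        reason = getattr(result, "reason", result)
        print(f"Health check {attempt}/{attempts} failed: {reason}")
        if attempt < attempts:
            sleep(delay_seconds)  # fixed delay between attempts
    return False
```

With the SDK installed, this could be called as `wait_until_healthy(lambda: SystemService().health())` before `client.runs.submit(...)`. A fixed delay keeps the sketch short; exponential backoff may suit longer outages better.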
528+
468529
### Example Notebooks: Interact with the Aignostics Platform from your Python Notebook environment
469530

470531
> [!IMPORTANT]
@@ -856,9 +917,12 @@ Architectural style for web services that the Aignostics Platform API follows, e
856917
**Self-signed URLs**
857918
Secure URLs with embedded authentication that allow the platform to access user data without exposing credentials.
858919

859-
**SVS**
920+
**SVS**
860921
Aperio ScanScope Virtual Slide format, commonly used for whole slide images and supported by the platform.
861922

923+
**System Health Check**
924+
Automated verification that the SDK and Aignostics Platform are operational before critical operations. The Launchpad blocks run submission when unhealthy (no override available for regular users). The CLI blocks uploads and submissions by default but allows override with `--force`. The Python Library does not perform automatic health checks, giving developers full control over health verification logic.
925+
862926
### T
863927

864928
**Test Application**

docs/partials/README_glossary.md

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -164,9 +164,12 @@ Architectural style for web services that the Aignostics Platform API follows, e
164164
**Self-signed URLs**
165165
Secure URLs with embedded authentication that allow the platform to access user data without exposing credentials.
166166

167-
**SVS**
167+
**SVS**
168168
Aperio ScanScope Virtual Slide format, commonly used for whole slide images and supported by the platform.
169169

170+
**System Health Check**
171+
Automated verification that the SDK and Aignostics Platform are operational before critical operations. The Launchpad blocks run submission when unhealthy (no override available for regular users). The CLI blocks uploads and submissions by default but allows override with `--force`. The Python Library does not perform automatic health checks, giving developers full control over health verification logic.
172+
170173
### T
171174

172175
**Test Application**

docs/partials/README_main.md

Lines changed: 61 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -305,6 +305,21 @@ QuPath integration provides the most powerful way to visualize and interact with
305305

306306
**Congratulations!** You have successfully downloaded a public dataset, submitted an Atlas H&E-TME analysis run, and learned how to access and inspect your results.
307307

308+
### System Health Checks
309+
310+
The Launchpad automatically monitors system health before allowing run submissions. If the system is unhealthy (e.g., network connectivity issues, authentication problems, or platform unavailability), the submission workflow is blocked:
311+
312+
- A tooltip displays "System is unhealthy, you cannot prepare a run at this time."
313+
- The "Next" button in the application workflow is disabled.
314+
- The health status is shown in the footer bar at the bottom of the Launchpad.
315+
316+
To resolve health issues:
317+
318+
1. Check the health status indicator in the footer bar
319+
2. Click "Info and Settings" in the menu to see detailed health information
320+
3. Verify your network connection and authentication status
321+
4. Check the [Aignostics Platform Status](https://status.aignostics.com) page
322+
308323
### Advanced Setup: Extensions
309324

310325
> 💡 The Launchpad features a growing ecosystem of extensions that seamlessly integrate with standard digital pathology tools. To use the Launchpad with all available extensions, run `uvx --from "aignostics[qupath,marimo]" aignostics launchpad`. Currently available extensions are:
@@ -378,6 +393,28 @@ Check out our
378393
[CLI reference documentation](https://aignostics.readthedocs.io/en/latest/cli_reference.html)
379394
to learn about all commands and options available.
380395

396+
### System Health Checks
397+
398+
The CLI automatically checks system health before uploading slides or submitting runs. If the system is unhealthy, the operation is blocked and an error message is displayed:
399+
400+
```
401+
Error: Platform is not healthy: <reason>. Aborting.
402+
```
403+
404+
To override this behavior (not recommended for production use), add the `--force` flag:
405+
406+
```shell
407+
uvx aignostics application run upload he-tme metadata.csv --force
408+
uvx aignostics application run submit he-tme metadata.csv --force
409+
uvx aignostics application run execute he-tme metadata.csv data/ --force
410+
```
411+
412+
To manually check system health before running commands:
413+
414+
```shell
415+
uvx aignostics system health
416+
```
417+
381418
## Python Library: Call the Aignostics Platform API from your Python scripts
382419

383420
The Python SDK includes the *Aignostics Python Library* for integration with your Python codebase.
@@ -443,6 +480,30 @@ and read the
443480
[client reference documentation](https://aignostics.readthedocs.io/en/latest/lib_reference.html)
444481
to learn about all classes and methods.
445482

483+
### System Health Checks
484+
485+
The low-level Python SDK does **not** perform automated health checks before operations. If health verification is required for your use case, you should implement checks in your application logic:
486+
487+
```python
488+
from aignostics import platform
489+
from aignostics.system import Service as SystemService
490+
491+
# Check system health before submitting runs
492+
health = SystemService().health()
493+
if not health:
494+
    raise RuntimeError(f"System is unhealthy: {health.reason}")
495+
496+
# Proceed with run submission
497+
client = platform.Client()
498+
run = client.runs.submit(...)
499+
```
500+
501+
This design gives you full control over health check behavior, allowing you to:
502+
503+
- Implement custom retry logic for transient failures
504+
- Log health status for monitoring and debugging
505+
- Gracefully handle unhealthy states in your application
506+
446507
### Example Notebooks: Interact with the Aignostics Platform from your Python Notebook environment
447508

448509
> [!IMPORTANT]

src/aignostics/application/CLAUDE.md

Lines changed: 51 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -48,6 +48,57 @@ Core application operations:
4848

4949
## Architecture & Design Patterns
5050

51+
### Health Check Gates
52+
53+
The application module enforces system health checks before critical operations to prevent users from uploading data or submitting runs when the platform is unavailable.
54+
55+
**CLI Health Check Enforcement (`_cli.py`):**
56+
57+
The `_abort_if_system_unhealthy()` function is called before upload and submit operations unless `--force` is passed:
58+
59+
```python
60+
def _abort_if_system_unhealthy() -> None:
61+
    """Check system health and abort if unhealthy."""
62+
    health = SystemService.health_static()
63+
    if not health:
64+
        logger.error(f"Platform is not healthy: {health.reason}. Aborting.")
65+
        console.print(f"[error]Error:[/error] Platform is not healthy: {health.reason}. Aborting.")
66+
        sys.exit(1)
67+
```
68+
69+
**Commands with Health Check Gates:**
70+
71+
| Command | Health Check | Override |
72+
|---------|--------------|----------|
73+
| `run execute` | Yes | `--force` |
74+
| `run upload` | Yes | `--force` |
75+
| `run submit` | Yes | `--force` |
76+
| `run prepare` | No | N/A |
77+
| `run list` | No | N/A |
78+
| `run describe` | No | N/A |
79+
| `run result download` | No | N/A |
80+
81+
**GUI Health Check Enforcement (`_gui/_page_application_describe.py`):**
82+
83+
The stepper workflow checks health at the application version selection step:
84+
85+
```python
86+
# Check system health before allowing progression
87+
system_healthy = bool(SystemService.health_static())
88+
89+
if not system_healthy:
90+
    version_next_button.disable()
91+
    ui.tooltip("System is unhealthy, you cannot prepare a run at this time.")
92+
93+
# Internal users (Aignostics, pre-alpha-org, LMU, Charite) can force-skip
94+
if is_internal_user:
95+
    ui.checkbox("Force (skip health check)", on_change=on_force_change)
96+
```
97+
98+
**Force Option:**
99+
100+
The `submit_form.force` attribute tracks whether the user has opted to skip health checks. This is only available to internal organization users.
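
The gate described above can be summarised as a small decision function. This is a sketch under stated assumptions, not the module's code: `GateState` and `evaluate_gate` are hypothetical names, and it assumes that ticking the force checkbox re-enables the "Next" button, which the snippet above does not show.

```python
from dataclasses import dataclass
from typing import Optional

UNHEALTHY_TOOLTIP = "System is unhealthy, you cannot prepare a run at this time."


@dataclass(frozen=True)
class GateState:
    """What the version selection step should present to the user."""

    next_enabled: bool
    show_force_checkbox: bool
    tooltip: Optional[str]


def evaluate_gate(system_healthy: bool, is_internal_user: bool, force: bool) -> GateState:
    """Combine health, user type, and the force flag into one UI state."""
    # Only internal users can bypass the check (assumed to re-enable "Next").
    forced = is_internal_user and force
    blocked = not system_healthy and not forced
    return GateState(
        next_enabled=not blocked,
        show_force_checkbox=is_internal_user,
        tooltip=UNHEALTHY_TOOLTIP if blocked else None,
    )
```

Keeping the decision in a pure function like this makes the four combinations easy to unit-test without starting the GUI.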
101+
51102
### Module Structure (NEW in v1.0.0-beta.7)
52103

53104
The application module is organized into focused submodules:

src/aignostics/gui/CLAUDE.md

Lines changed: 33 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -35,9 +35,39 @@ The gui module provides common GUI framework components and theming for the Aign
3535

3636
**Health Monitoring:**
3737

38-
- `HEALTH_UPDATE_INTERVAL` - Configurable health check frequency
39-
- Real-time service status display in UI
40-
- Centralized health aggregation and reporting
38+
- `HEALTH_UPDATE_INTERVAL` - Configurable health check frequency (default: 30 seconds)
39+
- `USERINFO_UPDATE_INTERVAL` - User info refresh interval (default: 60 minutes)
40+
- Real-time service status display in UI footer
41+
- Centralized health aggregation and reporting via `SystemService.health_static()`
42+
43+
**Health Check Enforcement:**
44+
45+
The GUI enforces health checks before allowing critical operations:
46+
47+
- **Footer Health Indicator**: Shows "Launchpad is healthy" (green) or "Launchpad is unhealthy" (red)
48+
- **Application Run Submission**: The "Next" button in the application workflow stepper is disabled when unhealthy
49+
- **Tooltip Feedback**: Users see "System is unhealthy, you cannot prepare a run at this time."
50+
- **Force Override**: Internal users (Aignostics, pre-alpha-org, LMU, Charite organizations) can enable a "Force (skip health check)" checkbox
51+
52+
**Health State Management (`_frame.py`):**
53+
54+
```python
55+
launchpad_healthy: bool | None = None # None = loading, True = healthy, False = unhealthy
56+
57+
async def _health_load_and_render() -> None:
58+
    nonlocal launchpad_healthy
59+
    with contextlib.suppress(Exception):
60+
        launchpad_healthy = bool(await run.cpu_bound(SystemService.health_static))
61+
    health_icon.refresh()
62+
    health_link.refresh()
63+
64+
ui.timer(interval=HEALTH_UPDATE_INTERVAL, callback=_update_health, immediate=True)
65+
```
66+
67+
**Health Display Components:**
68+
69+
- `health_icon()` - Settings menu icon (green check or red error)
70+
- `health_link()` - Footer link with status text and icon
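
The tri-state `launchpad_healthy` flag maps naturally onto the footer text. The sketch below is illustrative only: `footer_status` is a hypothetical helper, the healthy and unhealthy labels and colours come from the list above, and the loading label and grey colour are assumptions.

```python
from typing import Optional


def footer_status(launchpad_healthy: Optional[bool]) -> tuple[str, str]:
    """Map the tri-state health flag to a (label, colour) pair."""
    if launchpad_healthy is None:
        # Health has not been loaded yet; this placeholder text is an assumption.
        return ("Checking health...", "grey")
    if launchpad_healthy:
        return ("Launchpad is healthy", "green")
    return ("Launchpad is unhealthy", "red")
```

Testing `is None` before truthiness matters here, because `None` and `False` are both falsy but mean different things.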
4171

4272
**Error Handling:**
4373

src/aignostics/system/CLAUDE.md

Lines changed: 54 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -41,6 +41,60 @@ Core system operations and diagnostics:
4141

4242
## Architecture & Design Patterns
4343

44+
### Health Check Enforcement
45+
46+
The system module's health checks are used by other modules to gate critical operations. This ensures users don't submit runs or upload data when the platform is unavailable.
47+
48+
**Enforcement by Interface:**
49+
50+
| Interface | Behavior When Unhealthy | Override Mechanism |
51+
|-----------|------------------------|-------------------|
52+
| **Launchpad (GUI)** | "Next" button disabled, tooltip explains issue | Internal users only: "Force" checkbox |
53+
| **CLI** | Operation aborted with error message (exit code 1) | `--force` flag on `run execute`, `run upload`, and `run submit` |
54+
| **Python Library** | No automatic enforcement | User implements own checks |
55+
56+
**GUI Enforcement (in `application/_gui/_page_application_describe.py`):**
57+
58+
```python
59+
# Check system health and determine if force option should be available
60+
system_healthy = bool(SystemService.health_static())
61+
62+
# Disable the "Next" button if unhealthy
63+
if not system_healthy:
64+
    version_next_button.disable()
65+
    ui.tooltip("System is unhealthy, you cannot prepare a run at this time.")
66+
67+
# Internal users can force-skip health checks
68+
if is_internal_user:
69+
    ui.checkbox("Force (skip health check)", on_change=on_force_change)
70+
```
71+
72+
**CLI Enforcement (in `application/_cli.py`):**
73+
74+
```python
75+
def _abort_if_system_unhealthy() -> None:
76+
    health = SystemService.health_static()
77+
    if not health:
78+
        logger.error(f"Platform is not healthy: {health.reason}. Aborting.")
79+
        console.print(f"[error]Error:[/error] Platform is not healthy: {health.reason}. Aborting.")
80+
        sys.exit(1)
81+
82+
# Called before upload and submit operations unless --force is used
83+
if not force:
84+
    _abort_if_system_unhealthy()
85+
```
86+
87+
**Python Library Usage:**
88+
89+
```python
90+
from aignostics.system import Service as SystemService
91+
92+
# Manual health check before operations
93+
health = SystemService().health()
94+
if not health:
95+
    raise RuntimeError(f"System unhealthy: {health.reason}")
96+
```
97+
4498
### Health Check Aggregation Pattern
4599

46100
The system module's health check aggregates status from **ALL modules** in the SDK by discovering and querying every service that inherits from `BaseService`:
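
One way subclass discovery of this kind could look is sketched below. It is a self-contained illustration and does not reproduce the SDK's code: the `Health` dataclass, `all_subclasses`, and `aggregate_health` are hypothetical, and `BaseService` here is a stand-in for the real class.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Health:
    """Stand-in health result: truthy when healthy, with an optional reason."""

    healthy: bool
    reason: Optional[str] = None

    def __bool__(self) -> bool:
        return self.healthy


class BaseService:
    """Stand-in base class; services override `health()`."""

    def health(self) -> Health:
        return Health(healthy=True)


def all_subclasses(cls: type) -> list[type]:
    """Collect direct and indirect subclasses of `cls`."""
    found: list[type] = []
    for sub in cls.__subclasses__():
        found.append(sub)
        found.extend(all_subclasses(sub))
    return found


def aggregate_health(base: type = BaseService) -> Health:
    """Query every discovered service; unhealthy if any service is unhealthy."""
    failures = []
    for service_cls in all_subclasses(base):
        result = service_cls().health()
        if not result:
            failures.append(f"{service_cls.__name__}: {result.reason}")
    if failures:
        return Health(healthy=False, reason="; ".join(failures))
    return Health(healthy=True)
```

Because the aggregate carries each failing service's name in `reason`, a message like the CLI's "Platform is not healthy: <reason>" can point at the failing component.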
