Revision as of 15:11, 3 February 2026
Tapeworm: Managing your archive data and tape moves
Warning:
- This documentation page is under construction and may contain errors.
- The Tapeworm application is in beta and may contain errors.
Tapeworm helps you manage data on /archive by identifying datasets that are no longer actively used and preparing them for tape archival.
The goal is simple: keep fast storage available for active work, while safely preserving older data on tape.
With Tapeworm, you can:
- See which of your datasets are being considered for tape archival.
- Review planned moves before they happen.
- Approve, snooze, or block moves when needed.
- Add metadata to help describe archived datasets.
If you do nothing, Tapeworm will continue with the planned move after the review period. That is why we recommend checking your pending actions regularly.
How Tapeworm works (in plain language)
- Tapeworm scans /archive and builds an index of datasets, size, owner, and activity.
- A policy engine checks which datasets look stale (for example: old + large).
- Matching datasets are marked as planned and shown in your overview.
- You can review and change what should happen.
- If no action is taken, planned moves can become scheduled and then executed.
- Data is moved to tape staging, then processed by the tape backend.
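The "old + large" check in the steps above can be illustrated with a minimal sketch. This is not Tapeworm's actual policy engine: the `Dataset` fields, the function names, and both thresholds (365 days, 100 GiB) are assumptions chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Dataset:
    # Hypothetical index record; field names are assumptions.
    path: str
    owner: str
    size_bytes: int
    last_activity: datetime


def is_stale(ds, now, min_age=timedelta(days=365), min_size_bytes=100 * 1024**3):
    """Return True if the dataset is both old and large (illustrative thresholds)."""
    old_enough = (now - ds.last_activity) >= min_age
    large_enough = ds.size_bytes >= min_size_bytes
    return old_enough and large_enough


def plan_candidates(datasets, now):
    """Return the paths of datasets that would be marked as planned."""
    return [ds.path for ds in datasets if is_stale(ds, now)]
```

A dataset that is old but small, or large but recently used, is not selected under this sketch; the real policy may weigh these factors differently.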
Who sees what?
- Regular users see only their own datasets and actions.
- Group admins see data for their configured group(s).
- System admins can see and manage everything.
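As a rough sketch of these visibility rules, the filter below returns the datasets a given user would see. The role names, the dictionary keys (`owner`, `group`), and the function itself are hypothetical; how Tapeworm actually stores roles and groups is not documented here.

```python
def visible_datasets(datasets, user, role, admin_groups=frozenset()):
    """Filter dataset records by role (illustrative, not Tapeworm's real code).

    datasets: iterable of dicts with assumed keys "owner" and "group".
    role: one of "user", "group_admin", "system_admin" (assumed names).
    admin_groups: groups a group admin is configured for.
    """
    if role == "system_admin":
        # System admins see everything.
        return list(datasets)
    if role == "group_admin":
        # Group admins see data for their configured group(s).
        return [d for d in datasets if d["group"] in admin_groups]
    # Regular users see only their own datasets.
    return [d for d in datasets if d["owner"] == user]
```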
User pages
1) Overview
This is your action page. It shows items that currently need your decision.
[SCREENSHOT: User Overview page with pending actions table]
For each candidate, you can:
- Approve: proceed with move scheduling.
- Snooze: postpone the decision to a future date.
- Deny/Block: stop this move path.
- Edit metadata: add key/value notes for archived data.
You can also select multiple rows and apply actions in bulk.
2) Datasets
This page shows your discovered datasets, their sizes, and activity times.
[SCREENSHOT: User Datasets page]
Important:
- If a dataset already has an active move candidate, scheduling controls may be disabled.
- The dataset list is informational; move decisions are handled through candidates/schedule.
3) Schedule
This page shows move candidates and their status over time.
[SCREENSHOT: User Schedule page with status colors]
Common statuses:
- Planned (or planned + notified): under review.
- Scheduled: move is set for a specific date.
- Executing / Tape staged / On tape: move is in progress or completed.
- Error: move needs admin attention.
Once a move is already executing or completed, schedule-changing actions are locked.
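The locking rule can be pictured as a simple status check. The sketch below is an assumption about how such a check might look: the `Status` enum and its values mirror the status names on this page, and the locked set follows the FAQ entry on disabled buttons (executing, staged, on tape, error). It is not taken from Tapeworm's source.

```python
from enum import Enum


class Status(Enum):
    # Names mirror the statuses listed on this page; values are assumptions.
    PLANNED = "planned"
    SCHEDULED = "scheduled"
    EXECUTING = "executing"
    TAPE_STAGED = "tape_staged"
    ON_TAPE = "on_tape"
    ERROR = "error"


# Statuses in which schedule-changing actions are no longer valid.
LOCKED_STATUSES = frozenset(
    {Status.EXECUTING, Status.TAPE_STAGED, Status.ON_TAPE, Status.ERROR}
)


def can_edit_schedule(status):
    """Return True while approve/snooze/deny actions are still allowed."""
    return status not in LOCKED_STATUSES
```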
4) Overrides
Overrides tell Tapeworm to ignore specific paths in future planning.
[SCREENSHOT: User Overrides page]
Use overrides when:
- a project is active again,
- a path should stay on disk for operational reasons,
- policy suggestions are not appropriate for that location.
Overrides apply to the selected path and everything below it.
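To make "the selected path and everything below it" concrete, here is a minimal sketch of prefix matching on path components. The function name and the example paths are hypothetical; the point is that matching works on whole directory names, so an override on /archive/projA does not cover /archive/projAB.

```python
from pathlib import PurePosixPath


def is_overridden(path, overrides):
    """Return True if path equals an override path or lies below one."""
    p = PurePosixPath(path)
    for override in overrides:
        o = PurePosixPath(override)
        # p.parents holds every ancestor directory of p.
        if p == o or o in p.parents:
            return True
    return False
```

For example, with an override on /archive/projA, the paths /archive/projA and /archive/projA/run1 are both ignored in planning, while /archive/projAB/run1 is not.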
Notifications (email)
Tapeworm sends email updates when actions are pending or dates are approaching.
[SCREENSHOT: Example Tapeworm notification email]
Emails typically include:
- dataset path,
- size and last activity,
- current status,
- review/scheduled date.
Please read these emails carefully: they are your chance to adjust decisions before execution.
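For orientation, the fields listed above could be laid out in a message body along these lines. The function, the labels, and the layout are purely illustrative assumptions; the actual Tapeworm email format may differ.

```python
from datetime import date


def format_notification(path, size_bytes, last_activity, status, action_date):
    """Build a plain-text summary of one candidate (hypothetical layout)."""
    size_gib = size_bytes / 1024**3
    lines = [
        f"Dataset: {path}",
        f"Size: {size_gib:.1f} GiB",
        f"Last activity: {last_activity.isoformat()}",
        f"Status: {status}",
        f"Review/scheduled date: {action_date.isoformat()}",
    ]
    return "\n".join(lines)
```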
Best practices for users
- Check your Overview page regularly.
- Use Snooze if you need time to validate impact.
- Add metadata when approving important datasets.
- Use Overrides for known exceptions.
- If unsure, contact HPC support before a scheduled move date.
FAQ
What happens if I do nothing?
Planned items can move forward automatically after the review window.
Can I undo after tape staging?
Not directly in Tapeworm. Retrieval is done via the tape/iRODS workflow.
Why is an action button disabled?
Usually because the move has already progressed (executing/staged/on tape/error), so schedule edits are no longer valid.
Why do I see “planned + notified”?
That means the candidate is planned and a notification has already been sent.
Need help?
If anything is unclear, or you think a move is incorrect, please open an HPC support ticket. Include the dataset path and (if available) the candidate status shown in Tapeworm.