debugging#
- class fkat.pytorch.callbacks.debugging.Introspection(checksums: Optional[set[str]] = None, tensor_stats: Optional[set[str]] = None, env_vars: bool = False, pip_freeze: bool = False, output_path_prefix: Optional[str] = None, schedule: Optional[Schedule] = None)[source]#
- on_before_optimizer_step(trainer: Trainer, pl_module: LightningModule, optimizer: Optimizer) None[source]#
Called before `optimizer.step()`.
- on_train_batch_end(trainer: L.Trainer, pl_module: L.LightningModule, outputs: STEP_OUTPUT, batch: Any, batch_idx: int) None[source]#
Called when the train batch ends.
Note
The value `outputs["loss"]` here will be the normalized value w.r.t. `accumulate_grad_batches` of the loss returned from `training_step` (e.g., with `accumulate_grad_batches=4`, it is the returned loss divided by 4).
- on_train_batch_start(trainer: Trainer, pl_module: LightningModule, batch: Any, batch_idx: int) None[source]#
Called when the train batch begins.
- on_train_end(trainer: Trainer, pl_module: LightningModule) None[source]#
Remove hooks at the end of training. See the usage sketch below.
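A minimal usage sketch for `Introspection`, attached to a Lightning `Trainer` in the usual way. The constructor parameters come from the signature above, but the concrete string values accepted by `tensor_stats`, the output path, and the `MyLightningModule` class are hypothetical; the `Schedule` argument is left at its default since its import path is not shown in this reference.

```python
import lightning as L

from fkat.pytorch.callbacks.debugging import Introspection

# Capture environment variables and installed package versions alongside
# tensor statistics during training. The stat names and output path below
# are hypothetical placeholders, not documented values.
introspection = Introspection(
    tensor_stats={"mean", "std"},             # hypothetical stat names
    env_vars=True,                            # capture environment variables
    pip_freeze=True,                          # capture installed packages (pip freeze)
    output_path_prefix="/tmp/introspection",  # hypothetical output location
)

trainer = L.Trainer(max_epochs=1, callbacks=[introspection])
trainer.fit(MyLightningModule())  # MyLightningModule defined elsewhere
```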
- class fkat.pytorch.callbacks.debugging.OptimizerSnapshot(output_path_prefix: str, schedule: Optional[Schedule] = None)[source]#
Callback that saves optimizer state at specified intervals during training.
This callback allows you to capture the state of optimizers at specific points during training, which can be useful for debugging, analysis, or resuming training from specific optimization states.
- Parameters:
output_path_prefix (str) – Output path prefix for generated optimizer snapshots.
schedule (Optional[Schedule]) – Schedule at which to take a snapshot of optimizers. Defaults to `Never`.
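A minimal usage sketch for `OptimizerSnapshot`, assuming a `LightningModule` named `MyLightningModule` is defined elsewhere. Because the `Schedule` type's import path is not shown in this reference, the `schedule` argument is left at its default (`Never`), and the output path is a hypothetical placeholder.

```python
import lightning as L

from fkat.pytorch.callbacks.debugging import OptimizerSnapshot

# Persist optimizer state under the given prefix. With the default
# schedule (Never), no snapshots are taken; pass a Schedule to choose
# the points in training at which state is saved.
snapshot = OptimizerSnapshot(output_path_prefix="/tmp/optim_snapshots")

trainer = L.Trainer(max_epochs=1, callbacks=[snapshot])
trainer.fit(MyLightningModule())  # MyLightningModule defined elsewhere
```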