io.github.kirstenali.deepj.training.Trainer

public final class Trainer extends Object

A small, reusable training loop wrapper.

This library supports different model/data shapes (e.g. supervised Tensor->Tensor models, and causal language models that operate on token ids). Rather than duplicating full trainers, Trainer delegates a single training step to a pluggable Trainer.StepFunction.
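The delegation described above can be sketched in isolation. This page does not show the signature of Trainer.StepFunction, so the `MiniTrainer` class, its nested `StepFunction` interface, and the `step(int batchSize)` method below are hypothetical stand-ins, not the library's API. The sketch also assumes that a step returns the batch loss as a double, which is what the `double` return type of trainStep suggests.

```java
/** Hypothetical stand-in for Trainer; all names and signatures here are assumptions. */
final class MiniTrainer {
    /** Assumed shape of a step function: run one optimisation step, return its loss. */
    @FunctionalInterface
    interface StepFunction {
        double step(int batchSize);
    }

    private final StepFunction stepFn;

    MiniTrainer(StepFunction stepFn) {
        this.stepFn = java.util.Objects.requireNonNull(stepFn, "stepFn");
    }

    /** The wrapper knows nothing about the model or data; it only forwards the call. */
    double trainStep(int batchSize) {
        return stepFn.step(batchSize);
    }
}

final class DelegationDemo {
    /** Toy "model": a single weight fitted to the target 3.0 by gradient descent. */
    static double[] run(int steps) {
        double[] w = {0.0};
        MiniTrainer trainer = new MiniTrainer(batchSize -> {
            double err = w[0] - 3.0;
            double loss = err * err;      // squared error before the update
            w[0] -= 0.1 * 2.0 * err;      // gradient step with learning rate 0.1
            return loss;
        });
        double last = Double.NaN;
        for (int i = 0; i < steps; i++) {
            last = trainer.trainStep(1);
        }
        return new double[] {w[0], last};
    }

    public static void main(String[] args) {
        double[] r = run(50);
        System.out.println("w=" + r[0] + " lastLoss=" + r[1]);
    }
}
```

A token-id language model would supply a different lambda that samples token batches, while the wrapper itself stays unchanged. That is the point of keeping the step pluggable.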

Nested Class Summary

Nested Classes

Modifier and Type

Class

Description

static interface

Trainer.StepFunction

static interface

Trainer.StepHook
Constructor Summary

Constructors

Constructor

Description

Trainer(Trainer.StepFunction stepFn)
Method Summary

Modifier and Type

Method

Description

TrainingResult

train(int maxSteps, int batchSize, int logEvery, double emaBeta, Double targetEmaLoss)

Train until maxSteps or until EMA loss goes below targetEmaLoss (if provided).

TrainingResult

train(int maxSteps, int batchSize, int logEvery, double emaBeta, Double targetEmaLoss, int releaseEverySteps)

TrainingResult

train(int maxSteps, int batchSize, int logEvery, double emaBeta, Double targetEmaLoss, int releaseEverySteps, Trainer.StepHook stepHook)

Train until maxSteps or until EMA loss goes below targetEmaLoss (if provided).

TrainingResult

train(int maxSteps, int batchSize, int logEvery, double emaBeta, Double targetEmaLoss, Trainer.StepHook stepHook)

Train until maxSteps or until EMA loss goes below targetEmaLoss (if provided).

double

trainStep(int batchSize)

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- Trainer
  
  public Trainer(Trainer.StepFunction stepFn)
Method Details
- trainStep
  
  public double trainStep(int batchSize)
- train
  
  public TrainingResult train(int maxSteps, int batchSize, int logEvery, double emaBeta, Double targetEmaLoss)
  
  Train until maxSteps or until EMA loss goes below targetEmaLoss (if provided). Uses the default periodic backend release cadence.
- train
  
  public TrainingResult train(int maxSteps, int batchSize, int logEvery, double emaBeta, Double targetEmaLoss, int releaseEverySteps)
- train
  
  public TrainingResult train(int maxSteps, int batchSize, int logEvery, double emaBeta, Double targetEmaLoss, Trainer.StepHook stepHook)
  
  Train until maxSteps or until EMA loss goes below targetEmaLoss (if provided). Uses the default periodic backend release cadence.
- train
  
  public TrainingResult train(int maxSteps, int batchSize, int logEvery, double emaBeta, Double targetEmaLoss, int releaseEverySteps, Trainer.StepHook stepHook)
  
  Train until maxSteps or until EMA loss goes below targetEmaLoss (if provided). releaseEverySteps <= 0 disables periodic release, but final release still runs.
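The stopping rule and release cadence described for the train overloads can be illustrated with a small standalone loop. This is a sketch under stated assumptions, not the library's implementation. The class `EmaLoopSketch` and its `Outcome` type are hypothetical. The EMA is assumed to be seeded with the first loss and then updated as `emaBeta * ema + (1 - emaBeta) * loss`. Releases are only counted here, since the backend release call is not documented on this page.

```java
import java.util.function.IntToDoubleFunction;

/** Hypothetical sketch of an EMA-based stopping loop; not the library's code. */
final class EmaLoopSketch {
    /** What the sketch loop did: steps run, final EMA loss, release counts. */
    static final class Outcome {
        final int steps;
        final double emaLoss;
        final int periodicReleases;
        final int finalReleases;

        Outcome(int steps, double emaLoss, int periodicReleases, int finalReleases) {
            this.steps = steps;
            this.emaLoss = emaLoss;
            this.periodicReleases = periodicReleases;
            this.finalReleases = finalReleases;
        }
    }

    static Outcome run(IntToDoubleFunction lossAtStep, int maxSteps, double emaBeta,
                       Double targetEmaLoss, int releaseEverySteps) {
        double ema = Double.NaN;
        int periodic = 0;
        int steps = 0;
        for (int step = 1; step <= maxSteps; step++) {
            double loss = lossAtStep.applyAsDouble(step);
            // Assumed EMA form: seed with the first loss, then blend with emaBeta.
            ema = Double.isNaN(ema) ? loss : emaBeta * ema + (1.0 - emaBeta) * loss;
            steps = step;
            // releaseEverySteps <= 0 disables the periodic release.
            if (releaseEverySteps > 0 && step % releaseEverySteps == 0) {
                periodic++;
            }
            // A null target means "no early stopping": run to maxSteps.
            if (targetEmaLoss != null && ema < targetEmaLoss) {
                break;
            }
        }
        // The final release runs regardless of the periodic setting.
        return new Outcome(steps, ema, periodic, 1);
    }
}
```

For example, with a loss of `1.0 / step`, `emaBeta = 0.5`, and `targetEmaLoss = 0.3`, this sketch stops at step 5, when the EMA first drops below the target. A larger `emaBeta` smooths more and therefore reacts more slowly to a falling loss.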
