Class ComputeGraph
java.lang.Object
io.github.kirstenali.deepj.tensor.ComputeGraph
Collects GPU operations lazily and flushes them as a single command buffer.
Operations are recorded as a flat int[] command stream. Buffer IDs reference
persistent GPU-side buffers managed by a GpuRuntime. Data stays GPU-resident
between ops -- only uploaded at graph entry and downloaded on materialization.
This class is backend-agnostic: Metal, CUDA, Vulkan, etc. are all supported
by supplying the appropriate GpuRuntime implementation.
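The flat int[] command stream can be pictured with a small sketch. Everything here (the class name, the cursor-based recording helper, and the idea of packing floats as raw bits) is illustrative, not the library's actual internals; the packet layout [opCode, aId, bId, outId, n] is taken from the method documentation below.

```java
class CommandStreamSketch {
    // Append a binary-op packet [opCode, aId, bId, outId, n] at pos;
    // returns the next free slot in the stream.
    static int recordBinary(int[] stream, int pos, int opCode,
                            int aId, int bId, int outId, int n) {
        stream[pos]     = opCode;
        stream[pos + 1] = aId;    // buffer IDs reference persistent
        stream[pos + 2] = bId;    // GPU-side buffers, not CPU data
        stream[pos + 3] = outId;
        stream[pos + 4] = n;
        return pos + 5;
    }

    // Float parameters (scalars, learning rates) travel in the same
    // int stream as raw IEEE-754 bit patterns.
    static int floatBits(float f)       { return Float.floatToIntBits(f); }
    static float bitsToFloat(int bits)  { return Float.intBitsToFloat(bits); }
}
```

Because only integer IDs and bit patterns are recorded, recording an op is cheap and no tensor data moves until the stream is flushed.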
-
Field Summary
Fields
static final int OP_ADD
static final int OP_SUBTRACT
static final int OP_MULTIPLY
static final int OP_DIVIDE
static final int OP_MATMUL
static final int OP_MULTIPLY_SCALAR
static final int OP_SQRT
static final int OP_NEG
static final int OP_EXP
static final int OP_LOG
static final int OP_TANH
static final int OP_SIGMOID
static final int OP_RELU
static final int OP_RELU_BACKWARD
static final int OP_GELU
static final int OP_GELU_BACKWARD
static final int OP_SOFTMAX_ROWS
static final int OP_SOFTMAX_BACKWARD
static final int OP_LAYERNORM_BACKWARD
static final int OP_ADAMW_UPDATE
-
Constructor Summary
Constructors
ComputeGraph(GpuRuntime runtime)
    Create a ComputeGraph backed by the given GPU runtime.
-
Method Summary
Methods
createOutputTensor
    Create a Tensor backed by a GpuBuffer.
ensureGpuBuffer
    Ensure a tensor has a GpuBuffer.
void flush()
    Flush all recorded ops to the GPU as a single command buffer.
boolean isEmpty()
void materialize
    Materialize a tensor: flush pending ops if needed, then download GPU data to CPU.
newOutputBuffer(int rows, int cols)
    Allocate a new output buffer (result of a GPU op).
void recordAdamWUpdate(GpuBuffer w, GpuBuffer g, GpuBuffer mt, GpuBuffer vt, float lr, float beta1, float beta2, float eps, float weightDecay, float bc1, float bc2, int n)
    Record in-place AdamW update: [OP_ADAMW_UPDATE, wId, gId, mtId, vtId, lrBits, beta1Bits, beta2Bits, epsBits, weightDecayBits, bc1Bits, bc2Bits, n]
void recordBinary(int opCode, GpuBuffer a, GpuBuffer b, GpuBuffer out)
    Record a binary element-wise op: [opCode, aId, bId, outId, n]
void recordLayerNormBackward(GpuBuffer dXHat, GpuBuffer xHat, GpuBuffer std, GpuBuffer out, int rows, int cols)
    Record layer norm backward: [OP_LAYERNORM_BACKWARD, dXHatId, xHatId, stdId, outId, rows, cols]
void recordMatmul(GpuBuffer a, GpuBuffer b, GpuBuffer out, int m, int n, int k)
    Record matmul: [OP_MATMUL, aId, bId, outId, m, n, k]
void recordMultiplyScalar(GpuBuffer in, GpuBuffer out, float scalar)
    Record scalar multiply: [OP_MULTIPLY_SCALAR, inId, outId, scalarBits, n]
void recordSoftmaxBackward(GpuBuffer gradOutput, GpuBuffer softmaxOut, GpuBuffer out, int rows, int cols)
    Record softmax backward: [OP_SOFTMAX_BACKWARD, gradId, softmaxId, outId, rows, cols]
void recordSoftmaxRows(GpuBuffer in, GpuBuffer out, int rows, int cols)
    Record softmax rows: [OP_SOFTMAX_ROWS, inId, outId, rows, cols]
void recordUnary(int opCode, GpuBuffer in, GpuBuffer out)
    Record a unary op: [opCode, inId, outId, n]
void releaseAll()
    Release all GPU buffers and reset the graph completely.
-
Field Details
-
OP_ADD
public static final int OP_ADD
-
OP_SUBTRACT
public static final int OP_SUBTRACT
-
OP_MULTIPLY
public static final int OP_MULTIPLY
-
OP_DIVIDE
public static final int OP_DIVIDE
-
OP_MATMUL
public static final int OP_MATMUL
-
OP_MULTIPLY_SCALAR
public static final int OP_MULTIPLY_SCALAR
-
OP_SQRT
public static final int OP_SQRT
-
OP_NEG
public static final int OP_NEG
-
OP_EXP
public static final int OP_EXP
-
OP_LOG
public static final int OP_LOG
-
OP_TANH
public static final int OP_TANH
-
OP_SIGMOID
public static final int OP_SIGMOID
-
OP_RELU
public static final int OP_RELU
-
OP_RELU_BACKWARD
public static final int OP_RELU_BACKWARD
-
OP_GELU
public static final int OP_GELU
-
OP_GELU_BACKWARD
public static final int OP_GELU_BACKWARD
-
OP_SOFTMAX_ROWS
public static final int OP_SOFTMAX_ROWS
-
OP_SOFTMAX_BACKWARD
public static final int OP_SOFTMAX_BACKWARD
-
OP_LAYERNORM_BACKWARD
public static final int OP_LAYERNORM_BACKWARD
-
OP_ADAMW_UPDATE
public static final int OP_ADAMW_UPDATE
-
-
Constructor Details
-
ComputeGraph
public ComputeGraph(GpuRuntime runtime)
Create a ComputeGraph backed by the given GPU runtime.
Parameters:
runtime - the native driver abstraction (Metal, CUDA, etc.)
-
-
Method Details
-
ensureGpuBuffer
Ensure a tensor has a GpuBuffer.
-
newOutputBuffer
newOutputBuffer(int rows, int cols)
Allocate a new output buffer (result of a GPU op). Not yet allocated on the native side.
-
createOutputTensor
Create a Tensor backed by a GpuBuffer.
-
recordBinary
public void recordBinary(int opCode, GpuBuffer a, GpuBuffer b, GpuBuffer out)
Record a binary element-wise op: [opCode, aId, bId, outId, n]
-
recordMatmul
public void recordMatmul(GpuBuffer a, GpuBuffer b, GpuBuffer out, int m, int n, int k)
Record matmul: [OP_MATMUL, aId, bId, outId, m, n, k]
-
recordUnary
public void recordUnary(int opCode, GpuBuffer in, GpuBuffer out)
Record a unary op: [opCode, inId, outId, n]
-
recordMultiplyScalar
public void recordMultiplyScalar(GpuBuffer in, GpuBuffer out, float scalar)
Record scalar multiply: [OP_MULTIPLY_SCALAR, inId, outId, scalarBits, n]
-
recordSoftmaxRows
public void recordSoftmaxRows(GpuBuffer in, GpuBuffer out, int rows, int cols)
Record softmax rows: [OP_SOFTMAX_ROWS, inId, outId, rows, cols]
-
recordSoftmaxBackward
public void recordSoftmaxBackward(GpuBuffer gradOutput, GpuBuffer softmaxOut, GpuBuffer out, int rows, int cols)
Record softmax backward: [OP_SOFTMAX_BACKWARD, gradId, softmaxId, outId, rows, cols]
-
recordLayerNormBackward
public void recordLayerNormBackward(GpuBuffer dXHat, GpuBuffer xHat, GpuBuffer std, GpuBuffer out, int rows, int cols)
Record layer norm backward: [OP_LAYERNORM_BACKWARD, dXHatId, xHatId, stdId, outId, rows, cols]
-
recordAdamWUpdate
public void recordAdamWUpdate(GpuBuffer w, GpuBuffer g, GpuBuffer mt, GpuBuffer vt, float lr, float beta1, float beta2, float eps, float weightDecay, float bc1, float bc2, int n)
Record in-place AdamW update: [OP_ADAMW_UPDATE, wId, gId, mtId, vtId, lrBits, beta1Bits, beta2Bits, epsBits, weightDecayBits, bc1Bits, bc2Bits, n]
-
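A CPU sketch of the update this packet encodes may help. The exact kernel semantics are an assumption: this follows the standard AdamW formula with decoupled weight decay, and bc1/bc2 are assumed to be the precomputed bias-correction terms 1 - beta1^t and 1 - beta2^t.

```java
class AdamWSketch {
    // In-place AdamW step over n elements of w, mirroring the recorded
    // parameter list (assumed semantics, not the actual GPU kernel).
    static void step(float[] w, float[] g, float[] mt, float[] vt,
                     float lr, float beta1, float beta2, float eps,
                     float weightDecay, float bc1, float bc2, int n) {
        for (int i = 0; i < n; i++) {
            // Exponential moving averages of gradient and squared gradient.
            mt[i] = beta1 * mt[i] + (1 - beta1) * g[i];
            vt[i] = beta2 * vt[i] + (1 - beta2) * g[i] * g[i];
            // Bias correction via the precomputed terms.
            float mHat = mt[i] / bc1;
            float vHat = vt[i] / bc2;
            // Decoupled weight decay: applied to w directly, not via g.
            w[i] -= lr * (mHat / ((float) Math.sqrt(vHat) + eps)
                          + weightDecay * w[i]);
        }
    }
}
```

Passing bc1/bc2 in rather than the step count keeps the kernel free of per-step exponentiation.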
isEmpty
public boolean isEmpty()
-
flush
public void flush()
Flush all recorded ops to the GPU as a single command buffer. After flush, GPU buffers hold computed results; CPU data is stale.
-
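Conceptually, flushing means walking the recorded stream and dispatching one GPU command per packet. The CPU interpreter below only illustrates that decoding loop; the opcode values, the float[][] standing in for GPU buffers, and the two packet layouts shown are assumptions, and a real GpuRuntime would submit the whole stream as a single command buffer rather than looping on the CPU.

```java
class FlushSketch {
    static final int OP_ADD = 0, OP_MULTIPLY_SCALAR = 1; // hypothetical values

    // buffers[id] plays the role of the persistent GPU-side buffer with that id.
    static void flush(int[] stream, float[][] buffers) {
        int pc = 0;
        while (pc < stream.length) {
            switch (stream[pc]) {
                case OP_ADD: { // [OP_ADD, aId, bId, outId, n]
                    float[] a = buffers[stream[pc + 1]];
                    float[] b = buffers[stream[pc + 2]];
                    float[] out = buffers[stream[pc + 3]];
                    int n = stream[pc + 4];
                    for (int i = 0; i < n; i++) out[i] = a[i] + b[i];
                    pc += 5;
                    break;
                }
                case OP_MULTIPLY_SCALAR: { // [OP_MULTIPLY_SCALAR, inId, outId, scalarBits, n]
                    float[] in = buffers[stream[pc + 1]];
                    float[] out = buffers[stream[pc + 2]];
                    float s = Float.intBitsToFloat(stream[pc + 3]); // scalar from raw bits
                    int n = stream[pc + 4];
                    for (int i = 0; i < n; i++) out[i] = in[i] * s;
                    pc += 5;
                    break;
                }
                default:
                    throw new IllegalStateException("unknown opcode " + stream[pc]);
            }
        }
    }
}
```

Note how intermediate results stay in the buffer table between packets, matching the class's promise that data stays GPU-resident between ops.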
materialize
Materialize a tensor: flush pending ops if needed, then download GPU data to CPU. -
releaseAll
public void releaseAll()
Release all GPU buffers and reset the graph completely.
-