All Classes and Interfaces
AdamW optimizer with per-parameter state (see the update-rule sketch after this list).
One language-modeling batch: x holds the input token ids, y the targets (next-token ids).
Minimal byte-level tokenizers (token ids 0-255).
Wiring helpers for causal language model training.
Collects GPU operations lazily and flushes them as a single command buffer.
Cross-entropy loss with integer class targets.
Token embedding: ids -> vectors.
Flexible fully-connected neural network (MLP) built from Linear projections.
Gaussian Error Linear Unit (GELU), using the tanh approximation popularized by GPT-2 (see the sketch after this list).
Minimal GPT-style decoder-only transformer for educational/training use.
Handle to a GPU-resident float buffer managed by a ComputeGraph.
Abstraction over a GPU compute runtime (Metal, CUDA, Vulkan, etc.).
Differentiable module mapping Tensor -> Tensor.
LayerNorm over feature dimension (cols) with trainable gamma/beta exposed as Parameters.
Fully-connected layer: y = xW + b, with x: [n x dIn], W: [dIn x dOut], b: [1 x dOut] (see the sketch after this list).
Multi-head causal self-attention for a single sequence (no batch dimension).
Simple mutable parameter holder for optimizers.
Optimizer that updates a set of Parameters once per training step.
Learnable positional embeddings added to token embeddings.
Row-wise softmax for 2D tensors: applies softmax independently to each row (see the sketch after this list).
Helpers to train classic Tensor->Tensor supervised models (e.g., FNN) using the unified Trainer wrapper.
Simple in-memory dataset that samples random contiguous chunks from token ids.
Simple autoregressive text generation for GPTModel.
Example: classic MLP training using Linear + activation (via FNN) and the unified Trainer wrapper.
A small, reusable training loop wrapper.
Example: tiny GPT training on a small text file using byte-level tokens.
Pre-LN Transformer block: x = x + Attn(LN(x)), then x = x + MLP(LN(x)) (see the wiring sketch after this list).
Convenience builder for assembling transformer stacks.
A simple sequential stack of TransformerBlocks.
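
The AdamW entry above refers to the decoupled weight-decay variant of Adam. The following is a minimal, library-independent sketch of one update step for a single parameter array; the class name, field layout, and hyperparameter values here are illustrative assumptions, not the library's actual optimizer API.

    /** Minimal AdamW step for one parameter array; state (m, v, t) is kept per parameter. */
    final class AdamWSketch {
        final float lr = 1e-3f, beta1 = 0.9f, beta2 = 0.999f, eps = 1e-8f, weightDecay = 0.01f;
        final float[] m, v;   // first/second moment estimates, one slot per weight
        int t = 0;            // step counter for bias correction

        AdamWSketch(int size) { m = new float[size]; v = new float[size]; }

        void step(float[] w, float[] grad) {
            t++;
            double bc1 = 1.0 - Math.pow(beta1, t);   // bias corrections
            double bc2 = 1.0 - Math.pow(beta2, t);
            for (int i = 0; i < w.length; i++) {
                m[i] = beta1 * m[i] + (1 - beta1) * grad[i];
                v[i] = beta2 * v[i] + (1 - beta2) * grad[i] * grad[i];
                double mHat = m[i] / bc1;
                double vHat = v[i] / bc2;
                // Decoupled weight decay: applied directly to the weight, not mixed into the gradient.
                w[i] -= lr * (mHat / (Math.sqrt(vHat) + eps) + weightDecay * w[i]);
            }
        }
    }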
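The GELU entry names the tanh approximation used by GPT-2; for reference, this is that approximation written as a standalone Java function, independent of the library's activation classes.

    final class GeluSketch {
        /** GELU, tanh approximation: 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3))). */
        static float gelu(float x) {
            double c = Math.sqrt(2.0 / Math.PI);
            double inner = c * (x + 0.044715 * x * x * x);
            return (float) (0.5 * x * (1.0 + Math.tanh(inner)));
        }
    }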
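For the fully-connected layer entry, a plain-Java forward pass over float[][] matrices with the shapes given above; this is an illustration of y = xW + b only (no backward pass) and is not the library's Linear class.

    final class LinearSketch {
        /** y = xW + b, with x: [n x dIn], W: [dIn x dOut], b: [1 x dOut] broadcast over rows. */
        static float[][] forward(float[][] x, float[][] w, float[] b) {
            int n = x.length, dIn = w.length, dOut = w[0].length;
            float[][] y = new float[n][dOut];
            for (int i = 0; i < n; i++) {
                for (int k = 0; k < dIn; k++) {
                    float xik = x[i][k];
                    for (int j = 0; j < dOut; j++) {
                        y[i][j] += xik * w[k][j];   // accumulate the matrix product xW
                    }
                }
                for (int j = 0; j < dOut; j++) {
                    y[i][j] += b[j];                // bias broadcast to every row
                }
            }
            return y;
        }
    }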
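For the row-wise softmax entry, a reference implementation over a float[][] matrix; subtracting the row maximum is a standard numerical-stability step and is an assumption about, not a description of, the library's internals.

    final class SoftmaxSketch {
        /** Softmax applied independently to each row; subtracting the row max avoids overflow in exp. */
        static float[][] rowSoftmax(float[][] logits) {
            float[][] out = new float[logits.length][];
            for (int i = 0; i < logits.length; i++) {
                float[] row = logits[i];
                float max = Float.NEGATIVE_INFINITY;
                for (float v : row) max = Math.max(max, v);
                double sum = 0.0;
                double[] exps = new double[row.length];
                for (int j = 0; j < row.length; j++) {
                    exps[j] = Math.exp(row[j] - max);
                    sum += exps[j];
                }
                out[i] = new float[row.length];
                for (int j = 0; j < row.length; j++) {
                    out[i][j] = (float) (exps[j] / sum);
                }
            }
            return out;
        }
    }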
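The Pre-LN block entry gives the two residual equations; the sketch below shows only that wiring, using java.util.function.UnaryOperator placeholders for the attention, MLP, and LayerNorm sub-modules. The class and method names are hypothetical and do not reflect the library's TransformerBlock API.

    import java.util.function.UnaryOperator;

    /** Pre-LN residual wiring: x = x + attn(ln1(x)); x = x + mlp(ln2(x)). */
    final class PreLnBlockSketch {
        final UnaryOperator<float[][]> ln1, attn, ln2, mlp;  // sub-module placeholders

        PreLnBlockSketch(UnaryOperator<float[][]> ln1, UnaryOperator<float[][]> attn,
                         UnaryOperator<float[][]> ln2, UnaryOperator<float[][]> mlp) {
            this.ln1 = ln1; this.attn = attn; this.ln2 = ln2; this.mlp = mlp;
        }

        float[][] forward(float[][] x) {
            x = add(x, attn.apply(ln1.apply(x)));  // x = x + Attn(LN(x))
            x = add(x, mlp.apply(ln2.apply(x)));   // x = x + MLP(LN(x))
            return x;
        }

        private static float[][] add(float[][] a, float[][] b) {
            float[][] out = new float[a.length][a[0].length];
            for (int i = 0; i < a.length; i++)
                for (int j = 0; j < a[0].length; j++)
                    out[i][j] = a[i][j] + b[i][j];
            return out;
        }
    }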