TAL · Tangerine Lattice · v2.0-α
Inputs

Datasets

A dataset is a collection of records ingested into TAL. Each record has numeric fields, categorical fields, and optional hints; it may also carry an id and a source trace. TAL fingerprints the dataset deterministically on ingest: the same records always produce the same fingerprint, regardless of upload order.
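One way to get an order-independent fingerprint is to serialize each record canonically, sort the serialized records, and hash the result. The sketch below illustrates that idea only. TAL's actual canonicalization rules and hash function are not documented here, so the JSON encoding, SHA-256, and the names `canonical_record` and `dataset_fingerprint` are all assumptions.

```python
import hashlib
import json


def canonical_record(record: dict) -> bytes:
    """Serialize one record with sorted keys and no insignificant whitespace."""
    text = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def dataset_fingerprint(records: list[dict]) -> str:
    """Hash the sorted canonical records so that upload order does not matter.

    Assumption: SHA-256 over length-prefixed records. TAL may use a different scheme.
    """
    digest = hashlib.sha256()
    for blob in sorted(canonical_record(r) for r in records):
        # Length-prefix each record so record boundaries stay unambiguous.
        digest.update(len(blob).to_bytes(8, "big"))
        digest.update(blob)
    return digest.hexdigest()


# Illustrative values only; these are not taken from a shipped sample dataset.
records = [
    {"numericFields": {"revenue": 1200.0}, "categoricalFields": {"region": "north"}},
    {"numericFields": {"revenue": 800.0}, "categoricalFields": {"region": "south"}},
]

# Reversing the upload order leaves the fingerprint unchanged.
assert dataset_fingerprint(records) == dataset_fingerprint(list(reversed(records)))
```

Sorting the serialized records, rather than relying on ids or row indexes, keeps the fingerprint a function of record content alone.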

Record schema

{
  "id"?:                string,         // optional — derived deterministically if omitted
  "numericFields":      { [k: string]: number },
  "categoricalFields":  { [k: string]: string },
  "hints"?:             string[],       // optional categorization hints
  "sourceTrace"?:       {
    "origin":           "upload" | "api" | "sample" | "sync",
    "originRef"?:       string,         // filename / endpoint / sync source
    "rowIndex"?:        number
  }
}
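For client code, the schema above could be mirrored with typed dictionaries. This is a hedged sketch: the `Record` and `SourceTrace` classes follow the schema, but `with_derived_id` is a hypothetical helper. The document says only that a missing id is derived deterministically, so hashing the numeric and categorical fields with SHA-256 is an assumption, not TAL's documented rule. The field names come from the landscaping sample; the values are made up.

```python
import hashlib
import json
from typing import Literal, TypedDict


class _SourceTraceRequired(TypedDict):
    origin: Literal["upload", "api", "sample", "sync"]


class SourceTrace(_SourceTraceRequired, total=False):
    originRef: str  # filename / endpoint / sync source
    rowIndex: int


class _RecordRequired(TypedDict):
    numericFields: dict[str, float]
    categoricalFields: dict[str, str]


class Record(_RecordRequired, total=False):
    id: str
    hints: list[str]
    sourceTrace: SourceTrace


def with_derived_id(record: Record) -> Record:
    """Return the record unchanged if it has an id, else a copy with a derived id.

    Assumption: the id is a truncated SHA-256 of the field content.
    """
    if "id" in record:
        return record
    content = {
        "numericFields": record["numericFields"],
        "categoricalFields": record["categoricalFields"],
    }
    blob = json.dumps(content, sort_keys=True, separators=(",", ":")).encode("utf-8")
    out = dict(record)
    out["id"] = hashlib.sha256(blob).hexdigest()[:16]
    return out  # type: ignore[return-value]


# Illustrative row using the landscaping field names; the values are invented.
row: Record = {
    "numericFields": {"revenue": 4200.0, "jobs": 7, "leads": 15},
    "categoricalFields": {"region": "north", "service": "mowing"},
    "sourceTrace": {"origin": "sample", "originRef": "landscaping", "rowIndex": 0},
}

# The same content always yields the same derived id.
assert with_derived_id(row)["id"] == with_derived_id(dict(row))["id"]
```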

Sample datasets

Six canonical demos ship with TAL. Use them to exercise the engine without bringing your own data.

landscaping · 18 rows

18 records, 3 numeric fields (revenue, jobs, leads), 2 categorical (region, service). The canonical worked example.

3 numeric · 2 categorical

ecommerce · 1,024 rows

Synthetic D2C orders with revenue, units, AOV, discount %, channel, region.

6 numeric · 3 categorical

saas · 512 rows

Monthly account metrics: MRR, seats, support tickets, churn risk score, plan tier.

5 numeric · 2 categorical

restaurant · 365 rows

Daily covers, average ticket, wait time, table turn, day-of-week, weather bucket.

4 numeric · 3 categorical

agency · 256 rows

Client engagement: hours billed, retainer size, project stage, NPS, vertical.

4 numeric · 2 categorical

fitness · 480 rows

Member activity: visits, class signups, plan tier, churn flag, acquisition source.

4 numeric · 3 categorical

Dataset lifecycle

  1. Upload or POST records to /api/v2/datasets. TAL canonicalizes and fingerprints.
  2. Attach one or more hierarchy trees. The same records can be projected through multiple trees (MHC).
  3. Run an analysis. TAL emits findings with per-result confidence and provenance.
  4. Re-run any audit-logged job via verify; the output must be byte-identical.
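The first and last lifecycle steps can be sketched from a client's point of view. Only the path /api/v2/datasets comes from this page. The host `tal.example.com`, the `{"records": [...]}` payload shape, and the helper names are hypothetical, and the request is built but never sent. The verify step is reduced to its stated criterion, a byte-for-byte comparison of two outputs.

```python
import json
import urllib.request

BASE_URL = "https://tal.example.com"  # hypothetical host, not a real TAL endpoint


def build_ingest_request(records: list[dict]) -> urllib.request.Request:
    """Build (but do not send) a POST to /api/v2/datasets.

    Assumption: the body is a JSON object with a "records" array.
    """
    body = json.dumps({"records": records}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/api/v2/datasets",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def is_byte_identical(original: bytes, rerun: bytes) -> bool:
    """Criterion from step 4: a re-run must match the original output exactly."""
    return original == rerun


# Illustrative record; the values are invented.
request = build_ingest_request(
    [{"numericFields": {"revenue": 1200.0}, "categoricalFields": {"region": "north"}}]
)
assert request.get_method() == "POST"
assert request.full_url.endswith("/api/v2/datasets")
```

Comparing raw bytes rather than parsed JSON matters here: two outputs that differ only in key order or whitespace parse to equal objects but are not byte-identical.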