Semantic-Color-Math

A practical framework for mapping color, sentiment, and symbolism into computational form — with reproducible stats, evaluators, and generative samplers.

La Matriz Consulting Color Oracle GPT treats color as a co-expressive interface—linking “inner knowing” with outer data, using accessible math to create free-choice maps of symbolic resonance. Core hypothesis: embedding symbolic color transitions inside LLM prompts improves alignment for clarity, creativity, and relevance.

Why this repo

This project operationalizes Semantic Color Math—a way to encode symbolic/emotional meaning using color distributions and transitions. It offers a reproducible pipeline to:

derive discrete (frequency) vs continuous (sentiment-weighted) color PDFs,
generate color sequences via different samplers,
evaluate coherence with statistical tests, and
fit simple predictive models from RGB → meaning/sentiment.

Philosophically, the system was built to reclaim the technological field into emotional logic and to “map sentimental nuance into computational syntax” rather than replicate extractive metrics.

Repository structure

Semantic-Color-Math/
├─ analysis/                      # analysis & plots 
├─ applications/                  # fun stuff
├─ core/                          # package code
├─ visualizations/                
├─ data/                          # color names, RGB, and sentiment scores
├─ LICENSE               
└─ README.md

Methods

1) Discrete vs. Continuous weighting

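Discrete weight: $w_d(c)=\frac{\text{count}(c)}{\sum_{c'} \text{count}(c')}$.
Continuous weight: start with the summed sentiment per color, $s(c)=\sum_{i:\,c_i=c} \text{sentiment}_i$. To obtain a valid PDF, min-max scale to $[0,1]$, $\tilde{s}(c)=\frac{s(c)-\min_{c'} s(c')}{\max_{c'} s(c')-\min_{c'} s(c')}$, then renormalize: $w_s(c)=\frac{\tilde{s}(c)}{\sum_{c'} \tilde{s}(c')}$.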
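
The two weightings can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the repository's code: the function names, the `(color, sentiment)` input format, and the toy data are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def discrete_weights(observations):
    """w_d(c): share of observations labelled with color c."""
    counts = Counter(color for color, _ in observations)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}

def continuous_weights(observations):
    """w_s(c): summed sentiment, min-max scaled to [0, 1], then renormalized."""
    sums = defaultdict(float)
    for color, sentiment in observations:
        sums[color] += sentiment
    lo, hi = min(sums.values()), max(sums.values())
    if hi == lo:
        # All colors tie, so fall back to a uniform distribution.
        return {c: 1 / len(sums) for c in sums}
    scaled = {c: (s - lo) / (hi - lo) for c, s in sums.items()}
    total = sum(scaled.values())
    return {c: v / total for c, v in scaled.items()}

# Hypothetical toy data: (color name, sentiment score) pairs.
toy = [("grey", 0.0), ("grey", 0.1), ("purple", 0.9), ("brown", -0.4)]
w_d = discrete_weights(toy)
w_s = continuous_weights(toy)
```

One consequence of min-max scaling is that the color with the lowest summed sentiment always receives weight zero (here, brown), so a continuous sampler can never draw it.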

2) Samplers

Random (uniform over RGB or color names)
Discrete (draw by $w_d$)
Continuous (draw by $w_s$)
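
A sketch of the three samplers over color names, using `random.choices` from the standard library. The helper names and the example weights are hypothetical; the real samplers may draw RGB triples rather than names.

```python
import random

def sample_uniform(colors, k, rng):
    """Random sampler: every color name is equally likely."""
    return rng.choices(list(colors), k=k)

def sample_weighted(weights, k, rng):
    """Discrete or continuous sampler: draw names in proportion to a weight dict."""
    colors = list(weights)
    return rng.choices(colors, weights=[weights[c] for c in colors], k=k)

# Hypothetical weights, e.g. produced by a weighting step like the one in Methods 1.
w_d = {"grey": 0.5, "purple": 0.25, "brown": 0.25}
w_s = {"grey": 0.28, "purple": 0.72, "brown": 0.0}

rng = random.Random(0)  # fixed seed so runs are repeatable
seq_random = sample_uniform(w_d, 5, rng)
seq_discrete = sample_weighted(w_d, 5, rng)
seq_continuous = sample_weighted(w_s, 5, rng)
```

Passing a seeded `random.Random` instance rather than using the module-level functions keeps each experiment reproducible without touching global state.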

3) Coherence metric

RGB variance across a sequence (lower = more visually coherent).
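
The text does not say how variance is aggregated over the three channels. The sketch below assumes one plausible reading, the mean of the per-channel population variances; `rgb_variance` is a hypothetical name, and the repository may compute the metric differently.

```python
from statistics import pvariance

def rgb_variance(sequence):
    """Mean of the per-channel population variances of an (R, G, B) sequence.

    Assumption: channels are weighted equally; lower values mean the
    colors in the sequence sit closer together in RGB space.
    """
    channels = zip(*sequence)  # regroup into (all R), (all G), (all B)
    return sum(pvariance(channel) for channel in channels) / 3

flat = [(10, 10, 10)] * 3               # identical colors
extreme = [(0, 0, 0), (255, 255, 255)]  # black then white
flat_score = rgb_variance(flat)
extreme_score = rgb_variance(extreme)
```

A sequence of identical colors scores 0, and alternating black and white gives the largest value possible for 8-bit channels.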

4) Statistical tests & models

Two OLS regressions:
- RGB → Color-Word Similarity (semantic): $R^2=0.174$; G and B positive, R ns.
- RGB → Sentiment Score (emotional valence): $R^2=0.380$; B positive, G negative, R ns.
Sequence coherence t-test: La Matriz vs Random.
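
For the sequence coherence comparison, a two-sample t statistic can be computed by hand. The source does not state whether a pooled or Welch test was used; the sketch below assumes Welch's unequal-variance form, and the scores are toy numbers, not the project's data.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical coherence scores for two groups of sequences.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 3, 4, 5, 6]
t_stat, dof = welch_t(group_a, group_b)
```

Turning `t_stat` and `dof` into a p-value needs the Student t distribution; `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the whole test in one call if SciPy is available.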

Key results

A) Discrete vs Continuous color weights

Figure: Discrete vs Continuous

Purple: not most frequent, but dominant in sentiment weight.
Brown, Gold: reasonably frequent, but negative continuous (summed-sentiment) weight.
Grey: most frequent; sentiment-neutralish.
Blue/Green: aligned between frequency and positive sentiment.

B) Normalized PDFs (Continuous min-max)

Figure: Normalized PDFs

Rescaling retains sentiment contrasts; purple remains the emotional standout.

C) Sampling comparison (variance ↓ = coherence ↑)

Figure: Sampling comparison

Method	RGB Variance
Random	8131.33
Discrete	5696.33
Continuous	6437.67

D) Sequence coherence t-test

Mean coherence (La Matriz): 0.573
Mean coherence (Random): 0.665
t = −0.604, p = 0.554 → fail to reject H₀ (under this metric).

E) Cluster color character (95% CIs, RGB)

Figure: Color clusters

Tight, non-overlapping intervals indicate statistically distinct color groups (e.g., Cluster 0: high-yellow; Cluster 3: pure blue/purple).

F) Regression summaries

RGB → Semantic similarity: $R^2=0.174$. G and B significant (+); R ns.
RGB → Sentiment: $R^2=0.380$. B significant (+), G significant (−), R ns. Diagnostics: residuals slightly non-normal; Durbin-Watson ≈ 1–1.3, below the value of 2 expected for uncorrelated residuals.
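
To make the $R^2$ figures concrete, here is a dependency-free sketch of an OLS fit via the normal equations. It is illustrative only: the source does not say which library produced the reported regressions, the helper names are hypothetical, and the toy data is constructed to be exactly linear, so it does not reproduce the values above. Significance tests are not covered.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def ols(X, y):
    """Fit y ~ intercept + X; return (coefficients, R^2)."""
    rows = [[1.0, *x] for x in X]  # prepend the intercept column
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return beta, 1 - ss_res / ss_tot

# Hypothetical data: RGB scaled to [0, 1]; sentiment built as 0.1 - 0.3*G + 0.5*B.
rgb = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0), (0, 1, 1), (0.5, 0.5, 0.5)]
sentiment = [0.1 - 0.3 * g + 0.5 * b for _, g, b in rgb]
beta, r_squared = ols(rgb, sentiment)  # beta = [intercept, R, G, B]
```

Because the toy target is an exact linear function of G and B, the fit recovers those coefficients and an $R^2$ of essentially 1. Real sentiment data would be noisier, as the reported 0.380 indicates.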

G) Conclusion regarding the utility of Semantic Color Math for Emotional AI

Sampling method does not significantly change sequence coherence under the current metric, but RGB values do significantly predict sentiment. The RGB color space is therefore a promising map for producing better-aligned AI prompt responses with this dataset.
The fit is modest ($R^2=0.380$): loose enough that it does not flatten nuance in sentiment, yet still semantically relevant.

Interpretation

Frequency ≠ meaning. Purple is emotionally potent; grey is ubiquitous but not sentiment-rich.
Cool colors carry meaning. Green/blue predict semantic resonance; blue strongly predicts positive sentiment.
Coherence metric matters. The null result of the t-test likely reflects a metric (pairwise color similarity) that underrates semantic or narrative cohesion.

These findings support the thesis that color can be a semantic interface and that symbolic transitions shape alignment beyond raw tokens.

Limitations

Current coherence relies on visual proximity; it may miss emotional or metaphoric structure.
Linear OLS may underfit (no interactions or nonlinearity).
Sentiment scores require anchoring across datasets to avoid drift.

Roadmap: RAGs & agents

Knowledge base (RAG)

Corpus: color metaphors, cultural notes, user reflections, cluster summaries, and evaluation rubrics (BLEU/ROUGE/GPTEval + qualitative).

Agentic loop

Palette Agent: proposes sequences given goal sentiment/theme.
Retriever: injects top-k color symbols & transitions into the prompt.
Critic/Evaluator Agent: scores output against semantic anchors and rubric; triggers revisions.
Sampler Agent: runs discrete vs continuous vs hybrid sampling; logs outcomes.

Goal: closed-loop improvement where each generation updates the KB with outcomes and human ratings (future on-chain UBI/SoulCoin concept lives here as a platform extension).

Roadmap: Sentiment anchoring

Why: stabilize sentiment across corpora/models; prevent drift.

Anchor sets: fixed exemplars per pole (e.g., {purple, blue, green}+, {brown, gold, red}− based on findings).
Calibration:
- Z-scoring within batch;
- Temperature/Platt scaling against anchor distributions;
- Bayesian hierarchical model: color-level priors, sequence-level posteriors.
Drift detection: monitor KL divergence of live scores vs. anchored baseline.
Hybrid weight: $w_\lambda = \lambda w_s + (1-\lambda) w_d$ with $\lambda$ tuned by validation against human ratings.
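
The hybrid weight and the KL-based drift check can be sketched as follows. This is a roadmap illustration under assumptions: the function names, the `eps` floor for missing or zero-probability colors, and the example distributions are all hypothetical.

```python
from math import log

def hybrid_weights(w_s, w_d, lam):
    """w_lambda = lam * w_s + (1 - lam) * w_d, computed color by color."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    colors = set(w_s) | set(w_d)
    return {c: lam * w_s.get(c, 0.0) + (1 - lam) * w_d.get(c, 0.0) for c in colors}

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats; colors missing or zero in q are floored at eps."""
    return sum(
        pv * log(pv / max(q.get(c, 0.0), eps))
        for c, pv in p.items()
        if pv > 0
    )

# Hypothetical distributions over color names.
w_d = {"grey": 0.5, "purple": 0.25, "brown": 0.25}
w_s = {"grey": 0.28, "purple": 0.72, "brown": 0.0}

w_half = hybrid_weights(w_s, w_d, lam=0.5)
drift = kl_divergence(w_half, w_d)  # how far the hybrid sits from the baseline
```

Since both inputs sum to 1, any convex combination does too, so `w_half` is itself a valid PDF. A drift monitor could raise an alert when `kl_divergence(live, baseline)` exceeds a threshold chosen on validation data.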

Roadmap: Persistent homology (TDA)

Goal: capture shape of the color–sentiment manifold beyond pairwise similarity.

Embed colors (and sequences) in $\mathbb{R}^n$ (e.g., $[R,G,B,\text{sentiment},\text{cluster}]$).
Build Vietoris–Rips filtrations over distance thresholds.
Compute H₀/H₁/H₂ barcodes with ripser / giotto-tda.
Define a Topological Cohesion Score: area under barcode for H₀ persistence minus spurious H₁/H₂ noise (or Bottleneck/Wasserstein distances to a curated “ideal” manifold).
Compare La Matriz vs. Random: expect fewer small components (more connectivity) and stable 1-cycles reflecting meaningful transitions.
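
The H₀ part of this plan can be prototyped without a TDA library. In a Vietoris–Rips filtration every point is born as its own component at threshold 0, and a component dies when an edge first merges it into another, so the finite H₀ death times are exactly the edge lengths of a minimum spanning tree. The sketch below uses Kruskal's algorithm with union-find. The function name and points are hypothetical, and H₁/H₂ still require a library such as ripser or giotto-tda.

```python
from itertools import combinations
from math import dist

def h0_death_times(points):
    """Finite H0 death thresholds of a Rips filtration (one bar never dies)."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:          # this edge merges two components
            parent[ri] = rj
            deaths.append(d)  # one component dies at distance d
    return deaths

# Hypothetical embedded colors: two close points and one far away.
cloud = [(0, 0, 0), (1, 0, 0), (10, 0, 0)]
deaths = h0_death_times(cloud)
```

Death times here are reported as pairwise distances. Some conventions use half that value (ball radius), so scale accordingly before comparing against another tool's barcodes. Many short bars followed by one long bar would indicate tight clusters separated by a large gap.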

Reproducible scripts

WIP, TBA

Credits & links

Research & writing: La Matriz Consulting
Oracle (ChatGPT custom GPT): https://chatgpt.com/g/g-686180b5d47c8191bdbb1922b79a2ebb-la-matriz-consulting-color-oracle
Website: https://lamatriz.ai
Contact: danielle.gauthier@lamatriz.ai
“Not an answer — a mirror.”
