Every letter you write costs energy. Your fingers flex along one axis (~225°), your wrist pivots along another (~315°), and every curve, pen-lift, and stroke crossing adds to the bill. This project measures that cost.
We model each character as a physical stroke path using Hershey vector fonts, then compute a biomechanical production effort index — not Joules, but a dimensionless quantity that is monotone with plausible effort, internally consistent, and calibratable to observed handwriting kinematics.
For each of 65 characters (A-Z, a-z, punctuation) across Latin, Greek, and Cyrillic scripts, we measure:
| Metric | What it captures |
|---|---|
| Writing energy | Two-axis motor cost (finger + wrist), direction-dependent |
| Ink distance | Total path length, pen-down only |
| Curvature | Integrated squared curvature (bending cost) |
| Pen lifts | Number of stroke segments + pen-up travel distance |
| Distinctiveness | Nearest-neighbor distance in shape space (confusability) |
| Perimetric complexity | Ink² / enclosed area — how ornate the form is |
| Convex hull ratio | How much of the bounding area the character fills |
| Topology | Enclosed regions (Betti-1), endpoints, crossings |
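
The simpler rows of the table can be illustrated with a few lines of numpy. The sketch below is not taken from `primitives.py`; it assumes a glyph is stored as a list of polylines (one per pen-down stroke), and the function name `stroke_metrics` and the toy coordinates are hypothetical.

```python
import numpy as np

def stroke_metrics(strokes):
    """Ink distance, stroke count, and pen-up travel for one glyph.

    strokes: list of point sequences, one polyline per pen-down stroke
    (an assumed representation, not necessarily the project's own).
    """
    strokes = [np.asarray(s, dtype=float) for s in strokes]
    # Ink distance: summed length of every pen-down segment.
    ink = sum(np.linalg.norm(np.diff(s, axis=0), axis=1).sum() for s in strokes)
    # Pen-up travel: straight hop from the end of one stroke to the start of the next.
    pen_up = sum(np.linalg.norm(b[0] - a[-1]) for a, b in zip(strokes, strokes[1:]))
    return {"ink": float(ink), "n_strokes": len(strokes), "pen_up": float(pen_up)}

# Toy "T": a horizontal bar, then a vertical stem (made-up coordinates).
toy_t = [[(0, 2), (2, 2)], [(1, 2), (1, 0)]]
metrics = stroke_metrics(toy_t)
```

For the toy glyph this gives two strokes, four units of ink, and one unit of pen-up travel between the end of the bar and the top of the stem.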
The directional cost model uses empirical biomechanics: finger flexion/extension is cheapest at ~225° (Thomassen & Teulings, 1983), wrist abduction at ~315° (Teulings & Maarse, 1984), with fingers costing ~1.4x wrist per unit distance (Van Galen & de Jong, 1995).
Writing systems optimize for a trade-off between cheapness and distinctiveness.
Left: the Pareto frontier — characters that are optimally cheap to write for
their level of distinctiveness. Right: uppercase (cyan) vs lowercase (magenta)
show different trade-off strategies.
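
A Pareto frontier of this kind can be extracted with a simple dominance check. The sketch below assumes each character is summarized by an (energy, distinctiveness) pair, with lower energy and higher distinctiveness both preferred; the function name, glyph labels, and values are placeholders, not measured results or the logic of `pareto_frontier.py`.

```python
def pareto_frontier(points):
    """Return the names of characters that no other character dominates.

    points: dict mapping name -> (energy, distinctiveness).
    A character is dominated if another one is at least as cheap and at least
    as distinctive, and strictly better on one of the two.
    """
    frontier = []
    for name, (e, d) in points.items():
        dominated = any(
            (e2 <= e and d2 >= d) and (e2 < e or d2 > d)
            for other, (e2, d2) in points.items()
            if other != name
        )
        if not dominated:
            frontier.append(name)
    return sorted(frontier)

# Placeholder values for illustration only.
toy = {
    "glyph_a": (1.0, 0.2),
    "glyph_b": (2.0, 0.5),
    "glyph_c": (2.5, 0.4),  # costlier and less distinctive than glyph_b
    "glyph_d": (3.0, 0.9),
}
frontier = pareto_frontier(toy)
```

Here `glyph_c` drops out because `glyph_b` is both cheaper and more distinctive; the other three remain on the frontier. The quadratic scan is fine for 65 characters.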
Top-left: energy distributions across scripts. Top-right: cognate pairs
(A, B, E, etc.) show near-identical energy in Latin vs Greek simplex fonts.
Bottom-left: energy scales linearly with ink path. Bottom-right: Cyrillic
complex (serif) has ~3x the perimetric complexity of simplex scripts.
Eight metrics across all 65 characters, sorted by rank. Each metric shows
distinct distributional shape — energy and ink follow Zipf-like decay, while
convex hull ratio is nearly uniform.
Top-left: characters with cheap exits tend to have expensive entries (and vice
versa). Bottom-left: transition angles cluster around finger-axis and wrist-axis
directions. Bottom-right: the most expensive bigrams by frequency × transition cost.
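
The ranking in the bottom-right panel can be sketched as follows. This is a simplified illustration, not the contents of `bigram_transition_analysis.py`: it assumes each glyph has a known entry and exit point, that the second glyph is shifted right by a fixed advance, and that transition cost is plain pen-up distance. All names and data are hypothetical.

```python
import math

def transition(exit_pt, entry_pt, advance):
    """Length and angle (degrees, 0-360) of the pen-up move between two glyphs.

    advance shifts the second glyph to the right of the first (assumed layout).
    """
    dx = entry_pt[0] + advance - exit_pt[0]
    dy = entry_pt[1] - exit_pt[1]
    length = math.hypot(dx, dy)
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    return length, angle

def rank_bigrams(glyphs, freqs, advance=1.0):
    """Order bigrams by frequency x transition length, most expensive first.

    glyphs: name -> (entry_pt, exit_pt); freqs: (first, second) -> frequency.
    """
    scored = []
    for (a, b), f in freqs.items():
        length, _ = transition(glyphs[a][1], glyphs[b][0], advance)
        scored.append((f * length, a + b))
    return [name for _, name in sorted(scored, reverse=True)]

# Made-up glyphs: "x" runs along the baseline, "y" one unit above it.
toy_glyphs = {"x": ((0, 0), (1, 0)), "y": ((0, 1), (1, 1))}
toy_freqs = {("x", "y"): 0.5, ("y", "x"): 0.1}
ranking = rank_bigrams(toy_glyphs, toy_freqs)
```

Swapping the plain distance for a direction-dependent cost would only change the line that computes `f * length`.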
Enclosed regions (Betti-1): most characters have 0 (open forms like C, L) or
1 (closed forms like O, D). Characters with more endpoints cost more energy.
More crossings reduce confusability margin.
Frequency ≠ energy. Common letters are not systematically cheaper to write (r = +0.054). Writing systems don't appear to optimize individual character cost by usage frequency — they optimize the alphabet as a whole for the cheapness/distinctiveness trade-off.
No correlation between letter frequency (Norvig 2013 English corpus) and
writing energy. The alphabet is not a frequency-optimized code.
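
A check of this kind takes a few lines with numpy. The sketch below assumes the reported r is a Pearson correlation, which the text does not state; the helper name `pearson_r` is hypothetical, and the test vectors are synthetic rather than letter frequencies or energies from this project.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])

# Synthetic sanity checks: perfectly aligned and perfectly opposed sequences.
r_up = pearson_r([1, 2, 3], [2, 4, 6])
r_down = pearson_r([1, 2, 3], [3, 2, 1])
```

With real data, `x` would hold per-letter frequencies and `y` the matching writing energies, in the same letter order.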
cd src/
python3 primitives.py # Core measurement functions
python3 measure.py # Extract strokes from Hershey fonts
python3 analyze.py # Energy distributions + Pareto frontier
python3 bigram_transition_analysis.py # Between-character transition costs
python3 cross_script_analysis.py # Latin vs Greek vs Cyrillic
python3 explore_correlations.py # Metric correlation mining
python3 pareto_frontier.py # Optimality analysis

Python 3.8+, numpy, matplotlib, scipy
- Voynich Manuscript Structural Analysis — character energy concepts applied to analyzing an undeciphered manuscript's writing system structure.