Skip synchronized(unsafeTags) on owner-thread tag writes #11082
Draft
Spans are almost always written by a single thread, so the lock on every setTag/setMetric call is uncontended overhead. This adds a volatile tagWriteState check: if the current thread is the span's creating thread (STATE_OWNER), tag writes skip the lock entirely. Non-owner threads and post-finish writes take the lock and sticky-transition to STATE_SHARED. Long-running spans disable the optimization at construction since the writer thread may read tags on unfinished spans.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
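The state machine described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the actual DDSpanContext code: the class OwnerFastPathTags, the longRunning constructor flag, and the helpers state(), getTag(), and runOnOtherThread() are hypothetical names, and the owner check compares Thread identity as an assumption. The sketch also does not try to close the window in which the owner has already passed the state check while another thread switches to shared mode; the text here does not say how the real change handles that.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a span context; not the actual DDSpanContext code. */
final class OwnerFastPathTags {
  static final int STATE_OWNER = 0;
  static final int STATE_SHARED = 1;

  private final Thread owner = Thread.currentThread(); // the creating thread
  private final Map<String, Object> unsafeTags = new HashMap<>();
  private volatile int tagWriteState;

  OwnerFastPathTags(boolean longRunning) {
    // Long-running spans opt out at construction: another thread may read their tags early.
    this.tagWriteState = longRunning ? STATE_SHARED : STATE_OWNER;
  }

  void setTag(String key, Object value) {
    // Fast path: one volatile read plus a thread comparison, no lock.
    if (tagWriteState == STATE_OWNER && Thread.currentThread() == owner) {
      unsafeTags.put(key, value);
      return;
    }
    // Slow path: sticky transition to shared mode, then the usual lock.
    transitionToShared();
    synchronized (unsafeTags) {
      unsafeTags.put(key, value);
    }
  }

  /** Finishing a span forces every later write onto the locked path. */
  void finish() {
    transitionToShared();
  }

  void transitionToShared() {
    tagWriteState = STATE_SHARED; // never reset to STATE_OWNER
  }

  int state() {
    return tagWriteState;
  }

  /** Readers always take the lock. */
  Object getTag(String key) {
    synchronized (unsafeTags) {
      return unsafeTags.get(key);
    }
  }

  /** Demo helper: runs a task on a fresh thread and waits for it to complete. */
  static void runOnOtherThread(Runnable task) {
    Thread t = new Thread(task);
    t.start();
    try {
      t.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    OwnerFastPathTags tags = new OwnerFastPathTags(false);
    tags.setTag("http.method", "GET");                 // owner thread, lock skipped
    System.out.println(tags.state());                  // prints 0 (STATE_OWNER)
    runOnOtherThread(() -> tags.setTag("peer", "db")); // non-owner write
    System.out.println(tags.state());                  // prints 1 (STATE_SHARED)
  }
}
```

Once the state reads STATE_SHARED, the owner thread itself falls through to the synchronized block, which is what makes the transition sticky.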
Benchmarks
Startup
Parameters
See matching parameters
Summary
Found 0 performance improvements and 0 performance regressions! Performance is the same for 58 metrics, 13 unstable metrics.
Startup time reports for insecure-bank
gantt
title insecure-bank - global startup overhead: candidate=1.62.0-SNAPSHOT~d664ab2db5, baseline=1.62.0-SNAPSHOT~5ab378f780
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.06 s) : 0, 1059848
Total [baseline] (8.839 s) : 0, 8839209
Agent [candidate] (1.074 s) : 0, 1073851
Total [candidate] (8.886 s) : 0, 8885932
section iast
Agent [baseline] (1.223 s) : 0, 1223212
Total [baseline] (9.586 s) : 0, 9585743
Agent [candidate] (1.233 s) : 0, 1232543
Total [candidate] (9.575 s) : 0, 9574588
gantt
title insecure-bank - break down per module: candidate=1.62.0-SNAPSHOT~d664ab2db5, baseline=1.62.0-SNAPSHOT~5ab378f780
dateFormat X
axisFormat %s
section tracing
crashtracking [baseline] (1.256 ms) : 0, 1256
crashtracking [candidate] (1.243 ms) : 0, 1243
BytebuddyAgent [baseline] (636.08 ms) : 0, 636080
BytebuddyAgent [candidate] (642.295 ms) : 0, 642295
AgentMeter [baseline] (29.748 ms) : 0, 29748
AgentMeter [candidate] (29.797 ms) : 0, 29797
GlobalTracer [baseline] (249.211 ms) : 0, 249211
GlobalTracer [candidate] (253.92 ms) : 0, 253920
AppSec [baseline] (31.95 ms) : 0, 31950
AppSec [candidate] (32.769 ms) : 0, 32769
Debugger [baseline] (59.284 ms) : 0, 59284
Debugger [candidate] (60.663 ms) : 0, 60663
Remote Config [baseline] (596.164 µs) : 0, 596
Remote Config [candidate] (623.001 µs) : 0, 623
Telemetry [baseline] (8.08 ms) : 0, 8080
Telemetry [candidate] (8.321 ms) : 0, 8321
Flare Poller [baseline] (7.392 ms) : 0, 7392
Flare Poller [candidate] (7.446 ms) : 0, 7446
section iast
crashtracking [baseline] (1.238 ms) : 0, 1238
crashtracking [candidate] (1.252 ms) : 0, 1252
BytebuddyAgent [baseline] (800.362 ms) : 0, 800362
BytebuddyAgent [candidate] (806.847 ms) : 0, 806847
AgentMeter [baseline] (11.344 ms) : 0, 11344
AgentMeter [candidate] (11.435 ms) : 0, 11435
GlobalTracer [baseline] (239.127 ms) : 0, 239127
GlobalTracer [candidate] (241.228 ms) : 0, 241228
AppSec [baseline] (31.811 ms) : 0, 31811
AppSec [candidate] (30.336 ms) : 0, 30336
Debugger [baseline] (58.642 ms) : 0, 58642
Debugger [candidate] (61.841 ms) : 0, 61841
Remote Config [baseline] (1.15 ms) : 0, 1150
Remote Config [candidate] (533.293 µs) : 0, 533
Telemetry [baseline] (13.635 ms) : 0, 13635
Telemetry [candidate] (12.972 ms) : 0, 12972
Flare Poller [baseline] (3.502 ms) : 0, 3502
Flare Poller [candidate] (3.485 ms) : 0, 3485
IAST [baseline] (25.892 ms) : 0, 25892
IAST [candidate] (25.875 ms) : 0, 25875
Startup time reports for petclinic
gantt
title petclinic - global startup overhead: candidate=1.62.0-SNAPSHOT~d664ab2db5, baseline=1.62.0-SNAPSHOT~5ab378f780
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.06 s) : 0, 1059893
Total [baseline] (11.074 s) : 0, 11073783
Agent [candidate] (1.063 s) : 0, 1063158
Total [candidate] (11.144 s) : 0, 11143810
section appsec
Agent [baseline] (1.251 s) : 0, 1250721
Total [baseline] (11.118 s) : 0, 11118258
Agent [candidate] (1.257 s) : 0, 1256515
Total [candidate] (11.117 s) : 0, 11116930
section iast
Agent [baseline] (1.225 s) : 0, 1225099
Total [baseline] (10.552 s) : 0, 10552399
Agent [candidate] (1.246 s) : 0, 1246267
Total [candidate] (11.331 s) : 0, 11330949
section profiling
Agent [baseline] (1.188 s) : 0, 1187723
Total [baseline] (11.222 s) : 0, 11222371
Agent [candidate] (1.206 s) : 0, 1205913
Total [candidate] (11.209 s) : 0, 11209296
gantt
title petclinic - break down per module: candidate=1.62.0-SNAPSHOT~d664ab2db5, baseline=1.62.0-SNAPSHOT~5ab378f780
dateFormat X
axisFormat %s
section tracing
crashtracking [baseline] (1.234 ms) : 0, 1234
crashtracking [candidate] (1.245 ms) : 0, 1245
BytebuddyAgent [baseline] (634.079 ms) : 0, 634079
BytebuddyAgent [candidate] (635.92 ms) : 0, 635920
AgentMeter [baseline] (29.532 ms) : 0, 29532
AgentMeter [candidate] (29.54 ms) : 0, 29540
GlobalTracer [baseline] (249.946 ms) : 0, 249946
GlobalTracer [candidate] (250.928 ms) : 0, 250928
AppSec [baseline] (32.104 ms) : 0, 32104
AppSec [candidate] (32.12 ms) : 0, 32120
Debugger [baseline] (60.432 ms) : 0, 60432
Debugger [candidate] (60.621 ms) : 0, 60621
Remote Config [baseline] (601.648 µs) : 0, 602
Remote Config [candidate] (608.878 µs) : 0, 609
Telemetry [baseline] (8.15 ms) : 0, 8150
Telemetry [candidate] (8.162 ms) : 0, 8162
Flare Poller [baseline] (7.516 ms) : 0, 7516
Flare Poller [candidate] (7.533 ms) : 0, 7533
section appsec
crashtracking [baseline] (1.224 ms) : 0, 1224
crashtracking [candidate] (1.255 ms) : 0, 1255
BytebuddyAgent [baseline] (663.323 ms) : 0, 663323
BytebuddyAgent [candidate] (665.984 ms) : 0, 665984
AgentMeter [baseline] (12.105 ms) : 0, 12105
AgentMeter [candidate] (12.107 ms) : 0, 12107
GlobalTracer [baseline] (249.837 ms) : 0, 249837
GlobalTracer [candidate] (250.849 ms) : 0, 250849
IAST [baseline] (24.526 ms) : 0, 24526
IAST [candidate] (24.792 ms) : 0, 24792
AppSec [baseline] (184.685 ms) : 0, 184685
AppSec [candidate] (185.503 ms) : 0, 185503
Debugger [baseline] (65.841 ms) : 0, 65841
Debugger [candidate] (66.382 ms) : 0, 66382
Remote Config [baseline] (616.178 µs) : 0, 616
Remote Config [candidate] (605.866 µs) : 0, 606
Telemetry [baseline] (8.593 ms) : 0, 8593
Telemetry [candidate] (8.665 ms) : 0, 8665
Flare Poller [baseline] (3.571 ms) : 0, 3571
Flare Poller [candidate] (3.594 ms) : 0, 3594
section iast
crashtracking [baseline] (1.226 ms) : 0, 1226
crashtracking [candidate] (1.256 ms) : 0, 1256
BytebuddyAgent [baseline] (800.481 ms) : 0, 800481
BytebuddyAgent [candidate] (816.191 ms) : 0, 816191
AgentMeter [baseline] (11.399 ms) : 0, 11399
AgentMeter [candidate] (11.663 ms) : 0, 11663
GlobalTracer [baseline] (239.483 ms) : 0, 239483
GlobalTracer [candidate] (243.538 ms) : 0, 243538
IAST [baseline] (25.857 ms) : 0, 25857
IAST [candidate] (26.162 ms) : 0, 26162
AppSec [baseline] (29.043 ms) : 0, 29043
AppSec [candidate] (32.101 ms) : 0, 32101
Debugger [baseline] (63.822 ms) : 0, 63822
Debugger [candidate] (61.43 ms) : 0, 61430
Remote Config [baseline] (1.197 ms) : 0, 1197
Remote Config [candidate] (533.065 µs) : 0, 533
Telemetry [baseline] (12.679 ms) : 0, 12679
Telemetry [candidate] (13.116 ms) : 0, 13116
Flare Poller [baseline] (3.584 ms) : 0, 3584
Flare Poller [candidate] (3.484 ms) : 0, 3484
section profiling
crashtracking [baseline] (1.196 ms) : 0, 1196
crashtracking [candidate] (1.232 ms) : 0, 1232
BytebuddyAgent [baseline] (692.325 ms) : 0, 692325
BytebuddyAgent [candidate] (704.741 ms) : 0, 704741
AgentMeter [baseline] (9.128 ms) : 0, 9128
AgentMeter [candidate] (9.302 ms) : 0, 9302
GlobalTracer [baseline] (207.897 ms) : 0, 207897
GlobalTracer [candidate] (210.259 ms) : 0, 210259
AppSec [baseline] (32.605 ms) : 0, 32605
AppSec [candidate] (33.124 ms) : 0, 33124
Debugger [baseline] (66.028 ms) : 0, 66028
Debugger [candidate] (66.832 ms) : 0, 66832
Remote Config [baseline] (570.605 µs) : 0, 571
Remote Config [candidate] (588.57 µs) : 0, 589
Telemetry [baseline] (7.822 ms) : 0, 7822
Telemetry [candidate] (7.974 ms) : 0, 7974
Flare Poller [baseline] (3.562 ms) : 0, 3562
Flare Poller [candidate] (3.571 ms) : 0, 3571
ProfilingAgent [baseline] (95.25 ms) : 0, 95250
ProfilingAgent [candidate] (95.56 ms) : 0, 95560
Profiling [baseline] (95.828 ms) : 0, 95828
Profiling [candidate] (96.135 ms) : 0, 96135
Load
Parameters
See matching parameters
Summary
Found 0 performance improvements and 1 performance regressions! Performance is the same for 20 metrics, 15 unstable metrics.
Request duration reports for petclinic
gantt
title petclinic - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~d664ab2db5, baseline=1.62.0-SNAPSHOT~5ab378f780
dateFormat X
axisFormat %s
section baseline
no_agent (19.251 ms) : 19054, 19448
. : milestone, 19251,
appsec (18.682 ms) : 18493, 18872
. : milestone, 18682,
code_origins (18.426 ms) : 18243, 18608
. : milestone, 18426,
iast (17.721 ms) : 17545, 17897
. : milestone, 17721,
profiling (19.649 ms) : 19448, 19851
. : milestone, 19649,
tracing (18.015 ms) : 17838, 18191
. : milestone, 18015,
section candidate
no_agent (19.144 ms) : 18948, 19340
. : milestone, 19144,
appsec (18.649 ms) : 18460, 18837
. : milestone, 18649,
code_origins (18.188 ms) : 18009, 18368
. : milestone, 18188,
iast (18.079 ms) : 17897, 18261
. : milestone, 18079,
profiling (19.298 ms) : 19105, 19490
. : milestone, 19298,
tracing (18.076 ms) : 17896, 18256
. : milestone, 18076,
Request duration reports for insecure-bank
gantt
title insecure-bank - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~d664ab2db5, baseline=1.62.0-SNAPSHOT~5ab378f780
dateFormat X
axisFormat %s
section baseline
no_agent (1.243 ms) : 1231, 1255
. : milestone, 1243,
iast (3.289 ms) : 3247, 3332
. : milestone, 3289,
iast_FULL (6.228 ms) : 6164, 6293
. : milestone, 6228,
iast_GLOBAL (3.574 ms) : 3520, 3627
. : milestone, 3574,
profiling (2.459 ms) : 2434, 2483
. : milestone, 2459,
tracing (1.934 ms) : 1918, 1950
. : milestone, 1934,
section candidate
no_agent (1.228 ms) : 1216, 1240
. : milestone, 1228,
iast (3.208 ms) : 3167, 3249
. : milestone, 3208,
iast_FULL (5.991 ms) : 5930, 6051
. : milestone, 5991,
iast_GLOBAL (3.611 ms) : 3552, 3670
. : milestone, 3611,
profiling (2.379 ms) : 2354, 2405
. : milestone, 2379,
tracing (1.878 ms) : 1862, 1894
. : milestone, 1878,
Dacapo
Parameters
See matching parameters
Summary
Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metrics.
Execution time for biojava
gantt
title biojava - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~d664ab2db5, baseline=1.62.0-SNAPSHOT~5ab378f780
dateFormat X
axisFormat %s
section baseline
no_agent (15.317 s) : 15317000, 15317000
. : milestone, 15317000,
appsec (14.968 s) : 14968000, 14968000
. : milestone, 14968000,
iast (17.984 s) : 17984000, 17984000
. : milestone, 17984000,
iast_GLOBAL (18.113 s) : 18113000, 18113000
. : milestone, 18113000,
profiling (15.227 s) : 15227000, 15227000
. : milestone, 15227000,
tracing (15.062 s) : 15062000, 15062000
. : milestone, 15062000,
section candidate
no_agent (15.514 s) : 15514000, 15514000
. : milestone, 15514000,
appsec (14.874 s) : 14874000, 14874000
. : milestone, 14874000,
iast (18.093 s) : 18093000, 18093000
. : milestone, 18093000,
iast_GLOBAL (17.97 s) : 17970000, 17970000
. : milestone, 17970000,
profiling (14.798 s) : 14798000, 14798000
. : milestone, 14798000,
tracing (14.987 s) : 14987000, 14987000
. : milestone, 14987000,
Execution time for tomcat
gantt
title tomcat - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~d664ab2db5, baseline=1.62.0-SNAPSHOT~5ab378f780
dateFormat X
axisFormat %s
section baseline
no_agent (1.489 ms) : 1477, 1500
. : milestone, 1489,
appsec (3.71 ms) : 3498, 3923
. : milestone, 3710,
iast (2.283 ms) : 2214, 2352
. : milestone, 2283,
iast_GLOBAL (2.328 ms) : 2258, 2398
. : milestone, 2328,
profiling (2.103 ms) : 2048, 2158
. : milestone, 2103,
tracing (2.096 ms) : 2042, 2150
. : milestone, 2096,
section candidate
no_agent (1.49 ms) : 1479, 1502
. : milestone, 1490,
appsec (3.85 ms) : 3628, 4072
. : milestone, 3850,
iast (2.278 ms) : 2209, 2348
. : milestone, 2278,
iast_GLOBAL (2.332 ms) : 2262, 2402
. : milestone, 2332,
profiling (2.099 ms) : 2044, 2154
. : milestone, 2099,
tracing (2.081 ms) : 2027, 2135
. : milestone, 2081,
Add three targeted concurrency tests that exercise the exact cross-thread tag write pattern the JMH crossThread benchmark was measuring:
- crossThreadSustainedNoCrash: 8 threads × 10k setTag on same span
- ownerToSharedTransition: owner writes first, then 8 threads join
- manySpansCrossThread: 10k short-lived spans tagged from 8 threads
All pass, showing that the production code handles cross-thread writes without NPE or structural corruption.
Fix the crossThread benchmark: change SharedSpan @Setup from Level.Invocation to Level.Iteration. With Level.Invocation, 8 threads raced to call setup() concurrently, causing an NPE when state.span was transiently null between invocations. That was a benchmark harness bug, not a production code bug.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
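The shape of a test like crossThreadSustainedNoCrash (8 threads × 10k writes against one shared target) might look like the sketch below. It is written in plain Java rather than JUnit 5 so it runs on its own, and it is not the test added in this PR: CrossThreadTagStress, LockedTags, and hammer() are hypothetical names, and LockedTags only models the always-locked shared path. The JMH @Setup change is not illustrated here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stress harness; not the DDSpanContextConcurrencyTest from this PR. */
final class CrossThreadTagStress {

  /** Stand-in for tag storage in shared mode: every access takes the lock. */
  static final class LockedTags {
    private final Map<String, Object> unsafeTags = new HashMap<>();

    void setTag(String key, Object value) {
      synchronized (unsafeTags) {
        unsafeTags.put(key, value);
      }
    }

    int size() {
      synchronized (unsafeTags) {
        return unsafeTags.size();
      }
    }
  }

  /** Starts the writers together and returns how many of them failed with a Throwable. */
  static int hammer(LockedTags tags, int threads, int writesPerThread) {
    CountDownLatch start = new CountDownLatch(1);
    AtomicInteger failures = new AtomicInteger();
    Thread[] workers = new Thread[threads];
    for (int t = 0; t < threads; t++) {
      final int id = t;
      workers[t] = new Thread(() -> {
        try {
          start.await(); // release all writers at once to maximize overlap
          for (int i = 0; i < writesPerThread; i++) {
            tags.setTag("t" + id + ".k" + (i % 16), i); // 16 distinct keys per thread
          }
        } catch (Throwable e) {
          failures.incrementAndGet();
        }
      });
      workers[t].start();
    }
    start.countDown();
    for (Thread w : workers) {
      try {
        w.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        failures.incrementAndGet();
      }
    }
    return failures.get();
  }

  public static void main(String[] args) {
    LockedTags tags = new LockedTags();
    int failures = hammer(tags, 8, 10_000);
    // 8 threads × 16 keys each gives 128 distinct tags when no write is lost.
    System.out.println("failures=" + failures + " tags=" + tags.size());
  }
}
```

Checking the final tag count alongside the failure count catches silent structural corruption (lost entries) as well as thrown exceptions.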
What Does This Do
Optimizes DDSpanContext tag writes by skipping synchronized(unsafeTags) when the writing thread is the span's creating thread (the common case). A volatile tagWriteState field tracks whether the span is in owner-only mode or shared mode. Once any non-owner thread accesses tags or the span finishes, it transitions to shared mode permanently and all subsequent accesses take the lock.
Motivation
Every setTag()/setMetric() call acquires synchronized(unsafeTags), at roughly 27 lock sites in total. Spans are almost always written by a single thread, so the lock is uncontended overhead (~20-30ns per acquire/release × 5-20 tags per span × millions of spans/second). This optimization eliminates that overhead on the fast path by replacing the lock with a volatile read + thread ID comparison.
Safety model:
- volatile int tagWriteState tracks STATE_OWNER (0) vs STATE_SHARED (1). Once shared, never reverts.
- Non-owner threads take the lock and transition to STATE_SHARED.
- finish() calls transitionToShared() so post-finish writes (decorators/handlers) always take the lock.
Benchmark Results
JMH benchmarks, 2 forks × 5 warmup + 5 measurement iterations, back-to-back on same machine. Owner-thread benchmarks use @Threads(1), cross-thread uses @Threads(8).
JDK 21 (Zulu 21.0.1): biased locking removed
- fullLifecycle_tenTags
- setStringTag_ownerThread
- setIntTag_ownerThread
- setTenTags_ownerThread
- setStringTag_crossThread (8T)
JDK 8 (Zulu 8u372): biased locking enabled
- fullLifecycle_tenTags
- setStringTag_ownerThread
- setIntTag_ownerThread
- setTenTags_ownerThread
- setStringTag_crossThread (8T)
Analysis
On JDK 21, biased locking is gone (it was deprecated and disabled by default in JDK 15 and removed in JDK 18), so uncontended synchronized is more expensive and the optimization shows clear gains: +12-33% across all owner-thread benchmarks. The full lifecycle benchmark (create + 10 tags + finish) shows a +11.8% throughput improvement.
On JDK 8 (biased locking enabled), the JVM already optimizes uncontended locks, so gains are modest (+4-12%).
Cross-thread path: no regression on either JDK. The slow path (non-owner threads) takes the lock just like before, with one additional volatile read of tagWriteState.
Additional Notes
- synchronized(unsafeTags) is kept on reader paths: processTagsAndBaggage, earlyProcessTags, getTags, toString
- Adds DDSpanContextConcurrencyTest (9 JUnit 5 tests including 3 targeted cross-thread stress tests) and SpanTagBenchmark (5 JMH benchmarks)
Contributor Checklist
- Assign the type: and (comp: or inst:) labels in addition to any other useful labels
- Avoid using close, fix, or any linking keywords when referencing an issue
Note: Once your PR is ready to merge, add it to the merge queue by commenting /merge. /merge -c cancels the queue request. /merge -f --reason "reason" skips all merge queue checks; please use this judiciously, as some checks do not run at the PR-level. For more information, see this doc.