
shader_recompiler: Implement reg type tracking #4247

Open
raphaelthegreat wants to merge 6 commits into shadps4-emu:main from raphaelthegreat:type-track


@raphaelthegreat
Contributor

@raphaelthegreat raphaelthegreat commented Apr 10, 2026

In the GCN architecture, general-purpose registers are mostly 32 bits wide, with the exception of VCC, which is 64-bit (though its lower and upper 32-bit halves can also be used as registers). EXEC is also 64-bit and is used to modify the control flow of instructions, but it's technically not general purpose, even though most instructions can read from or write to it. Because it is a 64-bit register, it is usually manipulated with 64-bit arithmetic instructions.

To keep the generated code simple, the recompiler has the concept of a thread-bit type. In guest code this is a subgroup-shared 64-bit bitmask meant to be stored into EXEC at some point, but in IR it's represented as a boolean condition local to each thread. This makes the resulting code saner.
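To illustrate the thread-bit idea, here is a minimal sketch of the two views of the same compare. It is not code from this PR: `GuestCompareLt`, `ThreadBitCompareLt` and `kWaveSize` are hypothetical names, and the guest side assumes all 64 lanes are active.

```cpp
#include <array>
#include <cstdint>

constexpr int kWaveSize = 64; // GCN wavefront size

// Guest view (assumption: every lane is active): a V_CMP-style compare
// writes one result bit per lane into a shared 64-bit mask.
std::uint64_t GuestCompareLt(const std::array<std::uint32_t, kWaveSize>& a,
                             const std::array<std::uint32_t, kWaveSize>& b) {
    std::uint64_t mask = 0;
    for (int lane = 0; lane < kWaveSize; ++lane) {
        if (a[lane] < b[lane]) {
            mask |= std::uint64_t{1} << lane;
        }
    }
    return mask;
}

// IR view: each thread only carries its own bit of that mask as a boolean.
bool ThreadBitCompareLt(std::uint32_t a, std::uint32_t b) {
    return a < b;
}
```

Bit `lane` of the guest mask is exactly the boolean the thread in that lane would compute, which is why the IR can drop the 64-bit mask as long as nothing reads it as a scalar value.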

To avoid fighting the IR type system, the scalar and thread-bit SGPRs/VCC are treated as separate register spaces in SSA, even though they are the same registers. That normally works because guest shaders do not mix and match them: a thread-bit mask is generated by specific instructions like V_CMP and consumed, and then the SGPRs are overwritten by scalar operations.

However, cases have started to appear where this separation is breached, or where certain instructions are ambiguous in nature and it's not certain which register space should be used. Some of these are the MBCNT instructions and the CMP_U64 family of instructions. The former was also not actually implemented until this point, but rather substituted with a heuristic implementation suited to most of its practical uses. This PR also replaces the heuristic with an actual implementation and adjusts DataAppend/DataConsume to work with the new implementation, which mimics how the HW instruction works. The previous heuristic generated much cleaner code, but I believe the cost is negligible.
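For readers unfamiliar with MBCNT, the following is a rough sketch of the instruction semantics as described in the public GCN ISA documentation: count the set mask bits that belong to lanes below the current one, then add a base value. It is not the implementation in this PR, and `CountOneBits`, `MbcntLo` and `MbcntHi` are hypothetical helper names.

```cpp
#include <cstdint>

// Portable population count (Kernighan's method).
std::uint32_t CountOneBits(std::uint32_t v) {
    std::uint32_t n = 0;
    while (v != 0) {
        v &= v - 1; // clear the lowest set bit
        ++n;
    }
    return n;
}

// V_MBCNT_LO_U32_B32-style: bits of mask_lo owned by lanes [0, lane), plus base.
// `lane` is expected to be in [0, 63].
std::uint32_t MbcntLo(std::uint32_t mask_lo, std::uint32_t base, std::uint32_t lane) {
    const std::uint64_t below = (std::uint64_t{1} << lane) - 1;
    return CountOneBits(mask_lo & static_cast<std::uint32_t>(below)) + base;
}

// V_MBCNT_HI_U32_B32-style: same, for the upper 32 bits of the 64-bit mask.
std::uint32_t MbcntHi(std::uint32_t mask_hi, std::uint32_t base, std::uint32_t lane) {
    const std::uint64_t below = (std::uint64_t{1} << lane) - 1;
    return CountOneBits(mask_hi & static_cast<std::uint32_t>(below >> 32)) + base;
}
```

Chaining the two, as in `MbcntHi(mask_hi, MbcntLo(mask_lo, 0, lane), lane)`, gives the number of set mask bits below the current lane. Note that the mask operand here is a real 64-bit scalar value rather than a per-thread boolean, which is the kind of ambiguity between register spaces described above.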

Type tracking is done as simply as possible: reg state is kept in a per-CFG-block structure. For each new CFG block, all processed predecessors are checked to "inherit" the state. If there are multiple predecessors, the states are compared; if the state of a reg mismatches, it's set as undefined (it's expected that the new block either does not touch it or overwrites the value with a defined type).
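The merge rule described above can be sketched roughly as follows. This is only an illustration under assumptions, not the PR's actual data structures: `RegType`, `BlockState`, `InheritState` and the tiny `kNumRegs` register file are all hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <vector>

enum class RegType { Undefined, Scalar, ThreadBit };

constexpr std::size_t kNumRegs = 8; // tiny hypothetical register file

struct BlockState {
    // Value-initialized, so every register starts out as Undefined.
    std::array<RegType, kNumRegs> regs{};
};

// Build the entry state of a new block from its already processed predecessors.
BlockState InheritState(const std::vector<const BlockState*>& processed_preds) {
    BlockState result{};
    if (processed_preds.empty()) {
        return result; // nothing to inherit
    }
    result = *processed_preds.front();
    for (std::size_t p = 1; p < processed_preds.size(); ++p) {
        for (std::size_t r = 0; r < kNumRegs; ++r) {
            // Any disagreement between predecessors makes the reg undefined.
            if (result.regs[r] != processed_preds[p]->regs[r]) {
                result.regs[r] = RegType::Undefined;
            }
        }
    }
    return result;
}
```

With a single predecessor the state is copied as-is. With several, a register keeps its type only if every processed predecessor agrees on it.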

@raphaelthegreat
Contributor Author

PS: I've also been wondering whether unifying the reg space is a good idea, or whether to start by emitting IR code faithful to the guest code and have a post-pass optimization to booleans, though that would take a lot more work and I'm not sure it would be strictly better than this.

@StevenMiller123
Collaborator

Fixes various unique rendering issues Final Fantasy VII Remake has, bringing it in line with how other UE titles behave.
image
image

@StevenMiller123
Collaborator

Brings Resident Evil 2 in-game, with severe rendering issues.
image
image

@Randomuser8219
Contributor

Brings KNACK 2 in-game.
image
image
image

@StevenMiller123
Collaborator

Regresses Shadow of the Colossus; the game now crashes my GPU driver before it can start rendering the little intro cutscene it has.

[Debug] <Critical> (shadPS4:GpuComm) vk_scheduler.cpp:164 operator(): Assertion Failed!
Device lost during submit

CUSA08034.log

@StevenMiller123
Collaborator

StevenMiller123 commented Apr 10, 2026

Also regresses Marvel's Spider-Man; the game now crashes with a device lost error on the loading screen for a new game.

[Debug] <Critical> (shadPS4:GpuComm) vk_scheduler.cpp:164 operator(): Assertion Failed!
Device lost during submit

CUSA02299.log

@StevenMiller123
Collaborator

Star Wars Jedi: Fallen Order is back to running now, but has severe graphical issues.
Crashes on typical UE issues.
image
image
image

CUSA12539.log

@DanielSvoboda
Member

This PR solves the problem in Elden Ring where the screen flashes several blocks.

Pre-release e16a59b
image

PR 4247
image

@StevenMiller123
Collaborator

God of War is now back to crashing due to the missing DS_ORDERED_COUNT opcode.

[Render.Recompiler] <Error> (shadPS4:GpuComm) translate.cpp:744 LogMissingOpcode: Unknown opcode DS_ORDERED_COUNT (1087, category = DataShare)
[Render.Recompiler] <Error> (shadPS4:GpuComm) translate.cpp:744 LogMissingOpcode: Unknown opcode DS_ORDERED_COUNT (1087, category = DataShare)
[Debug] <Critical> (shadPS4:GpuComm) structured_control_flow.cpp:815 operator(): Assertion Failed!
Shader translation has failed

CUSA07408.log

@Randomuser8219
Contributor

Graphically regressed Ratchet & Clank (CUSA01047). Now has this weird blocky artifacting.
image

@Parotaku

Makes No Straight Roads display video again.
image
shad_log.zip



5 participants