Commit Graph

290 Commits

Author SHA1 Message Date
3d91dbb21d Merge pull request #3578 from ReinUsesLisp/vmnmx
shader/video: Partially implement VMNMX
2020-04-12 10:44:03 -04:00
76f178ba6e shader/video: Partially implement VMNMX
Implements the common usages for VMNMX. Inputs with a different size
than 32 bits are not supported and sign mismatches aren't supported
either.

VMNMX works as follows:
It grabs Ra and Rb and applies a maximum/minimum on them (this is
defined by .MX), having in mind the input sign. This result can then be
saturated. After the intermediate result is calculated, it applies
another operation on it using Rc. These operations are merges,
accumulations or another min/max pass.

This instruction allows to implement with a more flexible approach GCN's
min3 and max3 instructions (for instance).
2020-04-12 00:34:42 -03:00
8b719e9e1d shader_bytecode: Rename MOV_SYS to S2R 2020-04-04 03:37:51 -03:00
9d15feb892 shader_bytecode: Add encoding for BAR 2020-04-04 03:36:21 -03:00
c02a2dc24a shader_bytecode: Add encoding for VOTE.VTG 2020-04-04 03:28:11 -03:00
c8f6d9effd shader_decode: merge GlobalAtomicOp to AtomicOp 2020-03-30 18:47:00 +07:00
08470d261d shader_bytecode: Fix I2I_IMM encoding 2020-03-28 18:49:07 -03:00
e6aff11057 Merge pull request #3520 from ReinUsesLisp/legacy-varyings
gl_shader_decompiler: Implement legacy varyings
2020-03-25 19:27:51 -04:00
6442e02c5d shader/shader_ir: Track usage in input attribute and of legacy varyings 2020-03-15 21:01:52 -03:00
93547cac68 shader_bytecode: update BFE instructions struct. 2020-03-13 12:52:16 +07:00
63a59b9935 Merge pull request #3379 from ReinUsesLisp/cbuf-offset
shader/decode: Fix constant buffer offsets
2020-02-14 13:22:53 -05:00
90df4b8e2b Merge pull request #3369 from ReinUsesLisp/shf
shader/shift: Implement SHF
2020-02-07 22:06:57 -05:00
bf9a822b87 shader/decode: Fix constant buffer offsets
Some instances were using cbuf34.offset instead of cbuf34.GetOffset().
This returned the an invalid offset. Address those instances and rename
offset to "shifted_offset" to avoid future bugs.
2020-02-05 12:19:09 -03:00
08c508b1c4 Merge pull request #3357 from ReinUsesLisp/bfi-rc
shader/bfi: Implement register-constant buffer variant
2020-02-04 15:14:13 -05:00
bf21aacc74 Merge pull request #3356 from ReinUsesLisp/fcmp
shader/arithmetic: Implement FCMP
2020-02-04 11:36:59 -05:00
017474c3f8 shader/shift: Implement SHF_LEFT_{IMM,R}
Shifts a pair of registers to the left and returns the high register.
2020-02-01 21:19:44 -03:00
137a8aa55c shader/bfi: Implement register-constant buffer variant
It's the same as the variant that was implemented, but it takes the
operands from another source.
2020-01-27 01:20:38 -03:00
e3fc3459c8 shader/arithmetic: Implement FCMP
Compares the third operand with zero, then selects between the first and
second.
2020-01-27 01:15:44 -03:00
d95d4ac843 shader/memory: Implement ATOM.ADD
ATOM operates atomically on global memory. For now only add ATOM.ADD
since that's what was found in commercial games.

This asserts for ATOM.ADD.S32 (handling the others as unimplemented),
although ATOM.ADD.U32 shouldn't be any different.

This change forces us to change the default type on SPIR-V storage
buffers from float to uint. We could also alias the buffers, but it's
simpler for now to just use uint. While we are at it, abstract the code
to avoid repetition.
2020-01-26 01:54:24 -03:00
63ba41a26d shader/memory: Implement ATOMS.ADD.U32 2020-01-16 17:30:55 -03:00
028b2718ed Merge pull request #3239 from ReinUsesLisp/p2r
shader/p2r: Implement P2R Pr
2019-12-31 20:37:16 -05:00
8a76f816a4 Merge pull request #3228 from ReinUsesLisp/ptp
shader/texture: Implement AOFFI and PTP for TLD4 and TLD4S
2019-12-26 21:43:44 -05:00
cf27b59493 shader/r2p: Refactor P2R to support P2R 2019-12-20 17:55:42 -03:00
8b26b4228b shader_bytecode: Fix TLD4S encoding 2019-12-17 23:32:10 -03:00
e09c1fbc1f shader/texture: Implement TLD4.PTP 2019-12-16 04:09:24 -03:00
af89723fa3 Shader_Ir: Correct TLD4S encoding and implement f16 flag. 2019-12-11 19:53:17 -04:00
425a254fa2 shader: Implement MEMBAR.GL
Implement using memoryBarrier in GLSL and OpMemoryBarrier on SPIR-V.
2019-12-10 16:45:03 -03:00
6233b1db08 shader_ir/memory: Implement patch stores 2019-12-09 23:25:21 -03:00
74f515e8b6 shader_bytecode: Remove corrupted character 2019-12-06 20:31:56 -03:00
cd0f5dfc17 Shader_IR: Implement TXD instruction. 2019-11-14 11:15:27 -04:00
f3d1b370aa Shader_IR: Implement FLO instruction. 2019-11-14 11:15:27 -04:00
95137a04e1 Shader_Bytecode: Add encodings for FLO, SHF and TXD 2019-11-14 11:15:26 -04:00
b6f6733131 Merge pull request #3081 from ReinUsesLisp/fswzadd-shuffles
shader: Implement FSWZADD and reimplement SHFL
2019-11-14 10:27:27 -04:00
096f339a2a video_core: Silence implicit conversion warnings 2019-11-08 22:48:50 +00:00
56e237d1f9 shader_ir/warp: Implement FSWZADD 2019-11-07 20:08:41 -03:00
9293c3a0f2 Shader_IR: Fix TLD4 and add Bindless Variant.
This commit fixes an issue where not all 4 results of tld4 were being
written, the color component was defaulted to red, among other things.
It also implements the bindless variant.
2019-10-30 12:02:03 -04:00
7fdf991097 shader_bytecode: Make Matcher constexpr capable
Greatly shrinks the amount of generated code for GetDecodeTable().

Collapses an assembly output of 9000+ lines down to ~3621 with Clang,
and 6513 down to ~2616 with GCC, given it's now allowed to construct all
the entries as a sequence of constant data.
2019-10-24 01:10:10 -04:00
376f1a4432 Merge pull request #2869 from ReinUsesLisp/suld
shader/image: Implement SULD and fix SUATOM
2019-09-23 21:47:03 -04:00
9286976948 Merge pull request #2878 from FernandoS27/icmp
shader_ir: Implement ICMP
2019-09-21 18:06:07 -03:00
44000971e2 gl_shader_decompiler: Use uint for images and fix SUATOM
In the process remove implementation of SUATOM.MIN and SUATOM.MAX as
these require a distinction between U32 and S32. These have to be
implemented with imageCompSwap loop.
2019-09-21 17:33:52 -03:00
675f23aedc shader/image: Implement SULD and remove irrelevant code
* Implement SULD as float.
* Remove conditional declaration of GL_ARB_shader_viewport_layer_array.
2019-09-21 17:32:48 -03:00
4de0f1e1c8 shader_bytecode: Add SULD encoding 2019-09-21 17:31:46 -03:00
527b841c15 Shader_IR: ICMP corrections and fixes 2019-09-21 14:28:03 -04:00
4b81d19a1a Shader_IR: Implement ICMP. 2019-09-19 20:56:29 -04:00
0526bf1895 shader_ir/warp: Implement SHFL 2019-09-17 17:44:07 -03:00
36abf67e79 shader/image: Implement SUATOM and fix SUST 2019-09-10 20:22:31 -03:00
77ef4fa907 shader/shift: Implement SHR wrapped and clamped variants
Nvidia defaults to wrapped shifts, but this is undefined behaviour on
OpenGL's spec. Explicitly mask/clamp according to what the guest shader
requires.
2019-09-04 01:55:24 -03:00
81fbc5370d Merge pull request #2812 from ReinUsesLisp/f2i-selector
shader_ir/conversion: Implement F2I and F2F F16 selector
2019-09-03 22:35:33 -04:00
d4f33b822b Merge pull request #2811 from ReinUsesLisp/fsetp-fix
float_set_predicate: Add missing negation bit for the second operand
2019-09-03 22:34:34 -04:00
e3534700d7 shader_ir/conversion: Split int and float selector and implement F2F H1 2019-08-28 16:09:33 -03:00