Commit Graph

247 Commits

Author SHA1 Message Date
527b841c15 Shader_IR: ICMP corrections and fixes 2019-09-21 14:28:03 -04:00
4b81d19a1a Shader_IR: Implement ICMP. 2019-09-19 20:56:29 -04:00
36abf67e79 shader/image: Implement SUATOM and fix SUST 2019-09-10 20:22:31 -03:00
77ef4fa907 shader/shift: Implement SHR wrapped and clamped variants
Nvidia defaults to wrapped shifts, but this is undefined behaviour on
OpenGL's spec. Explicitly mask/clamp according to what the guest shader
requires.
2019-09-04 01:55:24 -03:00
81fbc5370d Merge pull request #2812 from ReinUsesLisp/f2i-selector
shader_ir/conversion: Implement F2I and F2F F16 selector
2019-09-03 22:35:33 -04:00
d4f33b822b Merge pull request #2811 from ReinUsesLisp/fsetp-fix
float_set_predicate: Add missing negation bit for the second operand
2019-09-03 22:34:34 -04:00
e3534700d7 shader_ir/conversion: Split int and float selector and implement F2F H1 2019-08-28 16:09:33 -03:00
b13fbc25b8 shader_ir/conversion: Implement F2I F16 Ra.H1 2019-08-27 23:40:40 -03:00
6207751b00 float_set_predicate: Add missing negation bit for the second operand 2019-08-27 21:57:43 -03:00
4e35177e23 shader_ir: Implement VOTE
Implement VOTE using Nvidia's intrinsics. Documentation about these can
be found here
https://developer.nvidia.com/reading-between-threads-shader-intrinsics

Instead of using portable ARB instructions I opted to use Nvidia
intrinsics because these are the closest we have to how Tegra X1
hardware renders.

To stub VOTE on non-Nvidia drivers (including nouveau) this commit
simulates a GPU with a warp size of one, returning what is meaningful
for the instruction being emulated:

* anyThreadNV(value) -> value
* allThreadsNV(value) -> value
* allThreadsEqualNV(value) -> true

ballotARB, also known as "uint64_t(activeThreadsNV())", emits

VOTE.ANY Rd, PT, PT;

on nouveau's compiler. This doesn't match exactly to Nvidia's code

VOTE.ALL Rd, PT, PT;

Which is emulated with activeThreadsNV() by this commit. In theory this
shouldn't really matter since .ANY, .ALL and .EQ affect the predicates
(set to PT on those cases) and not the registers.
2019-08-21 14:50:38 -03:00
cedc1aab4a Merge pull request #2753 from FernandoS27/float-convert
Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
2019-08-21 10:27:57 -04:00
2ff8044806 shader_ir: Implement NOP 2019-08-04 03:02:55 -03:00
11f4e739bd Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
This commit takes care of implementing the F16 Variants of the 
conversion instructions and makes sure conversions are done.
2019-07-20 17:38:25 -04:00
6c4985edc9 shader/half_set_predicate: Implement missing HSETP2 variants 2019-07-19 22:20:47 -03:00
1bdb59fc6e Merge pull request #2695 from ReinUsesLisp/layer-viewport
gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
2019-07-15 16:28:07 -04:00
0ec9da2f9f Merge pull request #2692 from ReinUsesLisp/tlds-f16
shader/texture: Add F16 support for TLDS
2019-07-14 08:44:38 -04:00
8a6fc529a9 shader_ir: Implement BRX & BRA.CC 2019-07-09 08:14:37 -04:00
c9d886c84e gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
This commit implements gl_ViewportIndex and gl_Layer in vertex and
geometry shaders. In the case it's used in a vertex shader, it requires
ARB_shader_viewport_layer_array. This extension is available on AMD and
Nvidia devices (mesa and proprietary drivers), but not available on
Intel on any platform. At the moment of writing this description I don't
know if this is a hardware limitation or a driver limitation.

In the case that ARB_shader_viewport_layer_array is not available,
writes to these registers on a vertex shader are ignored, with the
appropriate logging.
2019-07-07 20:42:55 -03:00
d0966b9f7c shader/texture: Add F16 support for TLDS 2019-07-07 16:05:56 -03:00
4d63f97945 shader_bytecode: Include missing <array> 2019-06-24 01:51:02 -03:00
06c4ce8645 shader: Decode SUST and implement backing image functionality 2019-06-20 21:38:33 -03:00
4e81fc8296 shader: Implement texture buffers 2019-06-20 21:36:12 -03:00
a32c52b1d8 shader_bytecode: Mark EXIT as flow instruction 2019-06-04 12:18:35 -04:00
75e7b45d69 shader/memory: Implement ST (generic memory) 2019-05-20 22:41:53 -03:00
f78ef617b6 shader/memory: Implement LD (generic memory) 2019-05-20 22:38:59 -03:00
d4df803b2b shader_ir/other: Implement IPA.IDX 2019-05-02 21:46:37 -03:00
71aa9d0877 shader_ir/memory: Implement physical input attributes 2019-05-02 21:46:25 -03:00
7632a7d6d2 shader_bytecode: Add AL2P decoding 2019-05-02 21:46:25 -03:00
da0c3bc658 Merge pull request #2407 from FernandoS27/f2f
Do some corrections in conversion shader instructions.
2019-04-20 00:42:34 -04:00
5bd5140bde Merge pull request #2348 from FernandoS27/guest-bindless
Implement Bindless Textures on Shader Decompiler and GL backend
2019-04-17 20:59:49 -04:00
0cfbd3325b Merge pull request #2315 from ReinUsesLisp/severity-decompiler
shader_ir/decode: Reduce the severity of common assertions
2019-04-16 22:21:19 -04:00
aa471274d9 Do some corrections in conversion shader instructions.
Corrects encodings for I2F, F2F, I2I and F2I
Implements Immediate variants of all four conversion types.
Add assertions to unimplemented stuffs.
2019-04-15 19:16:27 -04:00
5c280e6ff0 shader_ir: Implement STG, keep track of global memory usage and flush 2019-04-14 00:25:32 -03:00
353a099481 Merge pull request #2366 from FernandoS27/xmad-fix
Correct XMAD mode, psl and high_b on different encodings.
2019-04-09 19:15:01 -04:00
5c55ae4e18 Correct LOP_IMN encoding 2019-04-08 13:39:12 -04:00
16adc735a5 Correct XMAD mode, psl and high_b on different encodings. 2019-04-08 13:01:17 -04:00
492040bd9c Move ConstBufferAccessor to Maxwell3d, correct mistakes and clang format. 2019-04-08 11:36:11 -04:00
4841440382 Implement TXQ_B 2019-04-08 11:29:52 -04:00
ac3ba9a33e Corrections to TEX_B 2019-04-08 11:28:44 -04:00
e28fd3d0a5 Implement Bindless Samplers and TEX_B in the IR. 2019-04-08 11:23:42 -04:00
04979560fb shader_ir/memory: Reduce severity of LD_L cache management and log it 2019-04-03 17:12:44 -03:00
24abeb9a67 shader_ir/memory: Reduce severity of ST_L cache management and log it 2019-04-03 17:12:44 -03:00
633ce92908 Merge pull request #2147 from ReinUsesLisp/texture-clean
shader_ir: Remove "extras" from the MetaTexture
2019-03-10 17:28:36 -04:00
f9ee0dc7ee video_core/engines: Remove unnecessary includes
Removes a few unnecessary dependencies on core-related machinery, such
as the core.h and memory.h, which reduces the amount of rebuilding
necessary if those files change.

This also uncovered some indirect dependencies within other source
files. This also fixes those.
2019-03-05 20:35:32 -05:00
5ca63d0675 shader/decode: Remove extras from MetaTexture 2019-02-26 00:11:30 -03:00
48e6f77c03 shader/decode: Split memory and texture instructions decoding 2019-02-26 00:11:30 -03:00
10682ad7e0 shader_decompiler: Improve Accuracy of Attribute Interpolation. 2019-02-14 03:25:07 -04:00
f5ec165e8c Corrected F2I None mode to RoundEven. 2019-02-11 18:46:45 -04:00
72c70d6808 Merge pull request #2081 from ReinUsesLisp/lmem-64
shader_ir/memory: Add LD_L 64 bits loads
2019-02-05 09:17:48 -05:00
bb4549a73d Merge pull request #2082 from FernandoS27/txq-stl
Fix TXQ not using the component mask.
2019-02-04 20:22:32 -05:00