Commit Graph

5622 Commits

Author SHA1 Message Date
e8bd9aed8b video_core: Use a CV for blocking commands.
There is no need for a busy loop here. Let's just use a condition variable to save some power.
2021-04-07 22:38:52 +02:00
e6fb49fa4b video_core/gpu_thread: Keep the write lock for allocating the fence.
Else the fence might get submited out-of-order into the queue, which makes testing them pointless.
Overhead should be tiny as the mutex is just moved from the queue to the writing code.
2021-04-07 22:38:52 +02:00
5145133a60 video_core/gpu_thread: Implement a ShutDown method.
This was implicitly done by `is_powered_on = false`, however the explicit method allows us to block until the GPU is actually gone.

This should fix a race condition while removing the other subsystems while the GPU is still active.
2021-04-07 22:38:52 +02:00
4aec060f6d common/threadsafe_queue: Provide Wait() method.
It shall block until there is something to consume in the queue.

And use it for the GPU emulation instead of the spin loop.
This is only in booting the emulator, however in BOTW this is the case for about 1 second.
2021-04-07 22:38:52 +02:00
a60653dcd3 vp9: Avoid memcpy with null pointers
Avoid sending null pointer to memcpy as reported by Undefined Behaviour
Sanitizer. Replaces the std::memcpy calls in SpliceVectors with
std::copy calls. Opting to replace all the memcpy's with copy's.

Co-authored-by: LC <mathew1800@gmail.com>
2021-04-05 00:44:38 -04:00
5ee669466f Merge pull request #5927 from ameerj/astc-compute
video_core: Accelerate ASTC texture decoding using compute shaders
2021-03-30 19:31:52 -03:00
bf1c1788ca nvdrv: Cleanup CDMA Processor on device closure
Brings us a step closer to unifying all channels to share a common interface.
2021-03-30 20:37:40 +11:00
9b50b23a50 vulkan_common: enable OpenGL interop on other Unices 2021-03-30 00:25:25 +00:00
2f83d9a61b astc_decoder: Refactor for style and more efficient memory use 2021-03-25 16:53:51 -04:00
8c016b02e7 gl_device: unblock async shaders on other Unix systems
Mesa is the primary OpenGL provider on all FreeDesktop systems.
For example, iris is used on Intel GPU + FreeBSD by default.
2021-03-24 19:59:20 +00:00
538f097f97 gl_device: Block async shaders on AMD and Intel
Currently, the Windows versions of the Intel OpenGL driver and the AMD
proprietary OpenGL driver do not properly support (or in fact degrade)
when asynchronous shader compilation is enabled. This blocks
specifically those drivers from using this feature. This affects
AMDGPU-PRO on Linux, and AMD's and Intel's OpenGL drivers on Windows.
2021-03-21 01:25:45 -04:00
2f30c10584 astc_decoder: Reimplement Layers
Reimplements the approach to decoding layers in the compute shader. Fixes multilayer astc decoding when using Vulkan.
2021-03-13 12:16:03 -05:00
c7553abe89 astc_decoder: Fix out of bounds memory access
resolves a crash with some anamolous textures found in Astral Chain.
2021-03-13 12:16:03 -05:00
20eb368e14 renderer_vulkan: Accelerate ASTC decoding
Co-Authored-By: Rodrigo Locatti <reinuseslisp@airmail.cc>
2021-03-13 12:16:03 -05:00
f6566338eb host_shaders: Modify shader cmake integration to allow for larger shaders
using a raw string to encapsulate the entire shader code limits us to shaders of size less than 2KB. This change overcomes this limitation.
2021-03-13 12:16:03 -05:00
2985e5e94c renderer_opengl: Accelerate ASTC texture decoding with a compute shader
ASTC texture decoding is currently handled by a CPU decoder for GPU's without native ASTC decoding support (most desktop GPUs). This is the cause for noticeable performance degradation in titles which use the format extensively.

This commit adds support to accelerate ASTC decoding using a compute shader on OpenGL for GPUs without native support.
2021-03-13 12:16:03 -05:00
4735d18bb9 Merge pull request #6028 from bunnei/raster-cache
video_core: rasterizer_accelerated: Use a flat array instead of interval_map for cached pages.
2021-03-12 21:57:27 -08:00
a9d24b0df3 video_core: rasterizer_accelerated: Fix un/signed mismatch. 2021-03-12 21:52:49 -08:00
daf5c5060b Merge pull request #5891 from ameerj/bgra-ogl
renderer_opengl: Use compute shaders to swizzle BGR textures on copy
2021-03-09 02:47:51 -03:00
d1a7b2eca7 Merge pull request #6021 from ReinUsesLisp/skip-cache-heuristic
buffer_cache: Heuristically decide to skip cache on uniform buffers
2021-03-08 17:48:55 -08:00
5213f70230 texture_cache: Blacklist BGRA8 copies and views on OpenGL
In order to force the BGRA8 conversion on Nvidia using OpenGL, we need to forbid texture copies and views with other formats.

This commit also adds a boolean relating to this, as this needs to be done only for the OpenGL api, Vulkan must remain unchanged.
2021-03-04 14:14:49 -05:00
0639244d85 renderer_opengl: Swizzle BGR textures on copy
OpenGL does not natively support BGR internal formats, which causes many BGR textures to render incorrectly, with Red and Blue channels swapped.

This commit aims to address this by swizzling the blue and red channels on texture copies when a BGR format is encountered.
2021-03-04 14:14:19 -05:00
b8b5891585 Merge pull request #5989 from ReinUsesLisp/cmdpool
vk_command_pool: Reduce the command pool size from 4096 to 4
2021-03-04 11:07:31 -08:00
50ee9c46ab video_core: rasterizer_accelerated: Fix delta check ordering. 2021-03-02 17:48:02 -08:00
6ab839462c video_core: rasterizer_accelerated: Improve error handling & fix implicit conversion. 2021-03-02 17:44:02 -08:00
94da1e8a7e video_core: rasterizer_accelerated: Use a flat array instead of interval_map for cached pages.
- Uses a fixed 64MB for the cache instead of an ever growing map.
- Slightly faster by using atomics instead of a single mutex for access.
- Thanks for Rodrigo for the idea.
2021-03-02 16:57:53 -08:00
5ad62e7bfc buffer_cache: Heuristically decide to skip cache on uniform buffers
Some games benefit from skipping caches (Pokémon Sword), and others
don't (Animal Crossing: New Horizons). Add an heuristic to decide this
at runtime.

The cache hit ratio has to be ~98% or better to not skip the cache.
There are 16 frames of buffer.
2021-03-02 02:44:19 -03:00
52e9d7fa49 gpu_thread: Remove Async NVDEC placeholders
This commit removes early placeholders for an implementation of async nvdec. With recent changes to the source code, the placeholders are no longer accurate, and can cause a nullptr dereference due to the nature of the cdma_pusher lifetime.
2021-02-28 22:03:00 -05:00
55f556c53e Merge pull request #5984 from jbeich/gcc-freebsd
common,video-core: unbreak GCC 11 build on FreeBSD 13
2021-02-27 14:15:00 -07:00
09f7c355c6 Merge pull request #5953 from bunnei/memory-refactor-1
Kernel Rework: Memory updates and refactoring (Part 1)
2021-02-27 12:48:35 -07:00
d31dbb1bc1 Implement glDepthRangeIndexeddNV 2021-02-24 22:26:53 +00:00
aae399c1a8 vk_command_pool: Reduce the command pool size from 4096 to 4
This allows drivers to reuse memory more easily and preallocate less.
The optimal number has been measured booting Pokémon Sword.
2021-02-23 19:08:24 -03:00
1841ca4b9b video_core: add missing header after 468bd9c1b0
src/video_core/shader_notify.cpp: In member function 'void VideoCore::ShaderNotify::MarkShaderComplete()':
src/video_core/shader_notify.cpp:33:10: error: 'unique_lock' is not a member of 'std'
   33 |     std::unique_lock lock{mutex};
      |          ^~~~~~~~~~~
src/video_core/shader_notify.cpp:6:1: note: 'std::unique_lock' is defined in header '<mutex>'; did you forget to '#include <mutex>'?
    5 | #include "video_core/shader_notify.h"
  +++ |+#include <mutex>
    6 |
src/video_core/shader_notify.cpp: In member function 'void VideoCore::ShaderNotify::MarkSharderBuilding()':
src/video_core/shader_notify.cpp:38:10: error: 'unique_lock' is not a member of 'std'
   38 |     std::unique_lock lock{mutex};
      |          ^~~~~~~~~~~
src/video_core/shader_notify.cpp:38:10: note: 'std::unique_lock' is defined in header '<mutex>'; did you forget to '#include <mutex>'?
2021-02-23 00:04:36 +00:00
20245e660f Merge pull request #5936 from Kelebek1/Offsets
Offsets for TexelFetch and TextureGather in Vulkan
2021-02-21 21:23:45 -07:00
1a5d4d7840 gl_disk_shader_cache: Log total shader entries count on game load 2021-02-20 11:08:19 -05:00
728ee181eb Merge pull request #5924 from ReinUsesLisp/inline-bindings
vk_update_descriptor: Inline and improve code for binding buffers
2021-02-19 12:27:10 -08:00
93e20867b0 hle: kernel: Migrate PageHeap/PageTable to KPageHeap/KPageTable. 2021-02-18 16:16:25 -08:00
9cae3e6e90 Merge pull request #4973 from ameerj/nvdec-opt
nvdec: Reuse allocated buffers and general cleanup
2021-02-18 15:12:07 -08:00
24d0cc3ab8 vk_rasterizer: Fix loading shader addresses twice
This was recently introduced on a wrongly rebased commit.
2021-02-15 21:34:13 -03:00
cffa6f4e62 Merge pull request #5923 from ReinUsesLisp/vk-dirty-pipeline
fixed_pipeline_cache: Use dirty flags to lazily update key
2021-02-15 13:17:27 -08:00
9d8f793969 Review 1 2021-02-15 05:26:28 +00:00
fb54c38631 Implement texture offset support for TexelFetch and TextureGather and add offsets for Tlds
Formatting
2021-02-15 00:36:37 +00:00
eae9f2e440 yuzu: Various frontend improvements to avoid crashes and improve experience on Linux. 2021-02-14 00:20:41 -08:00
b8ffdbb167 vk_resource_pool: Load GPU tick once and compare with it
Other minor style improvements. Rename free_iterator to hint_iterator,
to describe better what it does.
2021-02-13 17:53:58 -03:00
21b40de318 vk_update_descriptor: Inline and improve code for binding buffers
Allow compilers with our settings inline hot code.
2021-02-13 17:46:24 -03:00
70353649d7 fixed_pipeline_cache: Use dirty flags to lazily update key
Use dirty flags to avoid building pipeline key from scratch on each draw
call. This saves a bit of unnecesary work on each draw call.
2021-02-13 17:44:47 -03:00
c7325c6a4c gl_texture_cache: Lazily create non-sRGB texture views for sRGB formats
This creates non-sRGB texture views for sRGB texture formats to allow for interfacing with these views in compute shaders using imageLoad and imageStore.

Co-Authored-By: Rodrigo Locatti <reinuseslisp@airmail.cc>
2021-02-13 13:27:50 -05:00
b675c44e49 rebase, fix name shadowing, more const 2021-02-13 13:07:56 -05:00
3c37d66c28 Address PR feedback
Co-Authored-By: LC <712067+lioncash@users.noreply.github.com>
2021-02-13 13:07:56 -05:00
09722cb4a7 streamline cdma_pusher/command_classes 2021-02-13 13:07:56 -05:00