PCSX2 Git (2016/12/22) is compiled. PCSX2 is an open source PlayStation 2 (PS2) emulator for the Microsoft Windows and Linux operating systems. With the most recent versions, many PS2 games are playable (although speed limitations have made play-to-completion tests for many games impractical), and several games are claimed to have full functionality.
PCSX2 Git Changelog:
* GSDX: Cleanup warnings on MSVC (#1694)
Explicitly cast some bitfields/local loop variables to uint8 as these functions have uint8 as the parameter datatype.
* Merge pull request #1706 from PCSX2/greg/vif-hash
Greg/vif hash
* vif: update alignment constraint
16B alignment is now useless for nVifBlock (no more SSE)
However update the alignment of bucket to 64B. It will reduce cache miss
probability in the find loop
* vif: use u32 code instead of u8/u16
It avoids memory stalls and greatly reduces the overhead of the dVifUnpack function
Here a vtune summary of this branch (done on SotC init)
dVifUnpack<1> was 14.5% of effective VU thread time
dVifUnpack<1> is now 3.8% of effective VU thread time
I hope it will translate to better fps
* vif: move back the cache seach in the unpack function
Avoid the various move to return the value (actually due to the pointer)
* vif: inline dVifsetVUptr function
It avoid a double cmp/jmp on the dynarec/interpreter mode.
* vif: compute the length during the compilation stage
* vif: replace sse cmp code with standard cmp
Standard instruction are faster to execute besides the CPU can optimize the cmp/jne
SSE
v2: use reference instead of a pointer for find parameter
* vif: increase buckets number to 64K
It allow to compare only 8B in the lookup so SSE could be replaced with general instruction
As a bonus, it allow to compute the hash key with a mov rather than modulo (which was an 'and')
* vif: repack nVifBlock struct
cl/wl can fit in a single byte. Add a 2B length field instead.
It will contains the pre computed length to reduce dVifsetVUptr overhead
* vif: handle the special case 0 in the compilation stage (rather than lookup)
* vif: reorganize dVifUnpack
Inline the execution part
Add a num parameter to dVifsetVUptr
Use a local variable for the nVifBlock instead of a global struct state
The goal is to ease future update of the nVifBlock struct
* vif: new implementation of the hash bucket
Previous implementation saved the both the chain pointer and the chain size
Rational: size is useful to add new element and to detect the end of the chain
Vif cache is rarely miss. So 'add' is barely called and the end of a chain is
barely reached.
New implementation will add a null cell at the end of the chain. As a
cell contains a x86 pointer, if is null you could conclude that you
reach the end of the chain.
The 'add' function will traverse the chain to get the current size. It is
a cold path besides the chain is often short (< 4).
The 'find' function only need to check the startPtr bytes to detect the end
of the loop.
Note: SizeChain was replaced with a std::array
* vif: remove the type template of HashBucket
The class is designed and optimized for the layout of nVifBlock.
Besides it will ease future improvement.
* vif: use intrinsic cast instead of ugly define
* vif: don't allocate vifblock hash on the heap
Avoid an extra indirection to access the hash bucket (Find function)
* vif: improve block compilation management
Safety:
* check remaining space before compilation
* clear hash if recompiler is reset
Perf:
* don't research the hash after a miss
* reduce branching in Unpack/ExecuteUnpack
Note: a potential speed optimization for dVifsetVUptr
Precompute the length and store in the cache. However it need 2B on the
nVifBlock struct. Maybe we can compact cl/wl. Or merge aligned with upkType
(if some bits are useless)
* vif: remove useless state from nVifStruct
Download: PCSX2 Git (2016/12/22)
Source: Here
0 Comments
Post a Comment