Alan Hayward
2018-11-07 15:35:23 UTC
I’ve been rethinking the work I’ve been doing for supporting the Aarch64 vector
length changing whilst a process is running, and am considering a different
approach, as my current version has some problems. Below is a summary of the
current state of play, a description of the patches I’ve been writing, the
holes in that approach, and the outline for an alternative version.
Does anyone have any thoughts on this?
Current state of HEAD for Aarch64 SVE:
Upon startup read the current vector length, then create a target description,
where all the registers are a fixed size based off the current vector length.
If the vector length changes, then the vector registers will be shown with
incorrect lengths.
The need to support vector lengths changing at runtime:
In the general case, it’s not expected that Aarch64 SVE applications will
change the vector length during runtime. However, Linux does provide an
interface for changing it per thread, and specific use cases include:
* Specific hardware is known to work faster at a given length for a given loop
* For debugging purposes
* Ultra cautious application restricts the vector length due to the vendor
being unable to test on larger vector lengths.
In addition we should take into account future variable architectures, such
as RISC-V where the vector length is expected to be changed before every
vector loop and the maximum possible size of a register is not fixed by the
architecture spec.
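
For reference, the Linux per-thread interface mentioned above is prctl() with
PR_SVE_GET_VL / PR_SVE_SET_VL. The sketch below is only an illustration of how
a thread might query or request a length; the helper names are made up for
this example, and on a kernel or CPU without SVE both helpers simply return -1.

```cpp
#include <cassert>
#if defined(__linux__)
#include <sys/prctl.h>
#endif

// Hypothetical helper: returns the calling thread's SVE vector length in
// bytes, or -1 if SVE is unavailable (no hardware support, old kernel, or
// headers that predate the SVE prctl options).
long
current_sve_vl_bytes ()
{
#if defined(__linux__) && defined(PR_SVE_GET_VL) && defined(PR_SVE_VL_LEN_MASK)
  int ret = prctl (PR_SVE_GET_VL);
  if (ret < 0)
    return -1;
  // The low bits hold the length; the remaining bits are flags.
  return ret & PR_SVE_VL_LEN_MASK;
#else
  return -1;
#endif
}

// Hypothetical helper: asks the kernel to change the calling thread's vector
// length. Returns the length actually in effect afterwards (the kernel may
// pick a smaller supported length), or -1 on failure.
long
request_sve_vl_bytes (unsigned long vl_bytes)
{
#if defined(__linux__) && defined(PR_SVE_SET_VL) && defined(PR_SVE_VL_LEN_MASK)
  int ret = prctl (PR_SVE_SET_VL, vl_bytes);
  if (ret < 0)
    return -1;
  return ret & PR_SVE_VL_LEN_MASK;
#else
  return -1;
#endif
}
```

Because the setting is per thread, two threads in the same inferior can
legitimately report different lengths, which is the case GDB needs to cope
with.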
My original variable plan:
“Target descriptions are fixed. Obtain a new target description when the vector
length changes.”
I’ve uploaded a set of patches to the branch users/ahayward/variable_sve2.
When the inferior stops, GDB makes a call to get the gdbarch. By overriding
thread_architecture(), we re-read the vector length and, if it has changed,
get a different target description, either by creating a new one or grabbing
it from a cache.
Similar functionality is added to gdbserver.
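
To make the "create a new one or grab it from a cache" step concrete, here is
a rough, self-contained sketch. It is not the code on the branch: fake_tdesc
and tdesc_cache are stand-ins invented for this example, and the cache is
keyed on a plain integer representing the vector length.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>

// Stand-in for a real target description; it only remembers the vector
// length it was built for.
struct fake_tdesc
{
  uint64_t vector_length;
};

// Hypothetical cache: one description per vector length, created on first
// use and reused on every later request for the same length.
class tdesc_cache
{
public:
  const fake_tdesc *get (uint64_t vector_length)
  {
    auto it = m_cache.find (vector_length);
    if (it == m_cache.end ())
      {
        std::unique_ptr<fake_tdesc> desc (new fake_tdesc { vector_length });
        it = m_cache.emplace (vector_length, std::move (desc)).first;
        ++m_created;
      }
    return it->second.get ();
  }

  // Number of descriptions built so far (useful for checking reuse).
  int created () const { return m_created; }

private:
  std::map<uint64_t, std::unique_ptr<fake_tdesc>> m_cache;
  int m_created = 0;
};
```

A thread that switches from length 2 to 4 and back to 2 would get the original
description back on the second request rather than a freshly built one.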
The problem is, target descriptions are per process not per thread. I’m not
sure what would be needed to change it. Every access to the target description
would now require the thread id? I’m not sure of the impact of this across
large threaded applications.
This also currently causes the regcache to get recreated on each switch.
Alternative plan:
“Target descriptions understand variable registers.”
I’ve not yet coded any of this, and it exists just as an idea.
The ability for a register to be variable sized is added to target
descriptions.
We know the maximum size of the registers at runtime (by asking the kernel).
Code would create the scalar register (vg here), then pass that in when
creating a variable register:
tdesc_reg *vg_reg = tdesc_create_reg (feature, "vg", regnum++, 1, NULL, 64, "int");
tdesc_create_variable_reg (feature, "z0", regnum++, 1, NULL, 128, "svev", vg_reg, max_vg_value);
In xml:
<reg name="z0" bitsize="128" type="aarch64v" variable="yes" scalar_reg="vg" maxbitsize="2048" />
Size of variable reg = min (*scalar_reg * bitsize, maxbitsize)
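
As a worked illustration of that formula, a minimal sketch follows. The struct
and function names are invented for this example; scalar_value is whatever the
scalar register currently holds, and exactly how that value relates to SVE's
vg is left open here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical description of a variable register: the per-unit bitsize and
// the upper bound taken from the XML attributes above.
struct variable_reg_desc
{
  uint32_t bitsize;
  uint32_t maxbitsize;
};

// Size of variable reg = min (scalar_value * bitsize, maxbitsize).
uint32_t
variable_reg_bits (const variable_reg_desc &reg, uint64_t scalar_value)
{
  uint64_t scaled = scalar_value * reg.bitsize;
  return static_cast<uint32_t> (std::min<uint64_t> (scaled, reg.maxbitsize));
}
```

With bitsize 128 and maxbitsize 2048, a scalar value of 4 gives 512 bits, and
any value of 16 or more is clamped to 2048 bits.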
Similar support needs adding to the register types.
The register cache can use the maximum size of each register for allocating the
cache. This is perfectly fine for SVE (I also suspect most programs will be
running using the maximum length allowed). Reading/writing a register needs to
limit to the current size. Printing/Setting a register will have to read the
associated variable register and then scale accordingly.
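
The allocate-at-maximum, access-at-current-size idea could look roughly like
the sketch below. This is a toy with a single register and invented names
(toy_regcache and its methods), not GDB's regcache interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Toy cache for one variable sized register. Storage is allocated once at
// the maximum size; reads and writes are limited to the current size.
class toy_regcache
{
public:
  explicit toy_regcache (size_t max_bytes)
    : m_buf (max_bytes, 0), m_current (max_bytes)
  {}

  // Called when the associated scalar register changes; never exceeds the
  // allocated maximum.
  void set_current_size (size_t bytes)
  {
    m_current = std::min (bytes, m_buf.size ());
  }

  size_t current_size () const { return m_current; }

  // Copies at most current_size () bytes in; returns the count copied.
  size_t write (const uint8_t *src, size_t len)
  {
    size_t n = std::min (len, m_current);
    std::memcpy (m_buf.data (), src, n);
    return n;
  }

  // Copies at most current_size () bytes out; returns the count copied.
  size_t read (uint8_t *dst, size_t len) const
  {
    size_t n = std::min (len, m_current);
    std::memcpy (dst, m_buf.data (), n);
    return n;
  }

private:
  std::vector<uint8_t> m_buf;
  size_t m_current;
};
```

The upside of this layout is that a length change only updates one size field
and never reallocates the cache, which avoids the regcache recreation seen
with the original plan.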
Additional work would be required for RISC-V as the maximum register size is
not fixed; I suspect that is mostly in the regcache handling.
As a side note, it's probably worth mentioning SVE ACLE (currently in beta, see
https://static.docs.arm.com/100987/0000/acle_sve_100987_0000_00_en.pdf).
These are C extensions which add variable size types such as svint32_t (a vector
containing 32-bit integers). There are clear use cases for a user writing code
using this and I suspect it would require extending the type system in GDB to
allow for variable sized types. This would be similar to the changes made to GCC
to support compiling to SVE, which was a large piece of work. There is no support
(yet) for SVE ACLE in GCC, but it's something to think ab