Discussion:
[RFC] Support for changing vector lengths
Alan Hayward
2018-11-07 15:35:23 UTC
I’ve been rethinking my work on supporting changes to the AArch64 vector
length whilst a process is running, and am considering a different approach,
as my current version has some problems. Below is a summary of the current
state of play, a description of the patches I’ve been writing, the holes in
that approach, and an outline of an alternative version.
Does anyone have any thoughts on this?


Current state of HEAD for AArch64 SVE:
Upon startup, GDB reads the current vector length, then creates a target
description in which all the registers have a fixed size based on that vector
length. If the vector length changes, the vector registers will be shown with
incorrect lengths.
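
To make the fixed-size behaviour concrete, here is a minimal sketch (not GDB code) of how the SVE register sizes follow from a vector length that is read once. The struct and function names are hypothetical. The relationships assumed are that a Z register is one vector length wide, a predicate register has one bit per vector byte, and vg counts 64-bit granules.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration only: register sizes implied by one vector length.
struct sve_reg_sizes
{
  uint64_t vg;       // vector length in 64-bit granules
  uint64_t z_bytes;  // size of each Z (vector) register, in bytes
  uint64_t p_bytes;  // size of each P (predicate) register, in bytes
};

// VL_BYTES is the vector length in bytes: a multiple of 16, at most 256.
sve_reg_sizes
sizes_for_vector_length (uint64_t vl_bytes)
{
  assert (vl_bytes >= 16 && vl_bytes <= 256 && vl_bytes % 16 == 0);
  return { vl_bytes / 8, vl_bytes, vl_bytes / 8 };
}
```

A description built from a 32-byte vector length fixes z0 at 32 bytes; if the thread later switches to 64 bytes, that description still reports 32.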


The need to support vector lengths changing at runtime:
In the general case, it’s not expected that AArch64 SVE applications will
change the vector length at runtime. However, Linux does provide an interface
for changing it per thread, and specific use cases include:
* Specific hardware is known to work faster at a given length for a given loop
* For debugging purposes
* An ultra-cautious application restricts the vector length because the vendor
was unable to test on larger vector lengths.
In addition we should take into account future variable-length architectures,
such as RISC-V, where the vector length is expected to be changed before every
vector loop and the maximum possible size of a register is not fixed by the
architecture spec.


My original variable plan:
“Target descriptions are fixed. Obtain a new target description when the vector
length changes.”

I’ve uploaded a set of patches to the branch users/ahayward/variable_sve2.
When the inferior stops, GDB makes a call to get the gdbarch. By overriding
thread_architecture(), we re-read the vector length and, if it has changed,
get a different target description, either by creating a new one or by grabbing
it from a cache.
Similar functionality is added to gdbserver.
The problem is that target descriptions are per process, not per thread. I’m not
sure what would be needed to change that. Would every access to the target
description now require the thread id? I’m not sure of the impact of this across
large threaded applications.
This also currently causes the regcache to get recreated on each switch.
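
The "create a new one or grab it from a cache" step can be sketched as follows. This is only an illustration of the idea, not the code on the branch: fake_tdesc and tdesc_cache are hypothetical stand-ins, and the cache key (the vector length in 128-bit quadwords) is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>

// Hypothetical stand-in for a target description; not GDB's target_desc.
struct fake_tdesc
{
  uint64_t vq;  // vector length in 128-bit quadwords
};

// Cache of descriptions, one per vector length seen so far.
class tdesc_cache
{
public:
  // Return the description for VQ, creating it on first use.
  const fake_tdesc *get (uint64_t vq)
  {
    auto it = m_cache.find (vq);
    if (it == m_cache.end ())
      it = m_cache.emplace
	     (vq, std::unique_ptr<fake_tdesc> (new fake_tdesc { vq })).first;
    return it->second.get ();
  }

  std::size_t size () const { return m_cache.size (); }

private:
  std::map<uint64_t, std::unique_ptr<fake_tdesc>> m_cache;
};
```

Each distinct vector length costs one description, and a thread that flips between two lengths keeps getting the same two objects back. It does not address the per-process versus per-thread problem described above.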


Alternative plan:
“Target descriptions understand variable registers.”

I’ve not yet coded any of this, and it exists just as an idea.
The ability for a register to be variable sized is added to target
descriptions.
We know the maximum size of the registers at runtime (by asking the kernel).
Code would create the scalar register (vg), then pass that in when creating a
variable register:
tdesc_reg *vg_reg = tdesc_create_reg (feature, "vg", regnum++, 1, NULL, 64, "int");
tdesc_create_variable_reg (feature, "z0", regnum++, 1, NULL, 128, "svev", vg_reg, max_vg_value);
In xml:
<reg name="z0" bitsize="128" type="aarch64v" variable="yes" scalar_reg="vg" maxbitsize="2048" />
Size of variable reg = min (*scalar_reg * bitsize, maxbitsize)
Similar support needs adding to the register types.
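
The size rule above can be written out as a small function. This is a sketch under assumptions: variable_reg and current_bitsize are hypothetical names, the fields mirror the proposed XML attributes, and how the scalar register's value maps to a unit count is left exactly as the formula states it.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical description of a variable-sized register, mirroring the
// proposed XML attributes bitsize and maxbitsize.
struct variable_reg
{
  uint64_t bitsize;     // size of one unit, in bits
  uint64_t maxbitsize;  // upper bound on the register size, in bits
};

// Current size in bits, given the current value of the scaling register:
// min (scalar_value * bitsize, maxbitsize).
uint64_t
current_bitsize (const variable_reg &reg, uint64_t scalar_value)
{
  return std::min (scalar_value * reg.bitsize, reg.maxbitsize);
}
```

With bitsize 128 and maxbitsize 2048, a scalar value of 4 gives 512 bits, and any value of 16 or more is clamped to 2048.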

The register cache can use the maximum size of each register for allocating the
cache. This is perfectly fine for SVE (I also suspect most programs will be
running using the maximum length allowed). Reading/writing a register needs to
limit to the current size. Printing/Setting a register will have to read the
associated variable register and then scale accordingly.
Additional work would be required for RISC-V as the maximum register size is
not fixed; I suspect that is mostly in the regcache handling.
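
As a rough sketch of the regcache idea (allocate for the maximum, but limit reads and writes to the current size), consider the class below. It is not GDB's regcache: variable_reg_slot and its members are hypothetical, and it models a single register only.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical single-register cache slot; not GDB's regcache.
class variable_reg_slot
{
public:
  explicit variable_reg_slot (std::size_t max_bytes)
    : m_storage (max_bytes, 0), m_current_bytes (max_bytes)
  {}

  // Called when the scaling register changes; clamps to the allocation.
  void set_current_size (std::size_t bytes)
  { m_current_bytes = std::min (bytes, m_storage.size ()); }

  // Copy at most the current size into the slot; returns bytes written.
  std::size_t write (const uint8_t *src, std::size_t len)
  {
    std::size_t n = std::min (len, m_current_bytes);
    std::copy (src, src + n, m_storage.begin ());
    return n;
  }

  // Copy at most the current size out of the slot; returns bytes read.
  std::size_t read (uint8_t *dst, std::size_t len) const
  {
    std::size_t n = std::min (len, m_current_bytes);
    std::copy (m_storage.begin (), m_storage.begin () + n, dst);
    return n;
  }

private:
  std::vector<uint8_t> m_storage;  // always sized for the maximum
  std::size_t m_current_bytes;     // bytes currently valid
};
```

The allocation never changes when the vector length does, so nothing needs recreating on a switch. This relies on a known maximum, which is the part that would not carry over directly to RISC-V.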


As a side note, it's probably worth mentioning SVE ACLE (currently in beta, see
https://static.docs.arm.com/100987/0000/acle_sve_100987_0000_00_en.pdf ).
These are C extensions which add variable-sized types such as svint32_t (a
vector containing 32-bit integers). There are clear use cases for a user writing
code using this, and I suspect it would require extending the type system in GDB
to allow for variable-sized types. This would be similar to the changes made to
GCC to support compiling to SVE, which was a large piece of work. There is no
support (yet) for SVE ACLE in GCC, but it's something to think about.

Thanks,
Alan.
Alan Hayward
2018-12-05 17:23:17 UTC
Ping. Anyone have an opinion on this?

(Not expecting a review on the variable_sve2 branch at this point - just
whether it’s worth continuing in that direction or worth switching over to the
XML version.)

Alan.
John Baldwin
2018-12-05 18:42:36 UTC
I think in "bikeshed" parlance this is an atomic power-plant (see bikeshed.org
if needed). I don't feel super qualified, but my gut feeling is that having
variable-sized registers probably "scales" better in the long run vs having
separate target descriptions. However, it also seems like it will be more
work. It does feel like we want different "views" of registers, and it even
reminds me of debugging a 32-bit process on a 64-bit host, where you have
64-bit registers but are using the compatibility 32-bit view of those same
registers. I don't know if variable-sized registers could be useful for
improving how we deal with those (e.g. "eax" vs "rax" on x86), or if the way we
deal with those might also inform how to treat SVE?
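
For readers unfamiliar with the "view" idea, here is a minimal sketch of a 32-bit view onto a 64-bit register value. The function names are hypothetical and this is not a description of GDB's x86 code; keeping the upper half on a write is an assumption made for the sketch.

```cpp
#include <cstdint>

// Read the narrow view: the low 32 bits of the 64-bit raw register.
uint32_t
read_low32_view (uint64_t raw)
{
  return static_cast<uint32_t> (raw & 0xffffffffu);
}

// Write through the narrow view, keeping the upper 32 bits of the raw
// value (an assumption of this sketch, not a statement about hardware).
uint64_t
write_low32_view (uint64_t raw, uint32_t value)
{
  return (raw & 0xffffffff00000000ull) | value;
}
```

The analogy with SVE is that the storage stays the same size while the visible width depends on context.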
--
John Baldwin

                                                                            
Alan Hayward
2018-12-07 18:43:27 UTC
Thanks for the response. I think maybe my best option is to see if I can
mock together a quick prototype of the views version. I agree that it probably
scales better in the long run.

I’ll have a look at the rax/eax code to see if there’s anything I can steal or
reuse; I’m not quite sure what is there right now.

Alan.