I've posted about this work several times now, including:
http://sourceware.org/ml/gdb/2005-05/msg00074.html
http://sourceware.org/ml/gdb/2005-05/msg00171.html
http://sourceware.org/ml/gdb/2006-01/msg00257.html
http://sourceware.org/ml/gdb/2006-03/msg00031.html
It is in a much more concrete stage of development than it's ever been
before; almost everything described in the proposed documentation
actually works now, and I am generally happy with the format of the
descriptions. You can find the code on
gdb-csl-available-20060303-branch in the CVS repository. It's
currently only wired up for ARM, and there aren't any sample stub
implementations - that'll be along.

I'd particularly like to thank Paul Brook for some valuable
suggestions, and Jim Blandy for both suggestions and turning my messy
notes into the coherent Texinfo below (and for using "Self-Describing",
which I think is the phrase I'd been fumbling around for for a while).

I would appreciate comments on the sample and documentation; while
there are a lot of things left on my to-do list for this project,
most of them are nice to have rather than important. What's in CVS
is enough to be very, very useful. If other GDB developers like
the path it's taking, I'd prefer to do future development on it
in mainline, instead of on a branch.

Here's a small, but useful, sample description. Then, below it, Jim's
documentation - which includes details on what the description means.
This description matches the current layout of the ARM register cache.
In target.xml:
<?xml version="1.0"?>
<!DOCTYPE target SYSTEM "gdb-target.dtd">
<target>
  <xi:include href="arm-core.xml"/>
  <xi:include href="arm-fpa.xml"/>
  <feature-set>
    <feature-ref name="org.gnu.gdb.arm.core" base-regnum="0"/>
    <feature-ref name="org.gnu.gdb.arm.fpa" base-regnum="16"/>
  </feature-set>
</target>
In arm-core.xml:
<?xml version="1.0"?>
<!DOCTYPE feature SYSTEM "gdb-target.dtd">
<feature name="org.gnu.gdb.arm.core">
  <reg name="r0" bitsize="32"/>
  <reg name="r1" bitsize="32"/>
  <reg name="r2" bitsize="32"/>
  <reg name="r3" bitsize="32"/>
  <reg name="r4" bitsize="32"/>
  <reg name="r5" bitsize="32"/>
  <reg name="r6" bitsize="32"/>
  <reg name="r7" bitsize="32"/>
  <reg name="r8" bitsize="32"/>
  <reg name="r9" bitsize="32"/>
  <reg name="r10" bitsize="32"/>
  <reg name="r11" bitsize="32"/>
  <reg name="r12" bitsize="32"/>
  <reg name="r13" bitsize="32"/>
  <reg name="r14" bitsize="32"/>
  <reg name="r15" bitsize="32"/>
  <!-- The CPSR is register 25, rather than register 16, because
       the FPA registers historically were placed between the PC
       and the CPSR in the "g" packet. -->
  <reg name="cpsr" bitsize="32" regnum="25"/>
</feature>
In arm-fpa.xml:
<?xml version="1.0"?>
<!DOCTYPE feature SYSTEM "gdb-target.dtd">
<feature name="org.gnu.gdb.arm.fpa">
  <reg name="f0" bitsize="96" type="float"/>
  <reg name="f1" bitsize="96" type="float"/>
  <reg name="f2" bitsize="96" type="float"/>
  <reg name="f3" bitsize="96" type="float"/>
  <reg name="f4" bitsize="96" type="float"/>
  <reg name="f5" bitsize="96" type="float"/>
  <reg name="f6" bitsize="96" type="float"/>
  <reg name="f7" bitsize="96" type="float"/>
  <reg name="fps" bitsize="32"/>
</feature>
The documentation:
Appendix F Self-Describing Targets
**********************************
One of the challenges of using GDB to debug embedded systems is that
there are so many minor variants of each processor architecture in use.
It is common practice for vendors to start with a standard processor
core -- ARM, PowerPC, or MIPS, for example -- and then make changes to
adapt it to a particular market niche. Some architectures have
hundreds of variants, available from dozens of vendors. This leads to
a number of problems:

   * With so many different customized processors, it is difficult for
     the GDB maintainers to keep up with the changes.

   * Since individual variants may have short lifetimes or limited
     audiences, it may not be worthwhile to carry information about
     every variant in the GDB source tree.

   * When GDB does support the architecture of the embedded system at
     hand, the task of finding the correct architecture name to give the
     `set architecture' command can be error-prone.

To address these problems, the GDB remote protocol allows a target
system to not only identify itself to GDB, but to actually describe its
own features. This lets GDB support processor variants it has never
seen before -- to the extent that the descriptions are accurate, and
that GDB understands them.
F.1 Retrieving Self-Descriptions
================================
GDB retrieves a target's self-description via the remote protocol using
a `qPart' request (*note the `qPart' request: qPart request.) of the
form:

    qPart:features:read:ANNEX:OFFSET,LENGTH

where ANNEX is the string `target.xml'. The OFFSET and LENGTH
parameters are the offset into the description and the number of bytes
to transfer, as for other `qPart' requests.
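
As a rough illustration of how a client might drive this request, here
is a minimal Python sketch. It is not taken from the GDB sources. The
helper names, the `send' callback standing in for the remote transport,
the 256-byte chunk size, and the hex encoding of OFFSET and LENGTH are
all assumptions made for the example; the callback is assumed to return
the already-decoded bytes of the reply, and an empty reply is assumed to
mean the end of the annex.

```python
from typing import Callable


def qpart_features_request(annex: str, offset: int, length: int) -> str:
    """Format one request. Hex encoding of OFFSET and LENGTH is an assumption."""
    return f"qPart:features:read:{annex}:{offset:x},{length:x}"


def read_annex(send: Callable[[str], bytes], annex: str, chunk: int = 256) -> bytes:
    """Fetch a whole annex, one chunk at a time.

    `send` is a hypothetical transport callback: it takes the request
    text and returns the decoded reply bytes (empty when nothing is left).
    """
    data = bytearray()
    while True:
        piece = send(qpart_features_request(annex, len(data), chunk))
        if not piece:
            break
        data += piece
    return bytes(data)
```

With a real transport, `read_annex(send, "target.xml")` would return the
complete text of the top-level description.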
The `target.xml' annex contains an XML document describing the
target's features; its form is described in *Note Self-Description
Format::.

Feature descriptions may be split into several annexes, which GDB
retrieves and assembles into a complete description. An annex may use
XML Inclusions (http://www.w3.org/TR/xinclude/) to incorporate other
annexes, much as a C header file refers to other headers using
`#include'. GDB first retrieves `target.xml', and then makes further
`qPart' requests as needed to retrieve the annexes referred to by any
`xi:include' elements it finds. Naturally, annexes brought in by
`xi:include' may use `xi:include' themselves.
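
The recursive assembly described above could look roughly like the
following Python sketch. It is only an illustration, not GDB's code.
The `fetch' callback (annex name in, annex text out) is hypothetical,
the cycle check is an addition of the sketch, and it assumes each annex
binds the `xi' prefix to the standard XInclude namespace
(http://www.w3.org/2001/XInclude), which the samples in this message
leave out.

```python
import xml.etree.ElementTree as ET
from typing import Callable

# Tag of <xi:include> once the parser has resolved the namespace prefix.
XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"


def expand(annex: str, fetch: Callable[[str], str], seen: tuple = ()) -> ET.Element:
    """Parse `annex` and replace every xi:include with the annex it names."""
    if annex in seen:
        raise ValueError(f"circular inclusion of {annex}")
    root = ET.fromstring(fetch(annex))
    _expand_children(root, fetch, seen + (annex,))
    return root


def _expand_children(parent: ET.Element, fetch: Callable[[str], str], seen: tuple) -> None:
    """Walk the tree, splicing included annexes in place of xi:include."""
    for index, child in enumerate(list(parent)):
        if child.tag == XI_INCLUDE:
            # Included annexes may themselves use xi:include.
            parent[index] = expand(child.get("href"), fetch, seen)
        else:
            _expand_children(child, fetch, seen)
```

Here `fetch' would normally wrap the `qPart' requests, possibly backed
by a local cache.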
To reduce protocol overhead, a target may supply a special annex
named `CHECKSUMS' that provides 160-bit SHA-1 checksum values for the
annexes it has available. The `CHECKSUMS' annex contains a series of
newline-terminated lines, each of which contains a 40-digit hexadecimal
checksum, two spaces, and the name of an annex with the given checksum.
Here is an example `CHECKSUMS' annex:

    68c94ffc34f8ad2d7bfae3f5a6b996409211c1b1  target.xml
    0e8e850b0580fbaaa0872326cb1b8ad6adda9b0d  mmu.xml
    00f22e5f971ccec05c2acce98caf8cff4343c8cf  fpu.xml
GDB uses these checksums to avoid retrieving a given annex more than
once. When GDB retrieves an annex, it caches its contents locally.
Then, each time GDB thinks the target architecture may have changed
(say, after making a new remote protocol connection, or after starting
a new child process using the extended remote protocol), it retrieves
the `CHECKSUMS' annex afresh. If the checksums show that a particular
annex's contents are the same on the target and in GDB's cache, GDB
avoids fetching it again. If none of the annexes have changed, GDB
needs to retrieve only the `CHECKSUMS' annex.

`CHECKSUMS' need not provide a checksum for every annex available;
if a given annex is not mentioned, GDB will try to retrieve it each
time it thinks the target architecture may have changed. The target
need not provide any `CHECKSUMS' annex at all; this is equivalent to an
empty `CHECKSUMS' annex.
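
For stub authors, here is a minimal Python sketch of both sides of this
scheme: producing a `CHECKSUMS' annex from a set of annexes, and
deciding which annexes need to be fetched again. The function names and
the in-memory cache layout are assumptions made for the example, not
anything defined by the proposal. The line format (SHA-1 digest, two
spaces, annex name) follows the description above and resembles what
the `sha1sum' utility prints in text mode.

```python
import hashlib


def make_checksums(annexes: dict[str, bytes]) -> str:
    """Build the text of a CHECKSUMS annex: digest, two spaces, name, newline."""
    return "".join(
        f"{hashlib.sha1(body).hexdigest()}  {name}\n"
        for name, body in annexes.items()
    )


def parse_checksums(text: str) -> dict[str, str]:
    """Map annex name to its 40-digit hex digest, skipping malformed lines."""
    sums = {}
    for line in text.splitlines():
        digest, sep, name = line.partition("  ")
        if len(digest) == 40 and sep and name:
            sums[name] = digest
    return sums


def annexes_to_fetch(wanted: list[str], cache: dict[str, bytes],
                     target_sums: dict[str, str]) -> list[str]:
    """Return the annexes whose cached copy is missing or cannot be trusted.

    An annex not listed in CHECKSUMS is always fetched again, matching
    the rule that an absent entry gives no caching guarantee.
    """
    stale = []
    for name in wanted:
        cached = cache.get(name)
        if cached is None or target_sums.get(name) != hashlib.sha1(cached).hexdigest():
            stale.append(name)
    return stale
```

An empty `target_sums' mapping models a target with no `CHECKSUMS'
annex: every wanted annex is fetched.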
F.2 Self-Description Format
===========================
A target description annex is an XML (http://www.w3.org/XML/) document
which complies with the Document Type Definition provided in the GDB
sources in `gdb/features/gdb-target.dtd'. This means you can use
generally available tools like `xmllint' to check that your feature
descriptions are well-formed and valid. However, to help people
unfamiliar with XML write descriptions for their targets, we also
describe the grammar here.

At the moment, target descriptions can only describe register sets,
to be accessed via the remote protocol `g', `G', `p' and `P' requests.
We hope to extend the format to include other kinds of information,
like memory maps.
Here is a simple sample target description:

    <?xml version="1.0"?>
    <!DOCTYPE target SYSTEM "gdb-target.dtd">
    <target>
      <feature name="bar">
        <reg name="s0" bitsize="32"/>
        <reg name="s1" bitsize="32" type="float"/>
      </feature>
      <feature-set>
        <feature-ref name="bar" base-regnum="1"/>
      </feature-set>
    </target>

This describes a simple target feature set which only contains two
registers, named `s0' (a 32-bit integer register) and `s1' (a 32-bit
floating point register).
A target description has the overall form:

    <?xml version="1.0"?>
    <!DOCTYPE target SYSTEM "gdb-target.dtd">
    <target>
      FEATURE...
      FEATURE-SET
    </target>

The description is generally insensitive to whitespace and line
breaks, under the usual common-sense rules. The ellipsis (`...') after
FEATURE indicates that FEATURE may appear zero or more times.

Each FEATURE names and describes a single feature of the target; at
the moment, features can only describe register sets. The FEATURE-SET
cites particular features by name, pulling together a complete
description of the target. A FEATURE has the form:

    <feature name="NAME">
      REG...
    </feature>

This defines a feature named NAME; each feature's name must be
unique across the description.
Each REG has the form:

    <reg name="NAME"
         bitsize="SIZE"
         [regnum="NUM"]
         [readonly="READ-ONLY"]
         [save-restore="SAVE-RESTORE"]
         [type="TYPE"]
         [group="GROUP"]/>

Items in [brackets] are optional. The components are as follows:

NAME
     The register's name; it must be unique within the target
     description.

SIZE
     The register's size, in bits.

NUM
     The register's number. If omitted, a register's number is one
     greater than that of the previous register; the first register's
     number defaults to zero. But also see the `feature-ref' element's
     `base-regnum' attribute, below--these register numbers are relative
     to the `base-regnum'.

READ-ONLY
     Whether the register is read-only or not; this must be either
     `yes' or `no'. The default is `no'.

SAVE-RESTORE
     Whether the register should be preserved across inferior function
     calls; this must be either `yes' or `no'. The default is `yes'.

TYPE
     The type of the register. At the moment, TYPE must be either
     `int' or `float'. The default is `int'.

GROUP
     The register group to which this register belongs. At the moment,
     GROUP must be one of `general', `float', or `vector'. If no GROUP
     is specified, GDB will select a register group based on the
     register's type.
A FEATURE-SET binds together a set of features to describe a
complete target. There can be only one FEATURE-SET in a target. Each
FEATURE-SET has the form:

    <feature-set>
      FEATURE-REF...
    </feature-set>

where each FEATURE-REF has the form:

    <feature-ref name="NAME" [base-regnum="N"]/>

This means that the target includes the feature named NAME. If the
`base-regnum' is present, that means that registers in the given
feature are numbered starting with N, until overridden by an explicit
register number.
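
To make the numbering rules concrete, here is a small Python sketch
that computes the final register numbers for a description whose
features are all inline (no `xi:include'). It reflects one reading of
the text above rather than GDB's actual implementation: an explicit
`regnum' is taken as relative to the feature's `base-regnum', later
registers count up from it, and a missing `base-regnum' is assumed to
mean zero, which the documentation does not actually specify. The
function name is made up for the example.

```python
import xml.etree.ElementTree as ET


def number_registers(target_xml: str) -> dict[str, int]:
    """Return a mapping from register name to its final register number."""
    root = ET.fromstring(target_xml)
    features = {f.get("name"): f for f in root.findall("feature")}
    numbers = {}
    for ref in root.findall("feature-set/feature-ref"):
        # Assumption: an absent base-regnum is treated as 0.
        base = int(ref.get("base-regnum", "0"))
        next_num = 0  # the first register in a feature defaults to 0
        for reg in features[ref.get("name")].findall("reg"):
            if reg.get("regnum") is not None:
                next_num = int(reg.get("regnum"))  # explicit override
            numbers[reg.get("name")] = base + next_num
            next_num += 1  # default for the following register
    return numbers
```

Under this reading, the ARM sample at the top of this message numbers
`r0' through `r15' as 0 through 15, `cpsr' as 25, `f0' through `f7' as
16 through 23, and `fps' as 24.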
It can sometimes be valuable to split a target description up into
several different annexes, either for organizational purposes, or to
allow GDB to cache portions of the description that change rarely. To
make this possible, you can replace any feature description with an
inclusion directive of the form:

    <xi:include href="ANNEX"/>

When GDB encounters an element of this form, it will retrieve the
annex named ANNEX (or use its cached copy), and replace the inclusion
directive with the contents of that annex.
--
Daniel Jacobowitz
CodeSourcery
|
> Date: Wed, 29 Mar 2006 11:16:25 -0500
> From: Daniel Jacobowitz <drow false.org>
>
> I would appreciate comments on the sample and documentation

Comments on the documentation are below. Note that I needed to guess
what was in the Texinfo source, since you posted the Info output, so I
could have guessed wrong, and my comments might thus be off the target.

> GDB retrieves a target's self-description via the remote protocol using
> a `qPart' request (*note the `qPart' request: qPart request.) of the form:

This cross-reference looks awkward. I'm guessing that Jim used a
2-argument form of a pxref here. But the second arg is redundant
here because it is a substring of the 1st. Am I missing some valid
reason for using the second argument?

> qPart:features:read:ANNEX:OFFSET,LENGTH
> where ANNEX is the string `target.xml'. The OFFSET and LENGTH

The last line should have a noindent before it.

> parameters are the offset into the description and the number of bytes
> to transfer, as for other `qPart' requests.
>
> The `target.xml' annex contains an XML document describing the
> target's features; its form is described in *Note Self-Description
> Format::.

There's something I don't understand here: is "target.xml" a literal
fixed string that will _always_ appear in the above packet? If it is,
why do we need to mention its name?

> To reduce protocol overhead, a target may supply a special annex
> named `CHECKSUMS' that provides 160-bit SHA1 checksum values for the
> annexes it has available. The `CHECKSUMS' annex contains a series of
> newline-terminated lines, each of which contains a 40-digit hexadecimal
> checksum, two spaces, and the name of an annex with the given checksum.
> Here is an example `CHECKSUMS' annex:
> 68c94ffc34f8ad2d7bfae3f5a6b996409211c1b1  target.xml
> 0e8e850b0580fbaaa0872326cb1b8ad6adda9b0d  mmu.xml
> 00f22e5f971ccec05c2acce98caf8cff4343c8cf  fpu.xml

Shouldn't we document how to generate a checksum for a file?

Other than that, looks fine to me.
|
On 3/29/06, Eli Zaretskii <eliz gnu.org> wrote:
> > Date: Wed, 29 Mar 2006 11:16:25 -0500
> > From: Daniel Jacobowitz <drow false.org>
> >
> > I would appreciate comments on the sample and documentation
>
> Comments on the documentation are below. Note that I needed to guess
> what was in the Texinfo source, since you posted the Info output, so I
> could have guessed wrong, and my comments might thus be off the target.

Thanks very much. We wanted to get comments on the actual design
itself, so we posted legible text instead of a texinfo patch. When it
comes down to posting the final patch, we'll certainly include the
patch to gdb.texinfo in the usual way.

> > GDB retrieves a target's self-description via the remote protocol using
> > a `qPart' request (*note the `qPart' request: qPart request.) of the form:
>
> This cross-reference looks awkward. I'm guessing that Jim used a
> 2-argument form of a pxref here. But the second arg is redundant
> here because it is a substring of the 1st. Am I missing some valid
> reason for using the second argument?

It does look awkward. I wanted to use code for qPart. But I'm not
sure it's worth it; I've simplified it and it seems okay.

> > qPart:features:read:ANNEX:OFFSET,LENGTH
> > where ANNEX is the string `target.xml'. The OFFSET and LENGTH
>
> The last line should have a noindent before it.

Done.

> > parameters are the offset into the description and the number of bytes
> > to transfer, as for other `qPart' requests.
> >
> > The `target.xml' annex contains an XML document describing the
> > target's features; its form is described in *Note Self-Description
> > Format::.
>
> There's something I don't understand here: is "target.xml" a literal
> fixed string that will _always_ appear in the above packet? If it is,
> why do we need to mention its name?

We talk about GDB retrieving other annexes (annices?) later; that
request is used for all of them. I'll try to rephrase this.

> > To reduce protocol overhead, a target may supply a special annex
> > named `CHECKSUMS' that provides 160-bit SHA1 checksum values for the
> > annexes it has available. The `CHECKSUMS' annex contains a series of
> > newline-terminated lines, each of which contains a 40-digit hexadecimal
> > checksum, two spaces, and the name of an annex with the given checksum.
> > Here is an example `CHECKSUMS' annex:
> > 68c94ffc34f8ad2d7bfae3f5a6b996409211c1b1  target.xml
> > 0e8e850b0580fbaaa0872326cb1b8ad6adda9b0d  mmu.xml
> > 00f22e5f971ccec05c2acce98caf8cff4343c8cf  fpu.xml
>
> Shouldn't we document how to generate a checksum for a file?

SHA-1 is the name of the specific hash function that must be used.
I'll clear this up.

> Other than that, looks fine to me.

As always, thanks for the review!
|