|
Thread: MCA handler support for Xen/ia64
|
|
| MCA handler support for Xen/ia64 |

|
2006-07-12 09:37:06 |
Hi all,
This is a design memo for the MCA handler for Xen/ia64.
We would welcome reviews and comments.
1. Basic design
- The MCA/CMC/CPE handler of Xen/ia64 reuses Linux code as much
  as possible.
- CMC/CPE interruptions are injected into dom0 for logging.
  They are not injected into domU or domVTI.
- If the MCA interruption is a TLB check, the MCA handler
  converts the MCA into a CMC interruption and injects it into
  dom0. It is not injected into domU or domVTI.
- If the MCA interruption is not a TLB check, the MCA handler
  does not try to recover, and Xen/ia64 reboots.
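
To make the decision above concrete, here is a minimal C sketch of the dispatch. It is not taken from the Xen or Linux sources: mca_decide(), struct mca_event and the action names are hypothetical, and treating domain ID 0 as dom0 is an assumption. It also folds in the case from section 2.3, where a TLB check that comes with another error leads to a reboot.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not from the Xen or Linux sources. */
enum mca_action {
    MCA_ACT_INJECT_CMC_TO_DOM0, /* TLB check only: convert to CMC, log via dom0 */
    MCA_ACT_REBOOT              /* anything else: no recovery, reboot Xen/ia64  */
};

struct mca_event {
    bool tlb_check;   /* the error record reports a TLB check            */
    bool other_check; /* it also reports a non-TLB check (cache, bus...) */
};

/* Decide what the MCA handler does with one machine check. */
static enum mca_action mca_decide(const struct mca_event *ev)
{
    if (ev->tlb_check && !ev->other_check)
        return MCA_ACT_INJECT_CMC_TO_DOM0;
    return MCA_ACT_REBOOT;
}

/* Only dom0 (assumed here to have domain ID 0) ever receives the interruption. */
static bool mca_may_inject_into(unsigned int domain_id)
{
    return domain_id == 0;
}
```

The point of keeping the decision in one small function is that the three handler variants in section 2 differ only in what follows this decision.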
2. Detail design
2.1 Initialization of MCA handler
The processing sequence is basically as follows.
1) Clear the rendezvous check-in flag for all CPUs.
2) Register the rendezvous interrupt vector with SAL.
3) Register the wakeup interrupt vector with SAL.
4) Register the Xen/ia64 MCA handler with SAL.
5) Configure the CMCI/P vector and handler. CMC interrupts
   are per-processor, so AP CMC interrupts are set up in
   smp_callin() (smpboot.c).
6) Set up the MCA rendezvous interrupt vector.
7) Set up the MCA wakeup interrupt vector.
8) Set up the CPEI/P handler.
9) Initialize the areas set aside by Xen/ia64 to buffer
   the platform/processor error states for MCA/CMC/CPE
   handling.
10) Read the MCA error record for logging (by dom0) if
    Xen has been rebooted due to an unrecoverable MCA.
2.2 MCA handler (TLB error only)
The processing sequence is basically as follows.
1) Get the processor state parameter on exiting PALE_CHECK,
   then purge the TRs and TCs and reload the TRs.
2) Call ia64_mca_handler().
3) Wait for the slave processors to check in.
4) Wake up all the processors that are spinning in the
   rendezvous loop.
5) Get the MCA error record and hold it in Xen/ia64 for
   logging by dom0.
6) Clear the MCA error record.
7) Inject a CMC external interruption into dom0.
8) Set IA64_MCA_CORRECTED in the ia64_sal_os_state struct.
9) Return to SAL and resume the interrupted processing.
2.3 MCA handler (TLB error together with other errors)
The processing sequence is basically as follows.
1) Get the processor state parameter on exiting PALE_CHECK,
   then purge the TRs and TCs and reload the TRs.
2) Call ia64_mca_handler().
3) Wait for the slave processors to check in.
4) Wake up all the processors that are spinning in the
   rendezvous loop.
5) Get the MCA error record and save it in Xen/ia64 for
   logging by dom0 after reboot. [*1]
6) Return to SAL and reboot Xen/ia64.
2.4 MCA handler (Not TLB error)
The processing sequence is basically as follows.
1) Get the processor state parameter on exiting PALE_CHECK.
2) Call ia64_mca_handler().
3) Wait for the slave processors to check in.
4) Wake up all the processors that are spinning in the
   rendezvous loop.
5) Get the MCA error record and save it in Xen/ia64 for
   logging by dom0 after reboot. [*1]
6) Return to SAL and reboot Xen/ia64.
2.5 CMC handler
The processing sequence is basically as follows.
1) Call ia64_mca_cmc_int_handler() from __do_IRQ() in
   ia64_handle_irq().
2) Get the MCA error record and save it in Xen/ia64 for
   logging by dom0 after reboot. [*1]
3) Inject a CMC external interruption into dom0.
2.6 CPE handler
Same as CMC.
2.7 SAL emulation for Dom0/DomU/DomVTI
The following SAL emulation procedures are added.
- SAL_SET_VECTORS
- SAL_GET_STATE_INFO
- SAL_GET_STATE_INFO_SIZE
- SAL_CLEAR_STATE_INFO
- SAL_MC_SET_PARAMS
Note:
[*1]: In practice, the MCA error record is read again after
Xen/ia64 has rebooted and is then logged by dom0.
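
As a rough illustration of section 2.7, the SAL emulation could branch on the trapped procedure as sketched below. Everything here is an assumption made for the example: the enum values are plain selectors (not the numeric SAL function IDs from the SAL specification), sal_emulate() and the action names are hypothetical, and the behaviour chosen for domU/domVTI (report that no record exists) is only one plausible reading of "errors are not injected into domU or domVTI".

```c
#include <stdbool.h>

/* Illustrative selector for the five emulated procedures. These are NOT the
 * numeric SAL function IDs; those come from the SAL specification. */
enum emu_sal_proc {
    EMU_SAL_SET_VECTORS,
    EMU_SAL_GET_STATE_INFO,
    EMU_SAL_GET_STATE_INFO_SIZE,
    EMU_SAL_CLEAR_STATE_INFO,
    EMU_SAL_MC_SET_PARAMS,
    EMU_SAL_OTHER
};

/* What the (hypothetical) emulation layer does with a trapped call. */
enum emu_action {
    EMU_RECORD_VECTOR,     /* remember the guest's handler, do not touch SAL */
    EMU_USE_ERROR_QUEUE,   /* serve the call from Xen's pending-error state  */
    EMU_REPORT_NO_RECORD,  /* tell the guest there is nothing to read        */
    EMU_REPORT_SIZE,       /* return a buffer size the guest can allocate    */
    EMU_ACCEPT_AND_IGNORE, /* return success without changing the platform   */
    EMU_NOT_EMULATED       /* fall through to the existing SAL emulation     */
};

static enum emu_action sal_emulate(enum emu_sal_proc proc, bool is_dom0)
{
    switch (proc) {
    case EMU_SAL_SET_VECTORS:
        return EMU_RECORD_VECTOR;
    case EMU_SAL_GET_STATE_INFO:
    case EMU_SAL_CLEAR_STATE_INFO:
        /* Errors are only reported to dom0, so other domains see no records. */
        return is_dom0 ? EMU_USE_ERROR_QUEUE : EMU_REPORT_NO_RECORD;
    case EMU_SAL_GET_STATE_INFO_SIZE:
        return EMU_REPORT_SIZE;
    case EMU_SAL_MC_SET_PARAMS:
        return EMU_ACCEPT_AND_IGNORE;
    default:
        return EMU_NOT_EMULATED;
    }
}
```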
Best regards,
Yutaka Ezaki
Masaki Kanno
_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@lists.xensource.com
http://lists.xensource.com/xen-ia64-devel
|
|
| MCA handler support for Xen/ia64 |

|
2006-07-13 04:59:30 |
Hi Masaki,
Thanks for the write-up; it generally looks like a good approach to me.
A few comments and questions:
How do you plan to handle the mismatch between dom0's vCPUs and the
pCPUs reporting errors? For instance, will all pCPUs' CMCs be injected
into dom0 vCPU0? Will all CPE records be returned from all pCPUs when
dom0 does a SAL_GET_STATE_INFO from vCPU0? SAL_GET_STATE_INFO_SIZE may
need to return the platform state info size * number of pCPUs to allow
dom0 enough space to save the records. On big SMP systems we need to
make sure that's not more than can reasonably be allocated in the kernel
by dom0.
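
The size concern can be sketched as a small helper. This is only an illustration: state_info_size_for_dom0() and its cap parameter are hypothetical, and returning 0 to mean "too large, pick another strategy" is an assumption rather than anything SAL defines.

```c
#include <stddef.h>
#include <stdint.h>

/* Size to report to dom0 if one call must return the records of every pCPU.
 * Returns 0 when the product overflows or exceeds `cap`, the largest buffer
 * dom0 is assumed to be able to allocate in the kernel. */
static size_t state_info_size_for_dom0(size_t per_cpu_size,
                                       unsigned int nr_pcpus, size_t cap)
{
    if (nr_pcpus != 0 && per_cpu_size > SIZE_MAX / nr_pcpus)
        return 0; /* multiplication would overflow */

    size_t total = per_cpu_size * nr_pcpus;
    return total <= cap ? total : 0;
}
```

For example, 4 KiB per pCPU on 64 pCPUs gives 256 KiB, while the same record size on 1024 pCPUs gives 4 MiB and would be rejected by a 1 MiB cap.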
What about clearing error records? We need to be careful, because error
records read by Xen and cleared before being passed to dom0 are volatile
and could be lost if the system crashes or if dom0 doesn't retrieve
them. It's best to only clear the log after the error record has been
received by dom0 and dom0 issues a SAL_CLEAR_STATE_INFO. This will get
complicated if we need to clear error records on all pCPUs in response
to a SAL_CLEAR_STATE_INFO on dom0 vCPU0.
Do you plan to support CMC and CPE throttling in Xen (i.e. switching
between interrupt-driven and polling handlers under load) and dynamic
polling intervals?
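
For readers unfamiliar with the idea, throttling can be sketched as follows. This is a simplified model, not the Linux or Xen implementation: the struct, the function names, and both thresholds (5 interrupts within 1000 ticks) are illustrative assumptions, and dynamic polling intervals are left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define CMC_HISTORY 5    /* illustrative: interrupts remembered        */
#define CMC_WINDOW  1000 /* illustrative: window length in timer ticks */

struct cmc_throttle {
    uint64_t stamp[CMC_HISTORY]; /* ring buffer of recent interrupt times */
    unsigned int next;           /* slot to overwrite next (the oldest)   */
    unsigned int seen;           /* valid slots, saturates at CMC_HISTORY */
    bool polling;                /* true while the interrupt is masked    */
};

/* Call on every CMC/CPE interrupt. Returns true when the caller should mask
 * the interrupt and start a polling timer. */
static bool throttle_on_interrupt(struct cmc_throttle *t, uint64_t now)
{
    if (t->polling)
        return false;
    t->stamp[t->next] = now;
    t->next = (t->next + 1) % CMC_HISTORY;
    if (t->seen < CMC_HISTORY)
        t->seen++;
    /* Ring full: stamp[next] is now the oldest remembered interrupt. */
    if (t->seen == CMC_HISTORY && now - t->stamp[t->next] < CMC_WINDOW) {
        t->polling = true;
        return true;
    }
    return false;
}

/* Call from the polling timer. Returns true when polling found nothing and
 * the caller should unmask the interrupt again. */
static bool throttle_on_poll(struct cmc_throttle *t, unsigned int records_found)
{
    if (!t->polling || records_found != 0)
        return false;
    t->polling = false;
    t->seen = 0;
    t->next = 0;
    return true;
}
```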
It may be overly complicated to support CPEI on dom0 (fake MADT
entries, trapping IOSAPIC writes, maybe an entirely virtual IOSAPIC in
order to describe a valid GSI for the CPEI, etc.). Probably best to
start out with just letting dom0 poll for CPE records.
Thanks,
Alex
--
Alex Williamson                             HP Open Source & Linux Org.
|
|
| MCA handler support for Xen/ia64 |

|
2006-07-13 11:35:14 |
Hi Alex,
Thanks for your comments.
Your comments made us aware of problems that we must solve.
Please give us a few days to think them over.
Best regards,
Kan
|
|
| MCA handler support for Xen/ia64 |

|
2006-07-28 12:12:15 |
Hi Alex,
We are awfully sorry to have kept you waiting for a long time.
>Hi Masaki,
>
>   Thanks for the write-up; it generally looks like a good approach to me.
>A few comments and questions:
>
>   How do you plan to handle the mismatch between dom0's vCPUs and the
>pCPUs reporting errors? For instance, will all pCPUs' CMCs be injected
>into dom0 vCPU0? Will all CPE records be returned from all pCPUs when
>dom0 does a SAL_GET_STATE_INFO from vCPU0? SAL_GET_STATE_INFO_SIZE may
>need to return the platform state info size * number of pCPUs to allow
>dom0 enough space to save the records. On big SMP systems we need to
>make sure that's not more than can reasonably be allocated in the kernel
>by dom0.
>
Our design is to inject all CMCs/CPEs into dom0 vCPU0. I think this is
sufficient because the goal of this initial support is logging of
hardware errors, not recovery. See the detailed flow below.
Step 1: Xen receives a CMC/CPE interrupt (1)(2) from each pCPU and
queues (3)(4) these interrupts.
+--------------------------------------+
| +-CMC/CPE handler------------------+ |dom0
| |                                  | |
| +----------------------------------+ |
+--------------------------------------+
| +-vCPU0-+                            |Xen
| |       |     +-------+  +-------+   |
| |   ----------> pCPU0 ---> pCPU1 |   |
| |       |     +-------+  +-------+   |
| +-------+     |status |  |status |   |
|               +-------+  +-------+   |
|                   A          A       |
| +-CMC/CPE handler-|----------|-----+ |
| |                 |(3)       |(4)  | |
| |  queues interrupts               | |
| |  with a handling state           | |
| |         A                A       | |
| +---------|----------------|-------+ |
+-----------|----------------|---------+
| +-pCPU0-+ |      +-pCPU1-+ |         |Hardware
| |   (1)---+      |   (2)---+         |
| +-+-----+        +-+-----+           |
|   |                |                 |
| +-+-----+        +-+-----+           |
| |record0|        |record1|           |
| +-------+        +-------+           |
+--------------------------------------+
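
The queue with a handling state from Step 1 could look roughly like the following C sketch. The type and function names, the three states, and the fixed MAX_PCPUS bound are all hypothetical; locking, which a real hypervisor would need here, is omitted.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_PCPUS 64 /* illustrative upper bound */

/* Handling state kept with each queued interrupt (names are hypothetical). */
enum pend_state { PEND_QUEUED, PEND_INJECTED, PEND_RECORD_READ };

struct pend_entry {
    unsigned int pcpu;     /* pCPU whose SAL record is waiting */
    enum pend_state state;
};

struct pend_queue {
    struct pend_entry e[MAX_PCPUS];
    unsigned int head;     /* index of the oldest entry */
    unsigned int len;      /* number of valid entries   */
};

/* Queue a CMC/CPE seen on `pcpu`; a pCPU already queued is not added twice. */
static bool pend_push(struct pend_queue *q, unsigned int pcpu)
{
    for (unsigned int i = 0; i < q->len; i++)
        if (q->e[(q->head + i) % MAX_PCPUS].pcpu == pcpu)
            return true;
    if (q->len == MAX_PCPUS)
        return false;

    struct pend_entry *slot = &q->e[(q->head + q->len) % MAX_PCPUS];
    slot->pcpu = pcpu;
    slot->state = PEND_QUEUED;
    q->len++;
    return true;
}

/* Oldest entry, i.e. the one to inject into dom0 vCPU0 next, or NULL. */
static struct pend_entry *pend_front(struct pend_queue *q)
{
    return q->len ? &q->e[q->head] : NULL;
}
```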
Step 2: Xen injects (5) the CMCs/CPEs into dom0 vCPU0 one at a time.
Dom0 then issues (6) SAL_GET_STATE_INFO.
+--------------------------------------+
| +-CMC/CPE handler------------------+ |dom0
| |                                  | |
| | SAL_GET_STATE_INFO               | |
| | (6)                              | |
| |  |      A                        | |
| +--|------|------------------------+ |
+--(trap)---|--------------------------+
|    |      |                          |Xen
|    V      |                          |
| +-vCPU0-+ |                          |
| |   (5)---+                          |
| |       |     +-------+  +-------+   |
| |   ----------> pCPU0 ---> pCPU1 |   |
| |       |     +-------+  +-------+   |
| +-------+     |status |  |status |   |
|               +-------+  +-------+   |
+--------------------------------------+
| +-pCPU0-+        +-pCPU1-+           |Hardware
| |       |        |       |           |
| +-+-----+        +-+-----+           |
|   |                |                 |
| +-+-----+        +-+-----+           |
| |record0|        |record1|           |
| +-------+        +-------+           |
+--------------------------------------+
Step 3: Xen traps this SAL call.
If the pCPU whose SAL record is to be read is the one the vCPU is
running on, Xen issues (7) a normal SAL call on that pCPU.
Xen copies (8) the SAL record to dom0.
+--------------------------------------+
| +-CMC/CPE handler------------------+ |dom0
| |                                  | |
| |        (8) Get SAL record        | |
| |         A                        | |
| +---------|------------------------+ |
+-----------|--------------------------+
| +-vCPU0-+ |                          |Xen
| |       | |                          |
| |       | |   +-------+  +-------+   |
| |   ------|---> pCPU0 ---> pCPU1 |   |
| |       | |   +-------+  +-------+   |
| +-------+ |   |status |  |status |   |
|           |   +-------+  +-------+   |
|           |                          |
| SAL_GET_STATE_INFO                   |
|   (7)     |                          |
|    |   [Buffer]                      |
|    |      A                          |
+----|------|--------------------------+
|    V      |                          |Hardware
| +-pCPU0-+ |      +-pCPU1-+           |
| |       | |      |       |           |
| +-+-----+ |      +-+-----+           |
|   |       |        |                 |
| +-+-----+ |      +-+-----+           |
| |record0+-+      |record1|           |
| +-------+        +-------+           |
+--------------------------------------+
Step 4: Dom0 issues (9) SAL_CLEAR_STATE_INFO.
+--------------------------------------+
| +-CMC/CPE handler------------------+ |dom0
| |                                  | |
| | SAL_CLEAR_STATE_INFO             | |
| | (9)                              | |
| |  |                               | |
| +--|-------------------------------+ |
+--(trap)------------------------------+
|    |                                 |Xen
|    V                                 |
| +-vCPU0-+                            |
| |       |     +-------+  +-------+   |
| |   ----------> pCPU0 ---> pCPU1 |   |
| |       |     +-------+  +-------+   |
| +-------+     |status |  |status |   |
|               +-------+  +-------+   |
+--------------------------------------+
| +-pCPU0-+        +-pCPU1-+           |Hardware
| |       |        |       |           |
| +-+-----+        +-+-----+           |
|   |                |                 |
| +-+-----+        +-+-----+           |
| |record0|        |record1|           |
| +-------+        +-------+           |
+--------------------------------------+
Step 5: Xen traps this SAL call.
If the pCPU whose SAL record is to be cleared is the one the vCPU is
running on, Xen issues (10) a normal SAL call on that pCPU.
Xen frees (11) the pCPU0 information.
+--------------------------------------+
| +-CMC/CPE handler------------------+ |dom0
| |                                  | |
| +----------------------------------+ |
+--------------------------------------+
| +-vCPU0-+                            |Xen
| |       |                            |
| |       |                +-------+   |
| |   ---------------------> pCPU1 |   |
| |       |       (11)     +-------+   |
| +-------+                |status |   |
|                          +-------+   |
| SAL_CLEAR_STATE_INFO                 |
|  (10)                                |
+----|---------------------------------+
|    V                                 |Hardware
| +-pCPU0-+        +-pCPU1-+           |
| |       |        |       |           |
| +-------+        +-+-----+           |
|                    |                 |
|                  +-+-----+           |
|                  |record1|           |
|                  +-------+           |
+--------------------------------------+
Step 6: Xen injects (12) the next CMC/CPE into dom0 vCPU0.
Dom0 then issues (13) SAL_GET_STATE_INFO.
+--------------------------------------+
| +-CMC/CPE handler------------------+ |dom0
| |                                  | |
| | SAL_GET_STATE_INFO               | |
| | (13)                             | |
| |  |      A                        | |
| +--|------|------------------------+ |
+--(trap)---|--------------------------+
|    |      |                          |Xen
|    V      |                          |
| +-vCPU0-+ |                          |
| |  (12)---+                          |
| |       |                +-------+   |
| |   ---------------------> pCPU1 |   |
| |       |                +-------+   |
| +-------+                |status |   |
|                          +-------+   |
+--------------------------------------+
| +-pCPU0-+        +-pCPU1-+           |Hardware
| |       |        |       |           |
| +-------+        +-+-----+           |
|                    |                 |
|                  +-+-----+           |
|                  |record1|           |
|                  +-------+           |
+--------------------------------------+
Step 7: Xen traps this SAL call.
If the pCPU whose SAL record is to be read is not the one the vCPU is
running on, Xen sends (14) an IPI to the other pCPU, and Xen on that
pCPU issues (15) the SAL call.
Xen copies (16) the SAL record to dom0.
+--------------------------------------+
| +-CMC/CPE handler------------------+ |dom0
| |                                  | |
| |       (16) Get SAL record        | |
| |         A                        | |
| +---------|------------------------+ |
+-----------|--------------------------+
| +-vCPU0-+ |                          |Xen
| |       | |                          |
| |       | |              +-------+   |
| |   ------|--------------> pCPU1 |   |
| |       | |              +-------+   |
| +-------+ |              |status |   |
|           |              +-------+   |
|           |                          |
|           |   SAL_GET_STATE_INFO     |
| send IPI  |         (15)             |
|  (14)     |         A  |             |
|    |   [Buffer]     |  |             |
|    |      A         |  |             |
|    |      |         |  |             |
+----|------|---------|--|-------------+
|    |      +---------------------+    |Hardware
|    |                |  |        |    |
|    V                |  V        |    |
| +-pCPU0-+        +-pCPU1-+      |    |
| |       |------->|       |      |    |
| +-------+        +-+-----+      |    |
|                    |            |    |
|                  +-+-----+      |    |
|                  |record1+------+    |
|                  +-------+           |
+--------------------------------------+
Step 8: Dom0 issues (17) SAL_CLEAR_STATE_INFO.
+--------------------------------------+
| +-CMC/CPE handler------------------+ |dom0
| |                                  | |
| | SAL_CLEAR_STATE_INFO             | |
| | (17)                             | |
| |  |                               | |
| +--|-------------------------------+ |
+--(trap)------------------------------+
|    |                                 |Xen
|    V                                 |
| +-vCPU0-+                            |
| |       |                +-------+   |
| |   ---------------------> pCPU1 |   |
| |       |                +-------+   |
| +-------+                |status |   |
|                          +-------+   |
+--------------------------------------+
| +-pCPU0-+        +-pCPU1-+           |Hardware
| |       |        |       |           |
| +-------+        +-+-----+           |
|                    |                 |
|                  +-+-----+           |
|                  |record1|           |
|                  +-------+           |
+--------------------------------------+
Step 9: Xen traps this SAL call.
If the pCPU whose SAL record is to be cleared is not the one the vCPU
is running on, Xen sends (18) an IPI to the other pCPU, and Xen on that
pCPU issues (19) the SAL call.
Xen frees (20) the pCPU1 information.
+--------------------------------------+
| +-CMC/CPE handler------------------+ |dom0
| |                                  | |
| +----------------------------------+ |
+--------------------------------------+
| +-vCPU0-+                            |Xen
| |       |                            |
| |       |                  (20)      |
| |       |                            |
| |       |                            |
| +-------+                            |
|               SAL_CLEAR_STATE_INFO   |
| send IPI            (19)             |
|  (18)               A  |             |
+----|----------------|--|-------------+
|    |                |  |             |Hardware
|    V                |  V             |
| +-pCPU0-+        +-pCPU1-+           |
| |       |------->|       |           |
| +-------+        +-------+           |
+--------------------------------------+
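
Steps 3 to 9 boil down to two decisions: where to run the SAL call, and what to do once the record has been cleared. A minimal C sketch of both follows. The names (struct flow, flow_route(), flow_clear_done()) and the small fixed array are assumptions for illustration only, and the actual IPI and SAL calls are left to the caller.

```c
#include <stdbool.h>

#define FLOW_MAX 8 /* illustrative */

enum sal_route { ROUTE_NONE, ROUTE_LOCAL_CALL, ROUTE_IPI };

struct flow {
    unsigned int pending[FLOW_MAX]; /* pCPUs with an unread record, oldest first */
    unsigned int len;
};

/* Steps 3 and 7: pick the pCPU for dom0's SAL_GET_STATE_INFO and decide whether
 * Xen can call SAL directly or must send an IPI to the owning pCPU. */
static enum sal_route flow_route(const struct flow *f, unsigned int current_pcpu,
                                 unsigned int *target_pcpu)
{
    if (f->len == 0)
        return ROUTE_NONE;
    *target_pcpu = f->pending[0];
    return *target_pcpu == current_pcpu ? ROUTE_LOCAL_CALL : ROUTE_IPI;
}

/* Steps 5 and 9: after SAL_CLEAR_STATE_INFO has run on the owning pCPU, free
 * the entry. Returns true if another CMC/CPE must be injected (step 6). */
static bool flow_clear_done(struct flow *f)
{
    if (f->len == 0)
        return false;
    for (unsigned int i = 1; i < f->len; i++)
        f->pending[i - 1] = f->pending[i];
    f->len--;
    return f->len > 0;
}
```

With pCPU0 and pCPU1 pending and dom0 vCPU0 running on pCPU0, the first lookup is a local call, clearing it asks for another injection, and the second lookup goes through an IPI, which matches the sequence in the diagrams.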
>   What about clearing error records? We need to be careful, because error
>records read by Xen and cleared before being passed to dom0 are volatile
>and could be lost if the system crashes or if dom0 doesn't retrieve
>them. It's best to only clear the log after the error record has been
>received by dom0 and dom0 issues a SAL_CLEAR_STATE_INFO. This will get
>complicated if we need to clear error records on all pCPUs in response
>to a SAL_CLEAR_STATE_INFO on dom0 vCPU0.
>
In our new design, Xen issues SAL_CLEAR_STATE_INFO in step with the
SAL_CLEAR_STATE_INFO that dom0 issues.
>   Do you plan to support CMC and CPE throttling in Xen (i.e. switching
>between interrupt-driven and polling handlers under load) and dynamic
>polling intervals?
>
Yes, our design supports CMC and CPE throttling in Xen, as well as
dynamic polling intervals. We think that Xen must not go down or slow
down under a flood of CMC and CPE interruptions.
>   It may be overly complicated to support CPEI on dom0 (fake MADT
>entries, trapping IOSAPIC writes, maybe an entirely virtual IOSAPIC in
>order to describe a valid GSI for the CPEI, etc.). Probably best to
>start out with just letting dom0 poll for CPE records. Thanks,
>
Thanks for your advice. We are not very familiar with the MADT and
IOSAPIC, so we would welcome advice from you and everyone.
Does your suggestion mean modifying the dom0 Linux kernel (mca.c)? If so,
we will modify the dom0 Linux kernel so that CPE is supported in polling
mode only.
BTW, a new member, Kaz, has joined our team.
> Alex
>
>--
>Alex Williamson                             HP Open Source & Linux Org.
Best regards,
Yutaka Ezaki (You)
Kazuhiro Suzuki (Kaz)
Masaki Kanno (Kan)
|
|
| MCA handler support for Xen/ia64 |

|
2006-07-28 14:31:40 |
On Fri, 2006-07-28 at 21:12 +0900, Masaki Kanno wrote:
> Hi Alex,
>
> We are awfully sorry to have kept you waiting for a long time.
Hi Kan,
No problem, thanks for your thorough investigation into this.
> Our design is to inject all CMCs/CPEs into dom0 vCPU0. I think this is
> sufficient because the goal of this initial support is logging of
> hardware errors, not recovery. See the detailed flow below.
This looks good to me. Queuing and tracking the interrupts could get
complicated, but I can't think of a better way to do it without going
back to the previous design of storing the error records in Xen. Also
note that not all platforms support the CPE interrupt, so you may need to
invent a slightly different flow for that case. I would assume in this
case that Xen would poll each CPU with SAL_GET_STATE_INFO. If it gets
back an error log it adds that pCPU to a queue, and the next time dom0
calls SAL_GET_STATE_INFO it gets directed to the correct pCPU to re-read
the error log and clear it (much like your existing interrupt model).
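
The polling variant can be sketched like this. It is only a model of the flow described above: cpe_probe_fn stands in for "issue SAL_GET_STATE_INFO on that pCPU and see whether a record came back", probe_from_bitmask() is a test double, and all names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for "issue SAL_GET_STATE_INFO on this pCPU and see whether a record
 * came back". A real probe would run on the pCPU itself; this is hypothetical. */
typedef bool (*cpe_probe_fn)(unsigned int pcpu, void *ctx);

/* Example probe for experiments: ctx points at a bitmask of pCPUs with records. */
static bool probe_from_bitmask(unsigned int pcpu, void *ctx)
{
    const uint64_t *mask = ctx;
    return pcpu < 64 && ((*mask >> pcpu) & 1u) != 0;
}

/* Poll every pCPU once and append those with a pending record to `queue`.
 * Returns the number of pCPUs queued (at most `cap`). */
static unsigned int cpe_poll_once(unsigned int nr_pcpus, cpe_probe_fn probe,
                                  void *ctx, unsigned int *queue,
                                  unsigned int cap)
{
    unsigned int n = 0;
    for (unsigned int cpu = 0; cpu < nr_pcpus && n < cap; cpu++)
        if (probe(cpu, ctx))
            queue[n++] = cpu;
    return n;
}
```

The record itself is deliberately not copied here; only the pCPU is remembered, so that dom0's later SAL_GET_STATE_INFO re-reads it from the right processor.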
> > What about clearing error records?
>
> In our new design, Xen issues SAL_CLEAR_STATE_INFO in step with the
> SAL_CLEAR_STATE_INFO that dom0 issues.
I like this approach better.
> > Do you plan to support CMC and CPE throttling in Xen
>
> Yes, our design supports CMC and CPE throttling in Xen, as well as
> dynamic polling intervals. We think that Xen must not go down or slow
> down under a flood of CMC and CPE interruptions.
Great!
> > It may be overly complicated to support CPEI on dom0
>
> Thanks for your advice. We are not very familiar with the MADT and
> IOSAPIC, so we would welcome advice from you and everyone.
> Does your suggestion mean modifying the dom0 Linux kernel (mca.c)? If so,
> we will modify the dom0 Linux kernel so that CPE is supported in polling
> mode only.
I would start out with the easier case of letting dom0 poll for CPE
records. This should require no change to the dom0 MCA code; just make
sure a CPEI vector is not reported to dom0 via the ACPI MADT.
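
One way to keep a CPEI vector out of dom0's view of the MADT is sketched below, with several caveats. The layout constants (entry type 8 for a platform interrupt source, interrupt type 3 for CPEI at byte offset 4) reflect my reading of the ACPI specification and should be checked against it. MADT_TYPE_HIDDEN is a hypothetical value that dom0 is assumed to skip as unknown, madt_hide_cpei() is not an existing Xen function, and the table checksum would need to be recomputed afterwards.

```c
#include <stddef.h>
#include <stdint.h>

#define MADT_TYPE_PLATFORM_INT_SRC 8    /* per my reading of the ACPI MADT layout */
#define PLAT_INT_TYPE_CPEI         3    /* corrected platform error interrupt     */
#define MADT_TYPE_HIDDEN           0x80 /* hypothetical: assumed ignored by dom0  */

/* Walk the MADT sub-table area (a sequence of {type, length, ...} entries) in
 * a copy of the table handed to dom0 and retag CPEI entries so dom0 skips them.
 * Returns the number of entries hidden. The caller must fix the checksum. */
static unsigned int madt_hide_cpei(uint8_t *entries, size_t len)
{
    unsigned int hidden = 0;
    size_t off = 0;

    while (off + 2 <= len) {
        uint8_t type = entries[off];
        uint8_t elen = entries[off + 1];

        if (elen < 2 || off + elen > len)
            break; /* malformed entry: stop rather than read out of bounds */

        if (type == MADT_TYPE_PLATFORM_INT_SRC && elen >= 5 &&
            entries[off + 4] == PLAT_INT_TYPE_CPEI) {
            entries[off] = MADT_TYPE_HIDDEN;
            hidden++;
        }
        off += elen;
    }
    return hidden;
}
```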
We can then later investigate optimizations to make this more
efficient. If we do something like a virtual IOSAPIC to deliver the CPE
interrupt, there shouldn't be any changes necessary to the dom0 MCA
code. We just need to see how hard this would be (it may be easy).
> BTW, a new member, Kaz, has joined our team.
Welcome Kaz! Thanks,
Alex
--
Alex Williamson                             HP Open Source & Linux Org.
|
|
| MCA handler support for Xen/ia64 |

|
2006-07-31 08:30:07 |
Hi Alex,
Thanks for your comments and information.
We will start writing the patch.
Best regards,
You, Kaz, and Kan
|
|
|
|