All,
I'm happy to announce that CAM is now MPSAFE, thanks to the
help of many
people and sponsorship by Yahoo! The work is in FreeBSD CVS
now and can
be obtained by checking out the HEAD/7-CURRENT branch. It
will be part
of the upcoming FreeBSD 7.0 release this year. Only the AHC
and AHD
drivers are MPSAFE at the moment, but hopefully more will
follow in the
coming months. Below is a document describing the locking
approach, and
instructions for locking CAM/SIM drivers that are not yet
MPSAFE.
Locking theory
--------------
The following describes the basics of the locking strategy in CAM
itself and how that applies to the SIM drivers (SCSI hardware drivers)
underneath it. While CAM is MPSAFE, only a few SIMs have been made
MPSAFE so far. The rest are mostly unchanged and are allowed to
continue to operate just as they did before. I hope that other
developers and interested users will step in and help make these
drivers MPSAFE, as it's too much work for me alone.
Being MPSAFE doesn't necessarily make the CAM subsystem
itself faster.
The locking is still fairly monolithic on a per SIM instance
level, and
there isn't much parallelism for operations within each
instance.
Multiple SIM instances, i.e. multiple buses, do operate
almost
completely independently of each other now, so there is full
parallelism
there. However, being MPSAFE does eliminate contention with
the other
parts of the OS that are still under Giant, and this is
still a huge
win. Testing moderate to heavy loads on multi-core systems
has shown
a significant decrease in contention on the Giant lock,
while showing
only minimal new contention on the CAM locks. This lowered
contention
translates into less system time wasted by the CPUs, and
thus more
cycles for useful work as well as less latency.
There are now 4 basic locks in CAM, 3 of which are:
xpt_lock      - Protects the XPT softc, periph, and SIM instances
xpt_topo_lock - Protects the global peripheral and bus lists
cam_simq_lock - Protects the list of SIMs to be processed in the camisr
These 3 locks are internal to the CAM core and have little bearing on
the operation of SIMs. None of these locks will be held when calling
into a SIM, and the SIM has no need to access them either.
The 4th lock is the SIM lock. This is a non-recursive sleep
mutex
(MTX_DEF) that the SIM instance uses to protect its internal
data
structures and operations. It is also exported up to CAM
when calling
cam_sim_alloc(), and is used by CAM to protect target,
device, and
peripheral objects, as well as SIM and device queues. Every
entry from
CAM into the SIM will be done with this lock held. The SIM
is welcome
to unlock it when it needs, but it must be held when calling
back into
most CAM functions. It is the primary lock for normal I/O
flow
throughout CAM starting at the top of the stack in the
periph driver.
The flow looks like this:

periph_strategy      sim->mtx
      |                 |
xpt_schedule            |
      |                 |
periph_start            |
      |                 |
xpt_action              |
      |                 |
sim_action              +

On completion:

sim_isr              sim->mtx
      |                 |
xpt_done                | cam_simq_lock
      |                 |
swi_sched               +

camisr               cam_simq_lock
      |
camisr_runqueue      sim->mtx
      |                 |
periph_done             +
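
To make the completion half of the diagram concrete, here is a small userland model of it. This is only a sketch of my reading of the flow above, not the kernel code: every model_* name is hypothetical, the "mutex" is a single-threaded ownership flag rather than a real lock, and the idea that xpt_done takes cam_simq_lock only to put the SIM on the list for camisr is an assumption drawn from the lock descriptions above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical single-threaded stand-in for a mutex; it only tracks ownership. */
struct model_mtx { bool held; };

static void model_lock(struct model_mtx *m)   { assert(!m->held); m->held = true; }
static void model_unlock(struct model_mtx *m) { assert(m->held);  m->held = false; }

#define MODEL_MAX_SIMS 8

struct model_sim {
    struct model_mtx mtx;  /* models sim->mtx */
    int pending;           /* completed CCBs not yet delivered (under mtx) */
    int delivered;         /* CCBs handed to periph_done (under mtx) */
    bool on_simq;          /* already queued for camisr (under mtx) */
};

static struct model_mtx model_simq_lock;              /* models cam_simq_lock */
static struct model_sim *model_simq[MODEL_MAX_SIMS];  /* SIMs awaiting camisr */
static size_t model_simq_len;

/* Models xpt_done(): the caller must already hold the SIM lock. */
static void model_xpt_done(struct model_sim *sim)
{
    assert(sim->mtx.held);
    sim->pending++;
    if (!sim->on_simq) {
        /* Assumption: the global lock covers only the list of SIMs. */
        model_lock(&model_simq_lock);
        assert(model_simq_len < MODEL_MAX_SIMS);
        model_simq[model_simq_len++] = sim;
        model_unlock(&model_simq_lock);
        sim->on_simq = true;
    }
}

/* Models sim_isr(): the interrupt path runs under the SIM lock. */
static void model_sim_isr(struct model_sim *sim)
{
    model_lock(&sim->mtx);
    model_xpt_done(sim);
    model_unlock(&sim->mtx);
}

/* Models camisr() followed by camisr_runqueue() for each queued SIM. */
static void model_camisr(void)
{
    struct model_sim *batch[MODEL_MAX_SIMS];
    size_t i, n;

    /* Take the list of SIMs under the global lock, then drop it. */
    model_lock(&model_simq_lock);
    n = model_simq_len;
    for (i = 0; i < n; i++)
        batch[i] = model_simq[i];
    model_simq_len = 0;
    model_unlock(&model_simq_lock);

    /* Deliver completions one SIM at a time, each under its own lock. */
    for (i = 0; i < n; i++) {
        struct model_sim *sim = batch[i];

        model_lock(&sim->mtx);
        sim->on_simq = false;
        sim->delivered += sim->pending;  /* periph_done would run per CCB */
        sim->pending = 0;
        model_unlock(&sim->mtx);
    }
}
```

The point of the model is that the global lock is never held while a SIM lock is being taken for delivery, which matches the statement that the core locks are not held when calling into a SIM.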
A SIM that is not MPSAFE exports the Giant mutex (&Giant) in
cam_sim_alloc(). Giant is then treated as a normal mutex by CAM and
is locked and unlocked in the same places as for MPSAFE SIMs. This
does not put all of CAM back under Giant; multiple SIM instances can
be registered, some MPSAFE and some not, and CAM will treat the
locking of each instance separately.
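
A toy illustration of the point about Giant, again as a userland model and not kernel code: all model_* names are hypothetical and the "mutex" is just a flag with a counter. It shows only that the core treats whatever lock was registered the same way, so SIMs that registered the stand-in for Giant serialize against each other while an MPSAFE SIM uses its own lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a mutex: ownership flag plus a use counter. */
struct model_mtx {
    bool held;
    int acquisitions;
};

static struct model_mtx model_giant;  /* stands in for Giant */

struct model_sim {
    struct model_mtx *mtx;  /* lock registered when the SIM was allocated */
    int actions;            /* state protected by *mtx */
};

/* Models the registration step: the SIM hands its lock to the core. */
static void model_sim_alloc(struct model_sim *sim, struct model_mtx *mtx)
{
    sim->mtx = mtx;
    sim->actions = 0;
}

/* Models an entry into the SIM: same code path whatever lock was registered. */
static void model_enter_sim(struct model_sim *sim)
{
    assert(!sim->mtx->held);
    sim->mtx->held = true;
    sim->mtx->acquisitions++;
    sim->actions++;  /* sim_action would run here */
    sim->mtx->held = false;
}

/* True when two SIMs would contend on the same lock. */
static bool model_sims_contend(const struct model_sim *a,
                               const struct model_sim *b)
{
    return a->mtx == b->mtx;
}
```

In this model, two legacy SIMs registered with model_giant contend with each other, while a SIM registered with its own lock contends with neither.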
Driver changes
--------------
For non-MPSAFE drivers, a single change was made to the API
in the
cam_sim_alloc() function. The function now looks like
this:
struct cam_sim * cam_sim_alloc(sim_action_func sim_action,
                               sim_poll_func sim_poll,
                               const char *sim_name,
                               void *softc,
                               u_int32_t unit,
                               struct mtx *mtx,
                               int max_dev_transactions,
                               int max_tagged_dev_transactions,
                               struct cam_devq *queue);
For the "mtx" argument, "&Giant" is used. Everything else in the
SIM stays the same. Some structures have also changed size, most
notably "cam_sim", but that is not an issue since source-level
compatibility is already affected.
MPSAFE drivers must do the following things:
1. Provide a pointer to a MTX_DEF mutex in cam_sim_alloc().
The mutex
must be allocated and initialized before calling
cam_sim_alloc(), and
must not be destroyed until after calling cam_sim_free().
It should not
be held while calling cam_sim_alloc().
2. The timeout_ch field in the ccb_hdr structure is no
longer available
for use by the SIM. SIMs must now allocate, initialize, and
manage
their own callout structures. All uses of the timeout() API
must be
switched to the callout() API. See the callout manpage for
details on
this.
3. Add the INTR_MPSAFE flag to bus_setup_intr(). This will
prevent
Giant from being automatically acquired before the driver
interrupt
handler is called.
4. Any busdma tags that allow load deferrals (i.e. return
EINPROGRESS)
must register a non-Giant mutex in bus_dma_tag_create().
This field is
not inherited from parent tags.
5. If the driver registers a character device with
make_dev(), the
D_NEEDSGIANT flag should be dropped, and appropriate locking
added to
the device entry vectors.
6. If the driver registers any sysctls, all locks must be
dropped and
Giant must be held explicitly when registering and
deregistering the
sysctl nodes. Sysctl handlers will be called with Giant
held, and
appropriate locking should be added under that. No calls
into CAM
should be made from these contexts.
7. Provide appropriate locking in the interrupt handler as
well as any
taskqueue handlers, callout handlers, kthreads, or other
detached
contexts, as appropriate.
8. Ensure that the registered SIM mutex is held when
calling all CAM
entry points. Until recently, the xpt_done() entry point
provided its
own locking and did not require Giant to be held. It still
does not
require Giant, but it does require the SIM lock to be held
when calling
it.
9. Do not hold the SIM mutex or any other mutex when calling
malloc(M_WAITOK), bus_dmamem_alloc(), or bus_dmamap_create().
10. Any uses of tsleep must be changed to msleep.
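
The ordering rules in items 1 and 9 are easy to get wrong, so here is a small userland checker that encodes them. It is a sketch only: the model_* functions are hypothetical stand-ins named after the kernel calls they represent, they take a simplified argument list, and they return 0 when a call is allowed and -1 when it breaks one of the rules listed above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the SIM mutex; tracks only what the rules need. */
struct model_mtx {
    bool initialized;  /* between the init and destroy steps */
    bool held;         /* currently locked */
    int registered;    /* SIMs currently registered with this lock */
};

static void model_mtx_init(struct model_mtx *m)   { m->initialized = true; }
static void model_mtx_lock(struct model_mtx *m)   { m->held = true; }
static void model_mtx_unlock(struct model_mtx *m) { m->held = false; }

/* Item 1: the lock must be initialized, and not held, when registering. */
static int model_cam_sim_alloc(struct model_mtx *m)
{
    if (!m->initialized || m->held)
        return -1;
    m->registered++;
    return 0;
}

/* Releases one registration made by model_cam_sim_alloc(). */
static int model_cam_sim_free(struct model_mtx *m)
{
    if (m->registered == 0)
        return -1;
    m->registered--;
    return 0;
}

/* Item 1: the lock must outlive every SIM that registered it. */
static int model_mtx_destroy(struct model_mtx *m)
{
    if (!m->initialized || m->held || m->registered > 0)
        return -1;
    m->initialized = false;
    return 0;
}

/* Item 9: allocations that may wait are not allowed with the lock held. */
static int model_malloc_waitok(const struct model_mtx *m)
{
    return m->held ? -1 : 0;
}
```

A correct attach/detach sequence in this model is init, alloc, (normal operation), free, destroy; every other ordering of those four calls is rejected.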
For multi-function PCI devices where each function
represents a bus, a
separate SIM and SIM mutex should be allocated and managed
for each
function. Functions that register multiple SIMs should
coordinate
locking between those SIMs as needed; the same lock can be
registered
for these separate SIMs, at the cost of reduced parallelism
between
SIMs. Functions that register a single SIM for multiple
buses will have
all of those buses under a single mutex as far as CAM is
concerned.
The simplest strategy is to use a single lock per SIM
instance. More
complex multi-level or pipelined locking is allowed; the
registered SIM
lock can be dropped by the SIM at any point without
disrupting the rest
of CAM, so long as no CAM entry points are called with it
unlocked.
This will be an area for further research.
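
As a sketch of the "drop the lock, but never call into CAM unlocked" rule, the userland model below contrasts a correct action routine with a buggy one. All model_* names are hypothetical, the lock is a plain flag, and returning to CAM with the lock held again is my assumption rather than something stated above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the registered SIM mutex. */
struct model_mtx { bool held; };

struct model_sim {
    struct model_mtx mtx;
    int completed;   /* completions accepted by the CAM entry point */
    int violations;  /* CAM entry points called without the SIM lock */
};

/* Models a CAM entry point such as xpt_done(): requires the SIM lock. */
static void model_xpt_done(struct model_sim *sim)
{
    if (!sim->mtx.held) {
        sim->violations++;
        return;
    }
    sim->completed++;
}

/* Correct pattern: drop the lock for private work, retake it for CAM. */
static void model_sim_action(struct model_sim *sim)
{
    assert(sim->mtx.held);   /* CAM enters the SIM with the lock held */
    sim->mtx.held = false;   /* work under the SIM's own finer locks */
    sim->mtx.held = true;    /* reacquire before calling back into CAM */
    model_xpt_done(sim);
}

/* Buggy pattern: calls back into CAM while the lock is still dropped. */
static void model_sim_action_buggy(struct model_sim *sim)
{
    assert(sim->mtx.held);
    sim->mtx.held = false;
    model_xpt_done(sim);     /* violation: SIM lock not held */
    sim->mtx.held = true;    /* assumption: CAM expects it held on return */
}
```

In a real driver the equivalent check would be a lock assertion at each CAM entry point; the counter here just makes the violation visible in a test.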
Userland changes
----------------
Efforts were made to keep the userland API and ABI
unchanged. Thus,
there are no source level changes needed for any tools,
libraries, or
apps, nor any need to recompile any of these either.
Future work
-----------
The CAM API will likely undergo some more small changes to
support
future work with newbus integration and SAS/SATA/FC
transport
modularization. These changes will hopefully be done before
FreeBSD 7.0
is released.