Thread: Re: MI SONIC Ethernet driver for mac68k




Re: MI SONIC Ethernet driver for mac68k
user name
2007-06-05 09:15:43
haukeEspresso.Rhein-Neckar.DE wrote:

> >> Do you have a performance comparison for the old vs. the MI one?
> >
> >Unfortunately, MI one is slower (currently).
> 
> Can you time the transfers from the other (I assume, non-mac68k) machine
> for comparison?

The other side is NetBSD/i386 (Athlon64) connected via re(4)
and a Gig switch.

I've tried similar tests with more recent (today's) sources
with my esp(4) fix. The MI one now gets a somewhat better result
than before, though it's still slower than the old MD one on TX:
---

with old MD driver:

on mac68k side:
---
 :
root file system type: ffs
Enter pathname of shell or RETURN for /bin/sh: 
We recommend creating a non-root account and using su(1) for
root access.
No entry for terminal type "dumb";
using dumb terminal settings.
# mount -a -t nonfs
# ifconfig sn0 192.168.20.35
# dmesg|grep sn0
sn0 at obio0: integrated Ethernet adapter
sn0: Ethernet address 08:00:07:9f:07:c6
# ./ttcp -rs
ttcp-r: buflen=8192, nbuf=2048, align=16384/0, port=5001  tcp
ttcp-r: socket
ttcp-r: accept from 192.168.20.1
ttcp-r: 16777216 bytes in 19.33 real seconds = 847.75 KB/sec +++
ttcp-r: 2049 I/O calls, msec/call = 9.66, calls/sec = 106.02
ttcp-r: 0.0user 19.2sys 0:19real 99% 0i+0d 0maxrss 0+2pf 0+0csw
# ./ttcp -ts 192.168.20.1
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5001  tcp  -> 192.168.20.1
ttcp-t: socket
ttcp-t: connect
ttcp-t: 16777216 bytes in 15.93 real seconds = 1028.54 KB/sec +++
ttcp-t: 2048 I/O calls, msec/call = 7.96, calls/sec = 128.57
ttcp-t: 0.1user 15.4sys 0:15real 97% 0i+0d 0maxrss 0+4098pf 0+0csw
# 
---

on i386 side:
---
% dmesg|grep cpu0
cpu0 at mainbus0 apid 0: (boot processor)
cpu0: AMD Athlon 64 or Sempron (686-class), 2210.86 MHz, id 0x40ff2
 :
cpu0: "AMD Athlon(tm) 64 Processor 3500+"
 :
% uname -mrs
NetBSD 4.99.20 i386
% ttcp -ts 192.168.20.35
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5001  tcp  -> 192.168.20.35
ttcp-t: socket
ttcp-t: connect
ttcp-t: 16777216 bytes in 19.36 real seconds = 846.22 KB/sec +++
ttcp-t: 2048 I/O calls, msec/call = 9.68, calls/sec = 105.78
ttcp-t: -1.9user 0.0sys 0:19real 0% 0i+0d 0maxrss 0+4098pf 0+0csw
% ttcp -rs
ttcp-r: buflen=8192, nbuf=2048, align=16384/0, port=5001  tcp
ttcp-r: socket
ttcp-r: accept from 192.168.20.35
ttcp-r: 16777216 bytes in 15.97 real seconds = 1025.70 KB/sec +++
ttcp-r: 11586 I/O calls, msec/call = 1.41, calls/sec = 725.33
ttcp-r: 0.0user 0.0sys 0:15real 0% 0i+0d 0maxrss 0+2pf 0+0csw
% 
---


with MI driver:

on mac68k side:
---
 :
using dumb terminal settings.
# mount -a -t nonfs
# ifconfig sn0 192.168.20.35
# dmesg|grep sn0
sn0 at obio0: integrated SONIC Ethernet adapter
sn0: Ethernet address 08:00:07:9f:07:c6
# ./ttcp -rs
ttcp-r: buflen=8192, nbuf=2048, align=16384/0, port=5001  tcp
ttcp-r: socket
ttcp-r: accept from 192.168.20.1
ttcp-r: 16777216 bytes in 19.14 real seconds = 855.99 KB/sec +++
ttcp-r: 2049 I/O calls, msec/call = 9.57, calls/sec = 107.05
ttcp-r: 0.0user 19.0sys 0:19real 99% 0i+0d 0maxrss 0+2pf 0+0csw
# ./ttcp -ts 192.168.20.1
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5001  tcp  -> 192.168.20.1
ttcp-t: socket
ttcp-t: connect
ttcp-t: 16777216 bytes in 20.61 real seconds = 794.98 KB/sec +++
ttcp-t: 2048 I/O calls, msec/call = 10.30, calls/sec = 99.37
ttcp-t: 0.1user 20.4sys 0:20real 99% 0i+0d 0maxrss 0+4098pf 0+0csw
# 
---

on i386 side:
---
% ttcp -ts 192.168.20.35
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5001  tcp  -> 192.168.20.35
ttcp-t: socket
ttcp-t: connect
ttcp-t: 16777216 bytes in 19.18 real seconds = 854.25 KB/sec +++
ttcp-t: 2048 I/O calls, msec/call = 9.59, calls/sec = 106.78
ttcp-t: -1.9user 0.0sys 0:19real 0% 0i+0d 0maxrss 0+4098pf 0+0csw
% ttcp -rs
ttcp-r: buflen=8192, nbuf=2048, align=16384/0, port=5001  tcp
ttcp-r: socket
ttcp-r: accept from 192.168.20.35
ttcp-r: 16777216 bytes in 20.65 real seconds = 793.25 KB/sec +++
ttcp-r: 12273 I/O calls, msec/call = 1.72, calls/sec = 594.21
ttcp-r: 0.0user 0.0sys 0:20real 0% 0i+0d 0maxrss 0+2pf 0+0csw
% 
---

Summary:
	TX on sn0	RX on sn0
MD:	1026KB/s	 846KB/s
MI:  	 793KB/s	 854KB/s
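
To put the summary figures in relative terms, here is a minimal sketch in C. The helper names percent_change() and kb_per_sec() are made up for illustration and are not part of ttcp or the driver. The second helper assumes ttcp's "KB/sec" means 1024 bytes per KB, which appears consistent with the logs above (16777216 bytes in 19.33 seconds comes out close to the reported 847.75 KB/sec).

```c
#include <assert.h>

/*
 * Relative change of `measured` against `baseline`, in percent.
 * Negative means `measured` is slower than `baseline`.
 */
double
percent_change(double baseline, double measured)
{
	assert(baseline > 0.0);
	return (measured - baseline) / baseline * 100.0;
}

/*
 * Throughput in KB/s, ASSUMING 1 KB = 1024 bytes
 * (an assumption that matches the ttcp output only approximately).
 */
double
kb_per_sec(double bytes, double seconds)
{
	assert(seconds > 0.0);
	return bytes / 1024.0 / seconds;
}
```

With the table values, percent_change(1026.0, 793.0) is about -22.7 (MI TX is roughly 23% slower than MD), while percent_change(846.0, 854.0) is a little under +1 (RX is essentially unchanged).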

- RX looks mostly the same.
  Maybe I forgot to update <m68k/types.h>, so the MI dp83932.c might
  do extra copies due to the lack of __NO_STRICT_ALIGNMENT, and
  the bottleneck is in some upper layer?

- TX is still slower with the MI driver.
  Maybe the MI dp83932.c tries to set up too many DMA descriptors
  to send fragmented mbufs directly, and cache flush ops
  against such descriptors are more expensive than copying mbufs
  to an uncached contiguous buffer?
  (if so, adding BUS_DMA_COHERENT support may improve performance)
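
The two TX strategies contrasted in the second point can be sketched roughly as follows. This is a user-space illustration only: struct frag is a hypothetical stand-in for an mbuf chain (not the kernel's struct mbuf), and descriptors_needed() and flatten_chain() are invented names, not functions from dp83932.c or if_sn.c. No real DMA or cache operations are performed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one fragment of an mbuf chain. */
struct frag {
	const void *data;
	size_t len;
	const struct frag *next;
};

/*
 * Strategy A (direct DMA): each non-empty fragment needs its own
 * descriptor, and on cached memory each one also needs a cache sync.
 */
size_t
descriptors_needed(const struct frag *f)
{
	size_t n = 0;

	for (; f != NULL; f = f->next)
		if (f->len != 0)
			n++;
	return n;
}

/*
 * Strategy B (copy): gather the chain into one contiguous buffer,
 * so a single descriptor suffices. Returns the number of bytes
 * copied, or 0 if the chain does not fit in `bufsize` bytes.
 */
size_t
flatten_chain(const struct frag *f, void *buf, size_t bufsize)
{
	unsigned char *p = buf;
	size_t total = 0;

	assert(buf != NULL);
	for (; f != NULL; f = f->next) {
		if (f->len == 0)
			continue;
		if (f->len > bufsize - total)
			return 0;
		memcpy(p + total, f->data, f->len);
		total += f->len;
	}
	return total;
}
```

Which strategy wins depends on how the per-descriptor sync cost compares with the copy cost on the machine in question; the sketch only makes the descriptor count visible and does not measure either.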

> Since, as I understand, the MD driver does buffer-to-memory
> transfers by cpu, it may well lock out timer interrupts and lose clock
> ticks, possibly skewing your timing results.

Actually I see the esp(4) driver on mac68k has such a problem
(softclock seems to be blocked too much according to vmstat -i),
but the MD mac68k/dev/if_sn.c doesn't use splhigh() at all,
so I don't think it causes tick loss.
---
Izumi Tsutsui

Re: MI SONIC Ethernet driver for mac68k
user name
2007-06-06 11:54:57
I wrote:

> Summary:
> 	TX on sn0	RX on sn0
> MD:	1026KB/s	 846KB/s
> MI:  	 793KB/s	 854KB/s

more results:

MI driver with BUS_DMA_COHERENT support:
	TX on sn0	RX on sn0
	 842KB/s	 888KB/s

MI driver with BUS_DMA_COHERENT support and 16-byte TX DMA threshold:
	TX on sn0	RX on sn0
	 903KB/s	 886KB/s
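
For readers skimming the attached patch, the BUS_DMA_COHERENT bookkeeping it adds can be modelled roughly like this. It is a simplified user-space sketch, not the kernel code: struct seg, struct dmamap, SEG_COHERENT, map_update_coherent() and segs_to_sync() are hypothetical names, the offset/len handling of the real sync routine is ignored, and no cache operations are performed.

```c
#include <assert.h>
#include <stddef.h>

#define SEG_COHERENT	0x1u	/* hypothetical stand-in for BUS_DMA_COHERENT */

struct seg {
	size_t len;
	unsigned int flags;	/* SEG_COHERENT if the page is cache-inhibited */
};

struct dmamap {
	struct seg *segs;
	int nsegs;
	unsigned int flags;	/* SEG_COHERENT only if ALL segments are uncached */
};

/* Recompute the map-wide flag: one cacheable segment clears it. */
void
map_update_coherent(struct dmamap *m)
{
	unsigned int coherent = SEG_COHERENT;
	int i;

	assert(m != NULL);
	for (i = 0; i < m->nsegs; i++)
		if ((m->segs[i].flags & SEG_COHERENT) == 0)
			coherent = 0;
	m->flags = (m->flags & ~SEG_COHERENT) | coherent;
}

/* Count the segments a sync would still have to flush or purge. */
int
segs_to_sync(const struct dmamap *m)
{
	int i, n = 0;

	assert(m != NULL);
	/* Whole map uncached: nothing to do. */
	if ((m->flags & SEG_COHERENT) != 0)
		return 0;
	/* Otherwise skip only the cache-inhibited segments. */
	for (i = 0; i < m->nsegs; i++)
		if ((m->segs[i].flags & SEG_COHERENT) == 0)
			n++;
	return n;
}
```

The two-level flag means a fully uncached map returns from the sync immediately, while a mixed map still avoids cache operations on its cache-inhibited segments.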

---
Izumi Tsutsui


Index: arch/m68k/include/bus_dma.h
===================================================================
RCS file: /cvsroot/src/sys/arch/m68k/include/bus_dma.h,v
retrieving revision 1.8
diff -u -r1.8 bus_dma.h
--- arch/m68k/include/bus_dma.h	4 Mar 2007 06:00:04 -0000	1.8
+++ arch/m68k/include/bus_dma.h	6 Jun 2007 16:48:19 -0000
@@ -119,6 +119,7 @@
 struct m68k_bus_dma_segment {
 	bus_addr_t	ds_addr;	/* DMA address */
 	bus_size_t	ds_len;		/* length of transfer */
+	u_int		_ds_flags;	/* MD flags */
 };
 typedef struct m68k_bus_dma_segment	bus_dma_segment_t;
 
@@ -215,7 +216,7 @@
 	int		_dm_segcnt;	/* number of segs this map can map */
 	bus_size_t	_dm_maxmaxsegsz; /* fixed largest possible segment */
 	bus_size_t	_dm_boundary;	/* don't cross this */
-	int		_dm_flags;	/* misc. flags */
+	u_int		_dm_flags;	/* misc. flags */
 
 	/* Machine dependant fields: */
 	bus_size_t  dm_xfer_len;	/* length of successful transfer */
Index: arch/m68k/include/pmap_motorola.h
===================================================================
RCS file: /cvsroot/src/sys/arch/m68k/include/pmap_motorola.h,v
retrieving revision 1.13
diff -u -r1.13 pmap_motorola.h
--- arch/m68k/include/pmap_motorola.h	12 May 2007 17:43:53 -0000	1.13
+++ arch/m68k/include/pmap_motorola.h	6 Jun 2007 16:48:19 -0000
@@ -202,10 +202,8 @@
 #define	PMAP_PREFER(foff, vap, sz, td)	pmap_prefer((foff), (vap))
 #endif
 
-#ifdef mvme68k
 void	_pmap_set_page_cacheable(struct pmap *, vaddr_t);
 void	_pmap_set_page_cacheinhibit(struct pmap *, vaddr_t);
 int	_pmap_page_is_cacheable(struct pmap *, vaddr_t);
-#endif
 
 #endif /* !_M68K_PMAP_MOTOROLA_H_ */
Index: arch/m68k/m68k/bus_dma.c
===================================================================
RCS file: /cvsroot/src/sys/arch/m68k/m68k/bus_dma.c,v
retrieving revision 1.23
diff -u -r1.23 bus_dma.c
--- arch/m68k/m68k/bus_dma.c	2 Jun 2007 11:13:45 -0000	1.23
+++ arch/m68k/m68k/bus_dma.c	6 Jun 2007 16:48:19 -0000
@@ -141,23 +141,30 @@
 	bus_size_t sgsize;
 	bus_addr_t curaddr, lastaddr, baddr, bmask;
 	vaddr_t vaddr = (vaddr_t)buf;
-	int seg;
+	int seg, cacheable, coherent;
+	pmap_t pmap;
 	bool rv;
 
+	coherent = BUS_DMA_COHERENT;
 	lastaddr = *lastaddrp;
 	bmask = ~(map->_dm_boundary - 1);
+	if (!VMSPACE_IS_KERNEL_P(vm))
+		pmap = vm_map_pmap(&vm->vm_map);
+	else
+		pmap = pmap_kernel();
 
 	for (seg = *segp; buflen > 0 ; ) {
 		/*
 		 * Get the physical address for this segment.
 		 */
-		if (!VMSPACE_IS_KERNEL_P(vm))
-			rv = pmap_extract(vm_map_pmap(&vm->vm_map),
-			    vaddr, &curaddr);
-		else
-			rv = pmap_extract(pmap_kernel(), vaddr, &curaddr);
+		rv = pmap_extract(pmap, vaddr, &curaddr);
 		KASSERT(rv);
 
+		cacheable = _pmap_page_is_cacheable(pmap, vaddr);
+
+		if (cacheable)
+			coherent = 0;
+
 		/*
 		 * Compute the segment size, and adjust counts.
 		 */
@@ -181,6 +188,8 @@
 		if (first) {
 			map->dm_segs[seg].ds_addr = curaddr;
 			map->dm_segs[seg].ds_len = sgsize;
+			map->dm_segs[seg]._ds_flags =
+			    cacheable ? 0 : BUS_DMA_COHERENT;
 			first = 0;
 		} else {
 			if (curaddr == lastaddr &&
@@ -195,6 +204,8 @@
 					break;
 				map->dm_segs[seg].ds_addr = curaddr;
 				map->dm_segs[seg].ds_len = sgsize;
+				map->dm_segs[seg]._ds_flags =
+				    cacheable ? 0 : BUS_DMA_COHERENT;
 			}
 		}
 
@@ -205,6 +216,9 @@
 
 	*segp = seg;
 	*lastaddrp = lastaddr;
+	map->_dm_flags &= ~BUS_DMA_COHERENT;
+	/* BUS_DMA_COHERENT is set only if all segments are uncached */
+	map->_dm_flags |= coherent;
 
 	/*
 	 * Did we fit?
@@ -408,6 +422,7 @@
 	map->dm_maxsegsz = map->_dm_maxmaxsegsz;
 	map->dm_mapsize = 0;
 	map->dm_nsegs = 0;
+	map->_dm_flags &= ~BUS_DMA_COHERENT;
 }
 
 /*
@@ -426,6 +441,7 @@
 #if defined(M68040) || defined(M68060)
 	bus_addr_t p, e, ps, pe;
 	bus_size_t seglen;
+	bus_dma_segment_t *seg;
 	int i;
 #endif
 
@@ -438,6 +454,10 @@
 #endif
 
 #if defined(M68040) || defined(M68060)
+	/* If the whole DMA map is uncached, do nothing. */
+	if ((map->_dm_flags & BUS_DMA_COHERENT) != 0)
+		return;
+
 	/* Short-circuit for unsupported `ops' */
 	if ((ops & (BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE)) == 0)
 		return;
@@ -446,9 +466,10 @@
 	 * flush/purge the cache.
 	 */
 	for (i = 0; i < map->dm_nsegs && len != 0; i++) {
-		if (map->dm_segs[i].ds_len <= offset) {
+		seg = &map->dm_segs[i];
+		if (seg->ds_len <= offset) {
 			/* Segment irrelevant - before requested offset */
-			offset -= map->dm_segs[i].ds_len;
+			offset -= seg->ds_len;
 			continue;
 		}
 
@@ -457,11 +478,15 @@
 		 * each segment until we have exhausted the
 		 * length.
 		 */
-		seglen = map->dm_segs[i].ds_len - offset;
+		seglen = seg->ds_len - offset;
 		if (seglen > len)
 			seglen = len;
 
-		ps = map->dm_segs[i].ds_addr + offset;
+		/* Ignore cache-inhibited segments */
+		if ((seg->_ds_flags & BUS_DMA_COHERENT) != 0)
+			continue;
+
+		ps = seg->ds_addr + offset;
 		pe = ps + seglen;
 
 		if (ops & BUS_DMASYNC_PREWRITE) {
@@ -655,10 +680,20 @@
 			pmap_enter(pmap_kernel(), va, addr,
 			    VM_PROT_READ | VM_PROT_WRITE,
 			    VM_PROT_READ | VM_PROT_WRITE | PMAP_WIRED);
+
+			/* Cache-inhibit the page if necessary */
+			if ((flags & BUS_DMA_COHERENT) != 0)
+				_pmap_set_page_cacheinhibit(pmap_kernel(), va);
+
+			segs[curseg]._ds_flags &= ~BUS_DMA_COHERENT;
+			segs[curseg]._ds_flags |= (flags & BUS_DMA_COHERENT);
 		}
 	}
 	pmap_update(pmap_kernel());
 
+	if ((flags & BUS_DMA_COHERENT) != 0)
+		TBIAS();
+
 	return 0;
 }
 
@@ -669,6 +704,8 @@
 void
 _bus_dmamem_unmap(bus_dma_tag_t t, void *kva, size_t size)
 {
+	vaddr_t va;
+	size_t s;
 
 #ifdef DIAGNOSTIC
 	if ((u_long)kva & PGOFSET)
@@ -677,6 +714,15 @@
 
 	size = round_page(size);
 
+	/*
+	 * Re-enable cacheing on the range
+	 * XXXSCW: There should be some way to indicate that the pages
+	 * were mapped DMA_MAP_COHERENT in the first place...
+	 */
+	for (s = 0, va = (vaddr_t)kva; s < size;
+	    s += PAGE_SIZE, va += PAGE_SIZE)
+		_pmap_set_page_cacheable(pmap_kernel(), va);
+
 	pmap_remove(pmap_kernel(), (vaddr_t)kva, (vaddr_t)kva + size);
 	pmap_update(pmap_kernel());
 	uvm_km_free(kernel_map, (vaddr_t)kva, size, UVM_KMF_VAONLY);
@@ -707,6 +753,10 @@
 			continue;
 		}
 
+		/*
+		 * XXXSCW: What about BUS_DMA_COHERENT ??
+		 */
+
 		return m68k_btop((char *)segs[i].ds_addr + off);
 	}
 
Index: arch/m68k/m68k/pmap_motorola.c
===================================================================
RCS file: /cvsroot/src/sys/arch/m68k/m68k/pmap_motorola.c,v
retrieving revision 1.30
diff -u -r1.30 pmap_motorola.c
--- arch/m68k/m68k/pmap_motorola.c	18 May 2007 01:46:40 -0000	1.30
+++ arch/m68k/m68k/pmap_motorola.c	6 Jun 2007 16:48:20 -0000
@@ -2848,8 +2848,6 @@
 	(void)cachectl1(0x80000004, va, len, p);
 }
 
-#ifdef mvme68k
-
 void
 _pmap_set_page_cacheable(pmap_t pmap, vaddr_t va)
 {
@@ -2905,8 +2903,6 @@
 	return (pmap_pte_ci(pmap_pte(pmap, va)) == 0) ? 1 : 0;
 }
 
-#endif /* mvme68k */
-
 #ifdef DEBUG
 /*
  * pmap_pvdump:
