List Info

Thread: flush icache on ia64 take2




flush icache on ia64 take2
user name
2007-07-05 21:29:01
This is a patch for fixing icache flush race in
ia64(Montecito) by implementing
flush_icache_page() at el.

Changelog:
 - updated against 2.6.22-rc7 (previous one was against
2.6.21)
 - removed hugetlbe's lazy_mmu_prot_update().
 - rewrote patch description.
 - removed patch against mprotect() if flushes cache.

Brief Description:
In current kernel, ia64 flushes executable page's icache by
lazy_mmu_prot_update() after set_pte(). But multi-threaded
programs can access
text page if pte says "a page is present". Then,
there is a race condition.
This patch guarantees cache is flushed before set_pte()
call.

Basic Information:
Montecito, new ia64 processor, has separated L2 I-cache and
L2 D-cache,
and I-cache and D-cache is not guaranteed to be consistent
in automatic way.

L1 cache is also separated but L1 D-cache is write-through.


Montecito has separated L2 cache and Mixed L3 cache.
And...L2 D-cache is
*write back*. (See ht
tp://download.intel.com/design/Itanium2/manuals/
30806501.pdf section 2.3.3)

What we found and my understanding:

In our environment, I found SIGILL occurs when...
 * A multi threaded program is executed. (a program is a HPC
program and uses
   automatic parallelizaiton.)
 * Above program is on NFS.
 * SIGILL happes only when a file is newly loaded into the
page-cache.
 * Where failure occurs is not fixed. SIGILL comes at random
instruction point.
 * This didn't happen before Montecito.

>From Itanium2 document, we can see....
 * Any invalidation of a cache-line will invalidate an
I-cache.
 * I-cache and D-cache is not coherent.
 * L1D-cache to L2 cache is *write-through*.
 * L2D-cache to L3-mixed cache is *write-back*.

And we don't see this problem before Montecito. Big
difference between old ones
and Montecito is 
 * old cpus has Mixed L2 cache.
 * Montecito has separated L2I-cache and L2D-cache.

Following is my understanding. I'd like to hear ia64
specialist's opinion.

Assume CPU(A) and CPU(B).

 1. CPU(A) causes a page fault in text and calls
do_no_page().
 2. CPU(B) executes NFS's ops and fill page with received
RPC result(from NFS).
    and do SetPageUptodate().
 3. CPU(A) continues a page fault operation and calls
set_pte().
 4. CPU(B)'s another thread executes a text in the page
which CPU(A) mapped.
 5-a. CPU(A) calls lazy_mmu_prot_update() and flush icache
(this is slow).
 5-b. CPU(B) continues execution and cause SIGILL by an
instruction which
      CPU(A) haven't flushed yet.

 In stage 2. , CPU(B) loads all text data into L2-Dcache and
L3-mixed cache. 
 And write them.
 But data which can be accessed by CPU(B)'s L2-Icache is in
L3-mixed cache.
 Then, If write back from L2-Dcache to L3-mixed cache is
delayed, L2-Icache
 of CPU(B) will fetch wrong data.
 Note: CPU(A) will fetch fetch correct instruction in above
case.

 Usual file systems uses DMA and it purges cache. But NFS
uses copy-by-cpu.

 Anyway, there is SIGILL problem in NFS/ia64 and icache
flush can fix
 SIGILL problem (in our HPC team test.)

 calling lazy_mmu_prot_update() before set_pte() can fix
this. But
 it seems strange way. Then, flush_icache_page() is
implemented.
 
 Note1: icache flush is called only when VM_EXEC flag is on
and 
        PG_arch_1 is not set.
 Note2: description in Devid Mosberger's "ia64 linux
kernel" pp204 says
       "linux taditionally maps the memory stack and
memory allocated by
	the brk() with executable permission turned on...."
 	But this is changed now. anon/stack is not mapped as
executable usually.
	It depens on READ_IMPLIES_EXEC personality. So checking
VM_EXEC is enough.

What this patch does:
 1. remove all lazy_mmu_prot_update()...which is used by
only ia64.
 2. implements flush_cache_page()/flush_icache_page() for
ia64.


Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyujp.fujitsu.com>

---
 arch/ia64/mm/init.c           |    7 +------
 include/asm-generic/pgtable.h |    4 ----
 include/asm-ia64/cacheflush.h |   24
++++++++++++++++++++++--
 include/asm-ia64/pgtable.h    |    9 ---------
 mm/fremap.c                   |    1 -
 mm/hugetlb.c                  |    5 +++--
 mm/memory.c                   |   13 ++++++-------
 mm/migrate.c                  |    6 +++++-
 mm/mprotect.c                 |    1 -
 mm/rmap.c                     |    1 -
 10 files changed, 37 insertions(+), 34 deletions(-)

Index: devel-2.6.22-rc7/include/asm-ia64/cacheflush.h
============================================================
=======
--- devel-2.6.22-rc7.orig/include/asm-ia64/cacheflush.h
+++ devel-2.6.22-rc7/include/asm-ia64/cacheflush.h
 -10,18
+10,38 
 
 #include <asm/bitops.h>
 #include <asm/page.h>
+#include <linux/mm.h>
 
 /*
  * Cache flushing routines.  This is the kind of stuff that
can be very expensive, so try
  * to avoid them whenever possible.
  */
+extern void __flush_icache_page_ia64(struct page *page);
 
 #define flush_cache_all()			do { } while (0)
 #define flush_cache_mm(mm)			do { } while (0)
 #define flush_cache_dup_mm(mm)			do { } while (0)
 #define flush_cache_range(vma, start, end)	do { } while
(0)
-#define flush_cache_page(vma, vmaddr, pfn)	do { } while
(0)
-#define flush_icache_page(vma,page)		do { } while (0)
+
+static inline void
+flush_cache_page(struct vm_area_struct *vma, unsigned long
vmaddr,
+		unsigned long pfn)
+{
+	if (vma->vm_flags & VM_EXEC) {
+		struct page *page;
+		if (!pfn_valid(pfn))
+			return;
+		page = pfn_to_page(pfn);
+		__flush_icache_page_ia64(page);
+	}
+}
+
+static inline void
+flush_icache_page(struct vm_area_struct *vma,struct page
*page) {
+	if (vma->vm_flags & VM_EXEC)
+		__flush_icache_page_ia64(page);
+}
+
 #define flush_cache_vmap(start, end)		do { } while (0)
 #define flush_cache_vunmap(start, end)		do { } while (0)
 
Index: devel-2.6.22-rc7/arch/ia64/mm/init.c
============================================================
=======
--- devel-2.6.22-rc7.orig/arch/ia64/mm/init.c
+++ devel-2.6.22-rc7/arch/ia64/mm/init.c
 -54,16
+54,11  struct page *zero_page_memmap_ptr;	/* ma
 EXPORT_SYMBOL(zero_page_memmap_ptr);
 
 void
-lazy_mmu_prot_update (pte_t pte)
+__flush_icache_page_ia64 (struct page *page)
 {
 	unsigned long addr;
-	struct page *page;
 	unsigned long order;
 
-	if (!pte_exec(pte))
-		return;				/* not an executable page... */
-
-	page = pte_page(pte);
 	addr = (unsigned long) page_address(page);
 
 	if (test_bit(PG_arch_1, &page->flags))
Index: devel-2.6.22-rc7/include/asm-ia64/pgtable.h
============================================================
=======
--- devel-2.6.22-rc7.orig/include/asm-ia64/pgtable.h
+++ devel-2.6.22-rc7/include/asm-ia64/pgtable.h
 -151,7
+151,6 
 
 #include <linux/sched.h>	/* for mm_struct */
 #include <asm/bitops.h>
-#include <asm/cacheflush.h>
 #include <asm/mmu_context.h>
 #include <asm/processor.h>
 
 -502,13
+501,6  extern struct page *zero_page_memmap_ptr
 #define HUGETLB_PGDIR_MASK	(~(HUGETLB_PGDIR_SIZE-1))
 #endif
 
-/*
- * IA-64 doesn't have any external MMU info: the page
tables contain all the necessary
- * information.  However, we use this routine to take care
of any (delayed) i-cache
- * flushing that may be necessary.
- */
-extern void lazy_mmu_prot_update (pte_t pte);
-
 #define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
 /*
  * Update PTEP with ENTRY, which is guaranteed to be a
less
 -596,7
+588,6  extern void lazy_mmu_prot_update (pte_t 
 #define __HAVE_ARCH_PTEP_SET_WRPROTECT
 #define __HAVE_ARCH_PTE_SAME
 #define __HAVE_ARCH_PGD_OFFSET_GATE
-#define __HAVE_ARCH_LAZY_MMU_PROT_UPDATE
 
 #ifndef CONFIG_PGTABLE_4
 #include <asm-generic/pgtable-nopud.h>
Index: devel-2.6.22-rc7/include/asm-generic/pgtable.h
============================================================
=======
--- devel-2.6.22-rc7.orig/include/asm-generic/pgtable.h
+++ devel-2.6.22-rc7/include/asm-generic/pgtable.h
 -168,10
+168,6  static inline void ptep_set_wrprotect(st
 #define pgd_offset_gate(mm, addr)	pgd_offset(mm, addr)
 #endif
 
-#ifndef __HAVE_ARCH_LAZY_MMU_PROT_UPDATE
-#define lazy_mmu_prot_update(pte)	do { } while (0)
-#endif
-
 #ifndef __HAVE_ARCH_MOVE_PTE
 #define move_pte(pte, prot, old_addr, new_addr)	(pte)
 #endif
Index: devel-2.6.22-rc7/mm/memory.c
============================================================
=======
--- devel-2.6.22-rc7.orig/mm/memory.c
+++ devel-2.6.22-rc7/mm/memory.c
 -1693,7
+1693,6  static int do_wp_page(struct mm_struct *
 		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
 		if (ptep_set_access_flags(vma, address, page_table,
entry,1)) {
 			update_mmu_cache(vma, address, entry);
-			lazy_mmu_prot_update(entry);
 		}
 		ret |= VM_FAULT_WRITE;
 		goto unlock;
 -1735,7
+1734,6  gotten:
 		flush_cache_page(vma, address, pte_pfn(orig_pte));
 		entry = mk_pte(new_page, vma->vm_page_prot);
 		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
-		lazy_mmu_prot_update(entry);
 		/*
 		 * Clear the pte entry and flush it first, before
updating the
 		 * pte with the new entry. This will avoid a race
condition
 -2200,7
+2198,6  static int do_swap_page(struct mm_struct
 
 	/* No need to invalidate - it was non-present before */
 	update_mmu_cache(vma, address, pte);
-	lazy_mmu_prot_update(pte);
 unlock:
 	pte_unmap_unlock(page_table, ptl);
 out:
 -2257,12
+2254,16  static int do_anonymous_page(struct mm_s
 		inc_mm_counter(mm, file_rss);
 		page_add_file_rmap(page);
 	}
-
+	/*
+	 * new page is zero-filled, but we have to guarantee
icache-dcache
+	 * synchronization before setting pte on some processor.
+	 */
+	if (write_access)
+		flush_icache_page(vma, page);
 	set_pte_at(mm, address, page_table, entry);
 
 	/* No need to invalidate - it was non-present before */
 	update_mmu_cache(vma, address, entry);
-	lazy_mmu_prot_update(entry);
 unlock:
 	pte_unmap_unlock(page_table, ptl);
 	return VM_FAULT_MINOR;
 -2407,7
+2408,6  retry:
 
 	/* no need to invalidate: a not-present page shouldn't be
cached */
 	update_mmu_cache(vma, address, entry);
-	lazy_mmu_prot_update(entry);
 unlock:
 	pte_unmap_unlock(page_table, ptl);
 	if (dirty_page) {
 -2563,7
+2563,6  static inline int handle_pte_fault(struc
 	entry = pte_mkyoung(entry);
 	if (ptep_set_access_flags(vma, address, pte, entry,
write_access)) {
 		update_mmu_cache(vma, address, entry);
-		lazy_mmu_prot_update(entry);
 	} else {
 		/*
 		 * This is needed only for protection faults but the arch
code
Index: devel-2.6.22-rc7/mm/migrate.c
============================================================
=======
--- devel-2.6.22-rc7.orig/mm/migrate.c
+++ devel-2.6.22-rc7/mm/migrate.c
 -172,6
+172,11  static void remove_migration_pte(struct 
 	pte = pte_mkold(mk_pte(new, vma->vm_page_prot));
 	if (is_write_migration_entry(entry))
 		pte = pte_mkwrite(pte);
+	/*
+	 * If the processor doesn't guarantee icache-dicache
synchronization.
+	 * We need to flush icache before set_pte.
+	 */
+	flush_icache_page(vma, new);
 	set_pte_at(mm, addr, ptep, pte);
 
 	if (PageAnon(new))
 -181,7
+186,6  static void remove_migration_pte(struct 
 
 	/* No need to invalidate - it was non-present before */
 	update_mmu_cache(vma, addr, pte);
-	lazy_mmu_prot_update(pte);
 
 out:
 	pte_unmap_unlock(ptep, ptl);
Index: devel-2.6.22-rc7/mm/mprotect.c
============================================================
=======
--- devel-2.6.22-rc7.orig/mm/mprotect.c
+++ devel-2.6.22-rc7/mm/mprotect.c
 -53,7
+53,6  static void change_pte_range(struct mm_s
 			if (dirty_accountable && pte_dirty(ptent))
 				ptent = pte_mkwrite(ptent);
 			set_pte_at(mm, addr, pte, ptent);
-			lazy_mmu_prot_update(ptent);
 #ifdef CONFIG_MIGRATION
 		} else if (!pte_file(oldpte)) {
 			swp_entry_t entry = pte_to_swp_entry(oldpte);
Index: devel-2.6.22-rc7/mm/rmap.c
============================================================
=======
--- devel-2.6.22-rc7.orig/mm/rmap.c
+++ devel-2.6.22-rc7/mm/rmap.c
 -436,7
+436,6  static int page_mkclean_one(struct page 
 		entry = pte_wrprotect(entry);
 		entry = pte_mkclean(entry);
 		set_pte_at(mm, address, pte, entry);
-		lazy_mmu_prot_update(entry);
 		ret = 1;
 	}
 
Index: devel-2.6.22-rc7/mm/fremap.c
============================================================
=======
--- devel-2.6.22-rc7.orig/mm/fremap.c
+++ devel-2.6.22-rc7/mm/fremap.c
 -83,7
+83,6  int install_page(struct mm_struct *mm, s
 	set_pte_at(mm, addr, pte, pte_val);
 	page_add_file_rmap(page);
 	update_mmu_cache(vma, addr, pte_val);
-	lazy_mmu_prot_update(pte_val);
 	err = 0;
 unlock:
 	pte_unmap_unlock(pte, ptl);
Index: devel-2.6.22-rc7/mm/hugetlb.c
============================================================
=======
--- devel-2.6.22-rc7.orig/mm/hugetlb.c
+++ devel-2.6.22-rc7/mm/hugetlb.c
 -328,7
+328,6  static void set_huge_ptep_writable(struc
 	entry = pte_mkwrite(pte_mkdirty(*ptep));
 	if (ptep_set_access_flags(vma, address, ptep, entry, 1))
{
 		update_mmu_cache(vma, address, entry);
-		lazy_mmu_prot_update(entry);
 	}
 }
 
 -445,6
+444,7  static int hugetlb_cow(struct mm_struct 
 	 * and just make the page writable */
 	avoidcopy = (page_count(old_page) == 1);
 	if (avoidcopy) {
+		/* cache flush is not necessary in this case */
 		set_huge_ptep_writable(vma, address, ptep);
 		return VM_FAULT_MINOR;
 	}
 -463,6
+463,8  static int hugetlb_cow(struct mm_struct 
 
 	ptep = huge_pte_offset(mm, address & HPAGE_MASK);
 	if (likely(pte_same(*ptep, pte))) {
+		/* We have to confirm new_page's cache coherency if
VM_EXEC */
+		flush_icache_page(vma,new_page);
 		/* Break COW */
 		set_huge_pte_at(mm, address, ptep,
 				make_huge_pte(vma, new_page, 1));
 -681,7
+683,6  void hugetlb_change_protection(struct vm
 			pte = huge_ptep_get_and_clear(mm, address, ptep);
 			pte = pte_mkhuge(pte_modify(pte, newprot));
 			set_huge_pte_at(mm, address, ptep, pte);
-			lazy_mmu_prot_update(pte);
 		}
 	}
 	spin_unlock(&mm->page_table_lock);

-
To unsubscribe from this list: send the line
"unsubscribe linux-ia64" in
the body of a message to majordomovger.kernel.org
More majordomo info at  http://vge
r.kernel.org/majordomo-info.html

Re: flush icache on ia64 take2
country flaguser name
France
2007-07-06 04:41:38
KAMEZAWA Hiroyuki wrote:

>  Note1: icache flush is called only when VM_EXEC flag
is on and 
>         PG_arch_1 is not set.

If you have not got the page in the cache, then the new page
will
be allocated with PG_arch_1 bit off.
You are going to flush pages which are read by HW DMA, i.e.
the L2I
of Montecito does not keep old lines for those pages
anyway.

...->a_ops->readpage() of "L2I safe" file
systems should set PG_arch_1
if the CPU is ia64 and it has got separate L2I.

On the other hand, arch. independent file systems should not
play with
PG_arch_1.
The base kernel should export a macro for the file
systems...

Thanks,

Zoltan Menyhart


-
To unsubscribe from this list: send the line
"unsubscribe linux-ia64" in
the body of a message to majordomovger.kernel.org
More majordomo info at  http://vge
r.kernel.org/majordomo-info.html

Re: flush icache on ia64 take2
user name
2007-07-06 04:50:06
On Fri, 06 Jul 2007 11:41:38 +0200
Zoltan Menyhart <Zoltan.Menyhartbull.net> wrote:

> KAMEZAWA Hiroyuki wrote:
> 
> >  Note1: icache flush is called only when VM_EXEC
flag is on and 
> >         PG_arch_1 is not set.
> 
> If you have not got the page in the cache, then the new
page will
> be allocated with PG_arch_1 bit off.
> You are going to flush pages which are read by HW DMA,
i.e. the L2I
> of Montecito does not keep old lines for those pages
anyway.
> 
yes, it's my concern. 

> ...->a_ops->readpage() of "L2I safe"
file systems should set PG_arch_1
> if the CPU is ia64 and it has got separate L2I.
> 
> On the other hand, arch. independent file systems
should not play with
> PG_arch_1.
> The base kernel should export a macro for the file
systems...
> 
Hm, but it should be discussed in another thread... I
think.
>From memory management view, flush_icache before
set_pte() is sane thing.
 
-Kame

-
To unsubscribe from this list: send the line
"unsubscribe linux-ia64" in
the body of a message to majordomovger.kernel.org
More majordomo info at  http://vge
r.kernel.org/majordomo-info.html

Re: flush icache on ia64 take2
user name
2007-07-19 01:56:32
On Fri, 6 Jul 2007 11:29:01 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyujp.fujitsu.com> wrote:

> This is a patch for fixing icache flush race in
ia64(Montecito) by implementing
> flush_icache_page() at el.
> 
> Changelog:
>  - updated against 2.6.22-rc7 (previous one was against
2.6.21)
>  - removed hugetlbe's lazy_mmu_prot_update().
>  - rewrote patch description.
>  - removed patch against mprotect() if flushes cache.
> 
Then, what should I do more for fixing this SIGILL problem
?

-Kame

-
To unsubscribe from this list: send the line
"unsubscribe linux-ia64" in
the body of a message to majordomovger.kernel.org
More majordomo info at  http://vge
r.kernel.org/majordomo-info.html

Re: flush icache on ia64 take2
country flaguser name
France
2007-07-19 07:05:06
KAMEZAWA Hiroyuki wrote:

> Then, what should I do more for fixing this SIGILL
problem ?
> 
> -Kame

I can think of a relatively cheap solution:

New pages are allocated with the bit PG_arch_1 off, see
page_cache_read() ... prep_new_page(), i.e. the I-cache is
not coherent with the D-cache.

page_cache_read() should add a macro, say:

	ARCH_PREP_PAGE_BEFORE_READ(page);

before actually calling mapping->a_ops->readpage(file,
page).

This macro can be for ia64 something like:

do { if (CPU_has_split_L2_I_cache) set_bit(PG_arch_1,
&page->flags); }

and empty for the the architectures non concerned.

The file systems which are identified not to use HW tools
to
avoid split I-cache incoherency, e.g. nfs_readpage(), are
required
to add a macro, say:

	ARCH_CHECK_READ_PAGE_COHERENCY(page);

This macro can be for ia64:

do { if (CPU_has_split_L2_I_cache) clear_bit(PG_arch_1,
&page->flags); }

and empty for the the architectures non concerned.

Back to do_no_page():
if the new PTE includes the exec bit and PG_arch_1 is set,
the page has to be flushed from the I-cache before the PTE
is
made globally visible.

File systems using local block devices with DMA are
considered
to be safe, with the exceptions of the bounce buffers.

When you copy into the destination page, another macro
should be
added, say:

	ARCH_CHECK_BOUNCE_READ_COHERENCY(bio_vec);

#define ARCH_CHECK_BOUNCE_READ_COHERENCY(bio_vec) 
		ARCH_CHECK_READ_PAGE_COHERENCY(bio_vec->bv_page)

Remote DMA based network file systems, e.g. Lustre on
Quadrics,
Infiniband are also considered to be safe.

Thanks,

Zoltan
-
To unsubscribe from this list: send the line
"unsubscribe linux-ia64" in
the body of a message to majordomovger.kernel.org
More majordomo info at  http://vge
r.kernel.org/majordomo-info.html

Re: flush icache on ia64 take2
country flaguser name
France
2007-07-19 07:15:49
> Back to do_no_page():
> if the new PTE includes the exec bit and PG_arch_1 is
set,
> the page has to be flushed from the I-cache before the
PTE is
> made globally visible.

Sorry, I wanted to say:

if the new PTE includes the exec bit and PG_arch_1 is NOT
set

Thanks,

Zoltan

-
To unsubscribe from this list: send the line
"unsubscribe linux-ia64" in
the body of a message to majordomovger.kernel.org
More majordomo info at  http://vge
r.kernel.org/majordomo-info.html

Re: flush icache on ia64 take2
user name
2007-07-19 08:01:18
On Thu, 19 Jul 2007 14:05:06 +0200
Zoltan Menyhart <Zoltan.Menyhartbull.net> wrote:

> KAMEZAWA Hiroyuki wrote:
> 
> > Then, what should I do more for fixing this SIGILL
problem ?
> > 
> > -Kame
> 
> I can think of a relatively cheap solution:
> 
Maybe I should take performance numbers with the patch.

But is it too costly that flushing icache page only if a
page is newly
installed into the system (PG_arch1) && it is mapped
as executable ?

I don't want to leak this (stupid) corner case to the file
system layer.
Hmm...can't we do clever flushing (like your idea) in VM
layer ?

-Kame
> New pages are allocated with the bit PG_arch_1 off,
see
> page_cache_read() ... prep_new_page(), i.e. the I-cache
is
> not coherent with the D-cache.
> 
> page_cache_read() should add a macro, say:
> 
> 	ARCH_PREP_PAGE_BEFORE_READ(page);
> 
> before actually calling
mapping->a_ops->readpage(file, page).
> 
> This macro can be for ia64 something like:
> 
> do { if (CPU_has_split_L2_I_cache) set_bit(PG_arch_1,
&page->flags); }
> 
> and empty for the the architectures non concerned.
> 
> The file systems which are identified not to use HW
tools to
> avoid split I-cache incoherency, e.g. nfs_readpage(),
are required
> to add a macro, say:
> 
> 	ARCH_CHECK_READ_PAGE_COHERENCY(page);
> 
> This macro can be for ia64:
> 
> do { if (CPU_has_split_L2_I_cache) clear_bit(PG_arch_1,
&page->flags); }
> 
> and empty for the the architectures non concerned.
> 
> Back to do_no_page():
> if the new PTE includes the exec bit and PG_arch_1 is
set,
> the page has to be flushed from the I-cache before the
PTE is
> made globally visible.
> 
> File systems using local block devices with DMA are
considered
> to be safe, with the exceptions of the bounce buffers.
> 
> When you copy into the destination page, another macro
should be
> added, say:
> 
> 	ARCH_CHECK_BOUNCE_READ_COHERENCY(bio_vec);
> 
> #define ARCH_CHECK_BOUNCE_READ_COHERENCY(bio_vec) 
> 		ARCH_CHECK_READ_PAGE_COHERENCY(bio_vec->bv_page)
> 
> Remote DMA based network file systems, e.g. Lustre on
Quadrics,
> Infiniband are also considered to be safe.
> 
> Thanks,
> 
> Zoltan
> 
-
To unsubscribe from this list: send the line
"unsubscribe linux-ia64" in
the body of a message to majordomovger.kernel.org
More majordomo info at  http://vge
r.kernel.org/majordomo-info.html

Re: flush icache on ia64 take2
user name
2007-07-19 08:32:08
On Thu, 19 Jul 2007 22:01:18 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyujp.fujitsu.com> wrote:

> On Thu, 19 Jul 2007 14:05:06 +0200
> Zoltan Menyhart <Zoltan.Menyhartbull.net> wrote:
> 
> > KAMEZAWA Hiroyuki wrote:
> > 
> > > Then, what should I do more for fixing this
SIGILL problem ?
> > > 
> > > -Kame
> > 
> > I can think of a relatively cheap solution:
> > 
> Maybe I should take performance numbers with the
patch.
> 
> But is it too costly that flushing icache page only if
a page is newly
> installed into the system (PG_arch1) && it is
mapped as executable ?
> 
> I don't want to leak this (stupid) corner case to the
file system layer.
> Hmm...can't we do clever flushing (like your idea) in
VM layer ?
> 
A bit new idea.  How about this ?
==
- Set PG_arch_1 if  "icache is *not* coherent"
- make flush_dcache_page() to be empty func.
- For Montecito, add kmap_atomic(). This function just set
PG_arch1.
  Then, "the page which is copied by the kernel"
is marked as "not icache coherent page"
- icache_flush_page() just flushes a page which has
PG_arch_1.
- Anonymous page is always has PG_arch_1. Tkae care of
Copy-On-Write.
==

looks easy ?

-Kame

 
-
To unsubscribe from this list: send the line
"unsubscribe linux-ia64" in
the body of a message to majordomovger.kernel.org
More majordomo info at  http://vge
r.kernel.org/majordomo-info.html

Re: flush icache on ia64 take2
user name
2007-07-19 09:51:57
On Thu, 19 Jul 2007 16:15:03 +0200
Zoltan Menyhart <Zoltan.Menyhartbull.net> wrote:

> We may have, say 1 Gbyte / sec local i/o activity
(using some RAIDs).
> Assume a few % of this 1 Gbyte is the program
execution, or program swap in.
> It gives some hundreds of new exec pages / sec =>
> some msec-s can be lost each sec.
> 
> I can agree that it should not be a big deal 
> 
Hmm...but the current code flushes the page. just do it in
"lazy" way.
much difference ?

-Kame
-
To unsubscribe from this list: send the line
"unsubscribe linux-ia64" in
the body of a message to majordomovger.kernel.org
More majordomo info at  http://vge
r.kernel.org/majordomo-info.html

Re: flush icache on ia64 take2
user name
2007-07-19 20:51:23
On Thu, 19 Jul 2007 17:18:59 +0200
Zoltan Menyhart <Zoltan.Menyhartbull.net> wrote:

> KAMEZAWA Hiroyuki wrote:
> 
> > Hmm...but the current code flushes the page. just
do it in "lazy" way.
> > much difference ?
> 
> I agree the current code flushes the I-cache for all
kinds of file
> systems (for PTEs with the exec bit on).
> 
> The error is that it does it after the PTE is written.
> 
> In addition, I wanted to optimize it to gain a few %.
I'm glad, too if we can do it. But it seems that we need a
bit
clever way to do that.

> Apparently this idea is not much welcome.
> 
> I can agree that flushing the I-cache (if the
architecture requires it)
> before setting the PTE eliminates the error.
> 
Hm, I'll refresh and repost the patch. 

Thanks,
-Kame

-
To unsubscribe from this list: send the line
"unsubscribe linux-ia64" in
the body of a message to majordomovger.kernel.org
More majordomo info at  http://vge
r.kernel.org/majordomo-info.html

[1-10]

about | contact  Other archives ( Real Estate discussion Medical topics )