linux/mm
Hugh Dickins 8f4f8c164c [PATCH] mm: unlink vma before pagetables
In most places the descent from pgd to pud to pmd to pte holds mmap_sem
(exclusively or not), which ensures that free_pgtables cannot be freeing page
tables from any level at the same time.  But truncation and reverse mapping
descend without mmap_sem.

No problem: just make sure that a vma is unlinked from its prio_tree (or
nonlinear list) and from its anon_vma list, after zapping the vma, but before
freeing its page tables.  Then neither vmtruncate nor rmap can reach that vma
whose page tables are now volatile (nor do they need to reach it, since all
its page entries have been zapped by this stage).

The i_mmap_lock and anon_vma->lock already serialize this correctly; but the
locking hierarchy is such that we cannot take them while holding
page_table_lock.  Well, we're trying to push that down anyway.  So in this
patch, move anon_vma_unlink and unlink_file_vma into free_pgtables, at the
same time as moving page_table_lock around calls to unmap_vmas.
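
The ordering described above (zap, then unlink, then free) can be sketched as a toy userspace model in C. This is only an illustration, not kernel code: the toy_vma struct and every toy_* function are hypothetical names invented here, and the real locking (i_mmap_lock, anon_vma->lock, page_table_lock) is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model only: none of these types or helpers exist in the kernel. */
struct toy_vma {
    bool linked_file;  /* reachable via the file's prio_tree or nonlinear list */
    bool linked_anon;  /* reachable via the anon_vma list */
    int mapped_ptes;   /* page entries still present */
    int *page_table;   /* stands in for the page tables; NULL once freed */
};

/* Step 1: zap the page entries (the patch does this in unmap_vmas). */
static void toy_zap(struct toy_vma *vma)
{
    vma->mapped_ptes = 0;
}

/* Step 2: unlink from the reverse-mapping structures
 * (standing in for anon_vma_unlink and unlink_file_vma). */
static void toy_unlink(struct toy_vma *vma)
{
    vma->linked_anon = false;
    vma->linked_file = false;
}

/* Could a walker that does not hold mmap_sem (truncation, rmap) find it? */
static bool toy_reachable(const struct toy_vma *vma)
{
    return vma->linked_file || vma->linked_anon;
}

/* Step 3: free the page tables, unlinking first as the patch arranges.
 * Returns -1 without freeing if entries are still mapped. */
static int toy_free_pgtables(struct toy_vma *vma)
{
    assert(vma != NULL);
    toy_unlink(vma);
    if (vma->mapped_ptes != 0 || toy_reachable(vma))
        return -1;
    free(vma->page_table);
    vma->page_table = NULL;
    return 0;
}

/* Full teardown in the order the commit message describes. */
static int toy_teardown(struct toy_vma *vma)
{
    toy_zap(vma);
    return toy_free_pgtables(vma);
}
```

In this model, calling toy_teardown on a linked, populated toy_vma leaves it unreachable before its page table is released, while calling toy_free_pgtables on a vma that was never zapped is refused.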

tlb_gather_mmu and tlb_finish_mmu then fall outside the page_table_lock, but
we made them preempt_disable and preempt_enable earlier; and a long source
audit of all the architectures has shown no problem with removing
page_table_lock from them.  free_pgtables doesn't need page_table_lock for
itself, nor for what it calls; tlb->mm->nr_ptes is usually protected by
page_table_lock, but partly by non-exclusive mmap_sem; here it's decremented
with exclusive mmap_sem held, or with mm_users at 0.  update_hiwater_rss and
vm_unacct_memory don't need page_table_lock either.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:40 -07:00
bootmem.c [PATCH] core remove PageReserved 2005-10-29 21:40:39 -07:00
fadvise.c [PATCH] xip: madvice/fadvice: execute in place 2005-06-24 00:06:42 -07:00
filemap_xip.c [PATCH] core remove PageReserved 2005-10-29 21:40:39 -07:00
filemap.c [PATCH] mm: page fault handlers tidyup 2005-10-29 21:40:37 -07:00
filemap.h [PATCH] xip: reduce code duplication 2005-06-24 00:06:41 -07:00
fremap.c [PATCH] mm: ptd_alloc take ptlock 2005-10-29 21:40:40 -07:00
highmem.c [PATCH] gfp_t: the rest 2005-10-28 08:16:51 -07:00
hugetlb.c [PATCH] mm: ptd_alloc take ptlock 2005-10-29 21:40:40 -07:00
internal.h Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
Kconfig [PATCH] fix mm/Kconfig spelling 2005-09-17 11:50:01 -07:00
madvise.c [PATCH] core remove PageReserved 2005-10-29 21:40:39 -07:00
Makefile [PATCH] xip: fs/mm: execute in place 2005-06-24 00:06:41 -07:00
memory.c [PATCH] mm: unlink vma before pagetables 2005-10-29 21:40:40 -07:00
mempolicy.c [PATCH] mm: pte_offset_map_lock loops 2005-10-29 21:40:40 -07:00
mempool.c [PATCH] gfp_t: mm/* (easy parts) 2005-10-28 08:16:47 -07:00
mincore.c [PATCH] freepgt: sys_mincore ignore FIRST_USER_PGD_NR 2005-04-19 13:29:20 -07:00
mlock.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
mmap.c [PATCH] mm: unlink vma before pagetables 2005-10-29 21:40:40 -07:00
mprotect.c [PATCH] mm: pte_offset_map_lock loops 2005-10-29 21:40:40 -07:00
mremap.c [PATCH] mm: ptd_alloc take ptlock 2005-10-29 21:40:40 -07:00
msync.c [PATCH] mm: pte_offset_map_lock loops 2005-10-29 21:40:40 -07:00
nommu.c [PATCH] mm: update_hiwaters just in time 2005-10-29 21:40:39 -07:00
oom_kill.c [PATCH] gfp flags annotations - part 1 2005-10-08 15:00:57 -07:00
page_alloc.c [PATCH] core remove PageReserved 2005-10-29 21:40:39 -07:00
page_io.c [PATCH] gfp flags annotations - part 1 2005-10-08 15:00:57 -07:00
page-writeback.c [PATCH] timer initialization cleanup: DEFINE_TIMER 2005-09-09 14:03:48 -07:00
pdflush.c [PATCH] Cleanup patch for process freezing 2005-06-25 17:10:13 -07:00
prio_tree.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
readahead.c [PATCH] readahead: reset cache_hit earlier 2005-09-07 16:57:25 -07:00
rmap.c [PATCH] mm: update_hiwaters just in time 2005-10-29 21:40:39 -07:00
shmem.c [PATCH] core remove PageReserved 2005-10-29 21:40:39 -07:00
slab.c [PATCH] slab: add additional debugging to detect slabs from the wrong node 2005-10-29 21:40:36 -07:00
sparse.c [PATCH] sparsemem extreme: hotplug preparation 2005-09-05 00:05:38 -07:00
swap_state.c [PATCH] gfp flags annotations - part 1 2005-10-08 15:00:57 -07:00
swap.c [PATCH] core remove PageReserved 2005-10-29 21:40:39 -07:00
swapfile.c [PATCH] mm: pte_offset_map_lock loops 2005-10-29 21:40:40 -07:00
thrash.c [PATCH] swaptoken tuning 2005-10-29 21:40:35 -07:00
tiny-shmem.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
truncate.c [PATCH] DocBook: fix some descriptions 2005-05-01 08:59:26 -07:00
vmalloc.c [PATCH] mm: init_mm without ptlock 2005-10-29 21:40:40 -07:00
vmscan.c [PATCH] shrink_list(): skip anon pages if not may_swap 2005-10-29 21:40:36 -07:00