linux

mirror of https://github.com/FEX-Emu/linux.git synced 2025-01-12 04:19:08 +00:00

Author	SHA1	Message	Date
Paul E. McKenney	2ec1f2d987	rcu: Increase rcutorture test coverage Currently, rcutorture has separate torture_types to test synchronous, asynchronous, and expedited grace-period primitives. This has two disadvantages: (1) Three times the number of runs to cover the combinations and (2) Little testing of concurrent combinations of the three options. This commit therefore adds a pair of module parameters that control normal and expedited state, with the default being both types, randomly selected, by the fakewriter processes, thus reducing source-code size and increasing test coverage. In addition, the writer task switches between asynchronous-normal and expedited grace-period primitives driven by the same pair of module parameters. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2013-08-20 11:38:41 -07:00
Paul E. McKenney	d2818df168	rcu: Add duplicate-callback tests to rcutorture This commit adds an object_debug option to rcutorture to allow the debug-object-based checks for duplicate call_rcu() invocations to be deterministically tested. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Davidlohr Bueso <davidlohr.bueso@hp.com> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> [ paulmck: Banish mid-function ifdef, more or less per Josh Triplett. ] Reviewed-by: Josh Triplett <josh@joshtriplett.org> [ paulmck: Improve duplicate-callback test, per Lai Jiangshan. ]	2013-08-20 11:37:54 -07:00
Yacine Belkadi	d185af300f	workqueue: fix some scripts/kernel-doc warnings When building the htmldocs (in verbose mode), scripts/kernel-doc reports the following type of warnings: Warning(kernel/workqueue.c:653): No description found for return value of 'get_work_pool' Fix them by: - Using "Return:" sections to introduce descriptions of return values - Adding some missing descriptions Signed-off-by: Yacine Belkadi <yacine.belkadi.1@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2013-08-20 12:57:25 +02:00
Chen Gang	f4940ab7c5	kernel/params.c: use scnprintf() instead of sprintf() For some strings (e.g. version string), they are permitted to be larger than PAGE_SIZE (although meaningless), so recommend to use scnprintf() instead of sprintf(). Signed-off-by: Chen Gang <gang.chen@asianux.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2013-08-20 15:37:46 +09:30
Chen Gang	cc56ded3fd	kernel/module.c: use scnprintf() instead of sprintf() For some strings, they are permitted to be larger than PAGE_SIZE, so need use scnprintf() instead of sprintf(), or it will cause issue. One case is: if a module version is crazy defined (length more than PAGE_SIZE), 'modinfo' command is still OK (print full contents), but for "cat /sys/modules/'modname'/version", will cause issue in kernel. Signed-off-by: Chen Gang <gang.chen@asianux.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2013-08-20 15:37:46 +09:30
Steven Rostedt	0ce814096f	module: Add NOARG flag for ops with param_set_bool_enable_only() set function The ops that uses param_set_bool_enable_only() as its set function can easily handle being used without an argument. There's no reason to fail the loading of the module if it does not have one. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2013-08-20 15:37:43 +09:30
Steven Rostedt	ab013c5f60	module: Add flag to allow mod params to have no arguments Currently the params.c code allows only two "set" functions to have no arguments. If a parameter does not have an argument, then it looks at the set function and tests if it is either param_set_bool() or param_set_bint(). If it is not one of these functions, then it fails the loading of the module. But there may be module parameters that have different set functions and still allow no arguments. But unless each of these cases adds their function to the if statement, it won't be allowed to have no arguments. This method gets rather messy and does not scale. Instead, introduce a flags field to the kernel_param_ops, where if the flag KERNEL_PARAM_FL_NOARG is set, the parameter will not fail if it does not contain an argument. It will be expected that the corresponding set function can handle a NULL pointer as "val". Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2013-08-20 15:37:42 +09:30
Christoph Jaeger	79ac6834c2	module: fix sprintf format specifier in param_get_byte() In param_get_byte(), to which the macro STANDARD_PARAM_DEF(byte, ...) expands, "%c" is used to print an unsigned char. So it gets printed as a character, which is not intended here. Use "%hhu" instead. [Rusty: note drivers which would be affected: drivers/net/wireless/cw1200/main.c drivers/ntb/ntb_transport.c:68 drivers/scsi/lpfc/lpfc_attr.c drivers/usb/atm/speedtch.c drivers/usb/gadget/g_ffs.c ] Acked-by: Jon Mason <jon.mason@intel.com> (for ntb) Acked-by: Michal Nazarewicz <mina86@mina86.com> (for g_ffs.c) Signed-off-by: Christoph Jaeger <christophjaeger@linux.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2013-08-20 15:37:28 +09:30
Linus Torvalds	e91dade52b	Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Ingo Molnar: "Three small fixlets" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: nohz: fix compile warning in tick_nohz_init() nohz: Do not warn about unstable tsc unless user uses nohz_full sched_clock: Fix integer overflow	2013-08-19 09:17:35 -07:00
Randy Dunlap	2203547f82	kernel: fix new kernel-doc warning in wait.c Fix new kernel-doc warnings in kernel/wait.c: Warning(kernel/wait.c:374): No description found for parameter 'p' Warning(kernel/wait.c:374): Excess function parameter 'word' description in 'wake_up_atomic_t' Warning(kernel/wait.c:374): Excess function parameter 'bit' description in 'wake_up_atomic_t' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-08-19 09:08:54 -07:00
Tejun Heo	6e6eab0efd	cgroup: fix cgroup_write_event_control() 81eeaf0411 ("cgroup: make cftype->[un]register_event() deal with cgroup_subsys_state instead of cgroup") updated the cftype event methods to take @css (cgroup_subsys_state) instead of @cgroup; however, it incorrectly used @css passed to cgroup_write_event_control(), which is the dummy_css for the cgroup as the file is a cgroup core file. This leads to oops on event registration. Fix it by using the css matching the event target file. Note that cgroup_write_event_control() now disallows cgroup core files from being event sources. This is for simplicity and doesn't matter as cgroup_event will be moved and made specific to memcg. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com>	2013-08-19 09:56:34 -04:00
Tejun Heo	0bfb4aa67c	cgroup: fix subsystem file accesses on the root cgroup 105347ba5 ("cgroup: make cgroup_file_open() rcu_read_lock() around cgroup_css() and add cfent->css") added cfent->css to cache the associated cgroup_subsys_state across file operations. A cfent is associated with single css throughout its lifetime and the original commit initialized the cache pointer during cgroup_add_file() and verified that it matches the actual one in cgroup_file_open(). While this works fine for !root cgroups, it's broken for root cgroups as files in a root cgroup are created before the css's are associated with the cgroup and thus cgroup_css() call in cgroup_add_file() returns NULL associating all cfents in the root cgroup with NULL css. This makes cgroup_file_open() trigger WARN and fail with -ENODEV for all !core subsystem files in the root cgroups. There's no reason to initialize cfent->css separately from cgroup_add_file(). As the association never changes, cgroup_file_open() can set it unconditionally every time and containing the logic in cgroup_file_open() makes more sense anyway as the only reason it's necessary is file->private_data being already occupied. Fix it by setting cfent->css unconditionally from cgroup_file_open(). Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com>	2013-08-19 09:56:25 -04:00
Li Zefan	1cb650b91b	cgroup: change cgroup_from_id() to css_from_id() Now we want cgroup core to always provide the css to use to the subsystems, so change this API to css_from_id(). Uninline css_from_id(), because it's getting bigger and cgroup_css() has been unexported. While at it, remove the #ifdef, and shuffle the order of the args. Signed-off-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2013-08-19 09:52:18 -04:00
Xie XiuQi	15e71911fc	generic-ipi/locking: Fix misleading smp_call_function_any() description Fix locking description: after commit 8969a5ede0f9e17da4b9437 ("generic-ipi: remove kmalloc()"), wait = 0 can be guaranteed because we don't kmalloc() anymore. Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com> Cc: Sheng Yang <sheng@linux.intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Link: http://lkml.kernel.org/r/51F5E6F8.1000801@huawei.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-08-19 09:03:50 +02:00
Paul E. McKenney	217af2a2ff	nohz_full: Add full-system-idle arguments to API This commit adds an isidle and jiffies argument to force_qs_rnp(), dyntick_save_progress_counter(), and rcu_implicit_dynticks_qs() to enable RCU's force-quiescent-state process to check for full-system idle. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> [ paulmck: Use true and false for boolean constants per Lai Jiangshan. ] Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2013-08-18 18:59:03 -07:00
Paul E. McKenney	d4bd54fbac	nohz_full: Add full-system idle states and variables This commit adds control variables and states for full-system idle. The system will progress through the states in numerical order when the system is fully idle (other than the timekeeping CPU), and reset down to the initial state if any non-timekeeping CPU goes non-idle. The current state is kept in full_sysidle_state. One flavor of RCU will be in charge of driving the state machine, defined by rcu_sysidle_state. This should be the busiest flavor of RCU. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2013-08-18 18:58:51 -07:00
Paul E. McKenney	eb348b8982	nohz_full: Add per-CPU idle-state tracking This commit adds the code that updates the rcu_dyntick structure's new fields to track the per-CPU idle state based on interrupts and transitions into and out of the idle loop (NMIs are ignored because NMI handlers cannot cleanly read out the time anyway). This code is similar to the code that maintains RCU's idea of per-CPU idleness, but differs in that RCU treats CPUs running in user mode as idle, where this new code does not. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2013-08-18 18:58:43 -07:00
Paul E. McKenney	2333210b26	nohz_full: Add rcu_dyntick data for scalable detection of all-idle state This commit adds fields to the rcu_dyntick structure that are used to detect idle CPUs. These new fields differ from the existing ones in that the existing ones consider a CPU executing in user mode to be idle, where the new ones consider CPUs executing in user mode to be busy. The handling of these new fields is otherwise quite similar to that for the existing fields. This commit also adds the initialization required for these fields. So, why is usermode execution treated differently, with RCU considering it a quiescent state equivalent to idle, while in contrast the new full-system idle state detection considers usermode execution to be non-idle? It turns out that although one of RCU's quiescent states is usermode execution, it is not a full-system idle state. This is because the purpose of the full-system idle state is not RCU, but rather determining when accurate timekeeping can safely be disabled. Whenever accurate timekeeping is required in a CONFIG_NO_HZ_FULL kernel, at least one CPU must keep the scheduling-clock tick going. If even one CPU is executing in user mode, accurate timekeeping is required, particularly for architectures where gettimeofday() and friends do not enter the kernel. Only when all CPUs are really and truly idle can accurate timekeeping be disabled, allowing all CPUs to turn off the scheduling clock interrupt, thus greatly improving energy efficiency. This naturally raises the question "Why is this code in RCU rather than in timekeeping?", and the answer is that RCU has the data and infrastructure to efficiently make this determination. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2013-08-18 18:58:31 -07:00
Paul E. McKenney	b44379af1c	nohz_full: Add Kconfig parameter for scalable detection of all-idle state At least one CPU must keep the scheduling-clock tick running for timekeeping purposes whenever there is a non-idle CPU. However, with the new nohz_full adaptive-idle machinery, it is difficult to distinguish between all CPUs really being idle as opposed to all non-idle CPUs being in adaptive-ticks mode. This commit therefore adds a Kconfig parameter as a first step towards enabling a scalable detection of full-system idle state. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> [ paulmck: Update help text per Frederic Weisbecker. ] Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2013-08-18 18:07:02 -07:00
Paul E. McKenney	feed66ed26	rcu: Eliminate unused APIs intended for adaptive ticks The rcu_user_enter_after_irq() and rcu_user_exit_after_irq() functions were intended for use by adaptive ticks, but changes in implementation have rendered them unnecessary. This commit therefore removes them. Reported-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2013-08-18 18:06:44 -07:00
Paul E. McKenney	1eafd31c64	rcu: Avoid redundant grace-period kthread wakeups When setting up an in-the-future "advanced" grace period, the code needs to wake up the relevant grace-period kthread, which it currently does unconditionally. However, this results in needless wakeups in the case where the advanced grace period is being set up by the grace-period kthread itself, which is a non-uncommon situation. This commit therefore checks to see if the running thread is the grace-period kthread, and avoids doing the irq_work_queue()-mediated wakeup in that case. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2013-08-18 17:40:09 -07:00
Paul E. McKenney	ae15018456	rcu: Make call_rcu() leak callbacks for debug-object errors If someone does a duplicate call_rcu(), the worst thing the second call_rcu() could do would be to actually queue the callback the second time because doing so corrupts whatever list the callback was already queued on. This commit therefore makes __call_rcu() check the new return value from debug-objects and leak the callback upon error. This commit also substitutes rcu_leak_callback() for whatever callback function was previously in place in order to avoid freeing the callback out from under any readers that might still be referencing it. These changes increase the probability that the debug-objects error messages will actually make it somewhere visible. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Davidlohr Bueso <davidlohr.bueso@hp.com> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2013-08-18 17:40:03 -07:00
Paul E. McKenney	15100df81f	rcu: Simplify debug-objects fixups The current debug-objects fixups are complex and heavyweight, and the fixups are not complete: Even with the fixups, RCU's callback lists can still be corrupted. This commit therefore strips the fixups down to their minimal form, eliminating two of the three. It would be even better if (for example) call_rcu() simply leaked any problematic callbacks, but for that to happen, the debug-objects system would need to inform its caller of suspicious situations. This is the subject of a later commit in this series. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Davidlohr Bueso <davidlohr.bueso@hp.com> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2013-08-18 17:39:45 -07:00
Borislav Petkov	d1d74d14e9	rcu: Expedite grace periods during suspend/resume CONFIG_RCU_FAST_NO_HZ can increase grace-period durations by up to a factor of four, which can result in long suspend and resume times. Thus, this commit temporarily switches to expedited grace periods when suspending the box and return to normal settings when resuming. Similar logic is applied to hibernation. Because expedited grace periods are of dubious benefit on very large systems, so this commit restricts their automated use during suspend and resume to systems of 256 or fewer CPUs. (Some day a number of Linux-kernel facilities, including RCU's expedited grace periods, will be more scalable, but I need to see bug reports first.) [ paulmck: This also papers over an audio/irq bug, but hopefully that will be fixed soon. ] Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2013-08-18 17:37:17 -07:00
Linus Torvalds	50e37ccea0	Merge branch 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fix from Tejun Heo: "This contains one patch to fix the return value of cpuset's cgroups interface function, which used to always return -ENODEV for the writes on the 'memory_pressure_enabled' file" * 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cpuset: fix the return value of cpuset_write_u64()	2013-08-18 08:51:28 -07:00
Linus Torvalds	2d2843e614	Power management fix for 3.11-rc6 - The removal of delayed_work_pending() checks from kernel/power/qos.c done in 3.9 introduced a deadlock in pm_qos_work_fn(). Fix from Stephen Boyd. / -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iQIcBAABAgAGBQJSDi1zAAoJEKhOf7ml8uNsv1wQAIfa2cD6k7WaFrDpL8FTRatY 77qDudjeJnv24R9lRA7rk3FViTfLWUoKAmJrCtgaWV7AxxvtJur2L3Q1vq5QvF8j m44Dtn8WqyNezgnSoMMHW+SWfvYhduoF/U++8EZ4PschzNnm146cLVT4jjEu7twK btCqB+Qg/F5jfdv+HuUCLfjx1WP9OgXi3km97fKKuRPFPG86ykoxUoT6GSNZ3kT9 60eOhf840ULOgtOLZV6gLJTVlJFY3dNviLQZXF3x+1VXwoOzoV0y496OItX6IVye K6qN8T3IGdkqg7urBGRtskFc3IVutuUTY2UAxJjQqGOVl2W6Te7KSk8czTJp6hbl jF5p5S7m/V6Oj0021ndXpAmgb5yDWxM+qCOuXxfBScd+ZWn190+0Ok1PYVQEXOLT vczhn8b2OMojvC7bWiVFEAfxMwK/5qGI/+yeIIC8pf27TcrgfJYnCBd7YNXyTa3Z sfr4ITUnu+IJq6NlJtK7brzAd3270TWljZUn/zQESyC8U7b2zWPWE3U5hB7Sil85 rJ2U91deoBu2/gEhZcyFjSTzikc9rhMZQHJ/BvzwMraUko+1uDHM+PPaXv3V8Q6c SSvizjx4QGlTrr/PiXKMFTQO1ArwBJvy2r8NJLGPKSaUKAelU0wYSrUoUkhI/CtT v4p4xFwXGObOyv4UFg3H =mGG3 -----END PGP SIGNATURE----- Merge tag 'pm-3.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fix from Rafael Wysocki: "The removal of delayed_work_pending() checks from kernel/power/qos.c done in 3.9 introduced a deadlock in pm_qos_work_fn(). Fix from Stephen Boyd" * tag 'pm-3.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM / QoS: Fix workqueue deadlock when using pm_qos_update_request_timeout()	2013-08-16 09:59:00 -07:00
Peter Zijlstra	5ec4c599a5	perf: Do not compute time values unnecessarily We should not be calling calc_timer_values() for events that do not actually have an mmap()'ed userpage. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20130802191630.GT27162@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-08-16 17:55:52 +02:00
Frederic Weisbecker	948b26b6dd	perf: Account freq events globally Freq events may not always be affine to a particular CPU. As such, account_event_cpu() may crash if we account per cpu a freq event that has event->cpu == -1. To solve this, lets account freq events globally. In practice this doesn't change much the picture because perf tools create per-task perf events with one event per CPU by default. Profiling a single CPU is usually a corner case so there is no much point in optimizing things that way. Reported-by: Jiri Olsa <jolsa@redhat.com> Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1375460996-16329-3-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-08-16 17:55:51 +02:00
Frederic Weisbecker	fc3b86d673	perf: Roll back callchain buffer refcount under the callchain mutex When we fail to allocate the callchain buffers, we roll back the refcount we did and return from get_callchain_buffers(). However we take the refcount and allocate under the callchain lock but the rollback is done outside the lock. As a result, while we roll back, some concurrent callchain user may call get_callchain_buffers(), see the non-zero refcount and give up because the buffers are NULL without itself retrying the allocation. The consequences aren't that bad but that behaviour looks weird enough and it's better to give their chances to the following callchain users where we failed. Reported-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1375460996-16329-2-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-08-16 17:55:50 +02:00
Frederic Weisbecker	c2e7fcf53c	nohz: Include local CPU in full dynticks global kick tick_nohz_full_kick_all() is useful to notify all full dynticks CPUs that there is a system state change to checkout before re-evaluating the need for the tick. Unfortunately this is implemented using smp_call_function_many() that ignores the local CPU. This CPU also needs to re-evaluate the tick. on_each_cpu_mask() is not useful either because we don't want to re-evaluate the tick state in place but asynchronously from an IPI to avoid messing up with any random locking scenario. So lets call tick_nohz_full_kick() from tick_nohz_full_kick_all() so that the usual irq work takes care of it. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Kevin Hilman <khilman@linaro.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1375460996-16329-4-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-08-16 17:55:33 +02:00
Christoph Lameter	a4f61cc03e	sched/cputime: Use this_cpu_add() in task_group_account_field() Use of a this_cpu() operation reduces the number of instructions used for accounting (account_user_time()) and frees up some registers. This is in the scheduler tick hotpath. Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/00000140596dd165-338ff7f5-893b-4fec-b251-aaac5557239e-000000@email.amazonses.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-08-16 17:44:29 +02:00
Xiaotian Feng	c8d2d47a9c	cpumask: Fix cpumask leak in partition_sched_domains() If doms_new is NULL, partition_sched_domains() will reset ndoms_cur to 0, and free old sched domains with free_sched_domains(doms_cur, ndoms_cur). As ndoms_cur is 0, the cpumask will not be freed. Signed-off-by: Xiaotian Feng <xtfeng@gmail.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: linux-kernel@vger.kernel.org Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1375790802-11857-1-git-send-email-xtfeng@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-08-16 17:44:27 +02:00
Ingo Molnar	d3ec3a1fd0	Linux 3.11-rc5 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iQEcBAABAgAGBQJSCDSjAAoJEHm+PkMAQRiGDXMIAI7Loae0Oqb1eoeJkvjyZsBS OJDeeEcn+k58VbxVHyRdc7hGo4yI4tUZm172SpnOaM8sZ/ehPU7zBrwJK2lzX334 /jAM3uvVPfxA2nu0I4paNpkED/NQ8NRRsYE1iTE8dzHXOH6dA3mgp5qfco50rQvx rvseXpME4KIAJEq4jnyFZF5+nuHiPueM9JftPmSSmJJ3/KY9kY1LESovyWd7ttg1 jYSVPFal9J0E+tl2UQY5g9H16GqhhjYn+39Iei6Q5P4bL4ZubQgTRQTN9nyDc06Z ezQtGoqZ8kEz/2SyRlkda6PzjSEhgXlc8mCL5J7AW+dMhTHHx2IrosjiCA80kG8= =c0rK -----END PGP SIGNATURE----- Merge tag 'v3.11-rc5' into sched/core Merge Linux 3.11-rc5, to pick up the latest fixes. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-08-16 17:40:23 +02:00
Li Zhong	930913a312	cgroup: use css_get() in cgroup_create() to check CSS_ROOT It seems that the root css doesn't have refcnt allocated(not needed?), and would cause the booting error attached. This patch tries to use css_get() to not increase the refcnt if parent is root. BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff810b37cc>] cgroup_mkdir+0x37c/0x740 PGD 0 Oops: 0002 [#1] Modules linked in: CPU: 0 PID: 1 Comm: systemd Not tainted 3.11.0-rc5-next-20130815+ #1 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 task: ffff88007f868000 ti: ffff88007f864000 task.ti: ffff88007f864000 RIP: 0010:[<ffffffff810b37cc>] [<ffffffff810b37cc>] cgroup_mkdir+0x37c/0x740 RSP: 0018:ffff88007f865df8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffffffff81a46ee0 RCX: 0000000000000001 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff81a415c0 RBP: ffff88007f865ec8 R08: 0000000000000001 R09: 0000000000000000 R10: ffff88007ce6d060 R11: 0000000000000000 R12: ffff88007ce6d000 R13: ffff88007ce6d060 R14: ffffffff81a46d80 R15: ffff88007c6e8018 FS: 00007f13dbf6f840(0000) GS:ffffffff81a23000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000007b7e5000 CR4: 00000000000006b0 Stack: ffffffff810b380d 0000000000000002 ffff88007f865e18 ffffffff81167069 ffff88007f865ed8 ffffffff8116a3f5 ffff880037454400 ffff88007c6e8018 ffff88007c6e8028 ffff88007c6e8328 ffff88007c6e8000 ffff88007ce6d000 Call Trace: [<ffffffff810b380d>] ? cgroup_mkdir+0x3bd/0x740 [<ffffffff81167069>] ? lookup_hash+0x19/0x20 [<ffffffff8116a3f5>] ? kern_path_create+0x95/0x170 [<ffffffff8116ce3e>] vfs_mkdir+0x9e/0xf0 [<ffffffff8116d7a0>] SyS_mkdirat+0x60/0xe0 [<ffffffff8116d839>] SyS_mkdir+0x19/0x20 [<ffffffff814c960d>] tracesys+0xcf/0xd4 Code: ad 70 ff ff ff 48 89 9d 60 ff ff ff 4d 89 d5 4c 8b bd 68 ff ff ff 4c 8b 65 88 eb 50 0f 1f 00 48 8b 43 18 a8 03 0f 85 6c 03 00 00 <ff> 00 e8 1d 0a fb ff 85 c0 74 0d 80 3d f0 45 a1 00 00 0f 84 4c RIP [<ffffffff810b37cc>] cgroup_mkdir+0x37c/0x740 RSP <ffff88007f865df8> CR2: 0000000000000000 ---[ end trace a4b14b49bc46fd60 ]--- Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Acked-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2013-08-16 10:55:31 -04:00
Dwight Engen	fd5e2aa865	xfs: ioctl check for capabilities in the current user namespace Use inode_capable() to check if SUID\|SGID bits should be cleared to match similar check in inode_change_ok(). The check for CAP_LINUX_IMMUTABLE was not modified since all other file systems also check against init_user_ns rather than current_user_ns. Only allow changing of projid from init_user_ns. Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Dwight Engen <dwight.engen@oracle.com> Signed-off-by: Ben Myers <bpm@sgi.com>	2013-08-15 14:19:25 -05:00
Ingo Molnar	c9572f010d	Linux 3.11-rc5 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iQEcBAABAgAGBQJSCDSjAAoJEHm+PkMAQRiGDXMIAI7Loae0Oqb1eoeJkvjyZsBS OJDeeEcn+k58VbxVHyRdc7hGo4yI4tUZm172SpnOaM8sZ/ehPU7zBrwJK2lzX334 /jAM3uvVPfxA2nu0I4paNpkED/NQ8NRRsYE1iTE8dzHXOH6dA3mgp5qfco50rQvx rvseXpME4KIAJEq4jnyFZF5+nuHiPueM9JftPmSSmJJ3/KY9kY1LESovyWd7ttg1 jYSVPFal9J0E+tl2UQY5g9H16GqhhjYn+39Iei6Q5P4bL4ZubQgTRQTN9nyDc06Z ezQtGoqZ8kEz/2SyRlkda6PzjSEhgXlc8mCL5J7AW+dMhTHHx2IrosjiCA80kG8= =c0rK -----END PGP SIGNATURE----- Merge tag 'v3.11-rc5' into perf/core Merge Linux 3.11-rc5, to sync up with the latest upstream fixes since -rc1. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-08-15 10:00:09 +02:00
Linus Torvalds	f1d6e17f54	Merge branch 'akpm' (patches from Andrew Morton) Merge a bunch of fixes from Andrew Morton. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: fs/proc/task_mmu.c: fix buffer overflow in add_page_map() arch: : Kconfig: add "kernel/Kconfig.freezer" to "arch//Kconfig" ocfs2: fix null pointer dereference in ocfs2_dir_foreach_blk_id() x86 get_unmapped_area(): use proper mmap base for bottom-up direction ocfs2: fix NULL pointer dereference in ocfs2_duplicate_clusters_by_page ocfs2: Revert 40bd62e to avoid regression in extended allocation drivers/rtc/rtc-stmp3xxx.c: provide timeout for potentially endless loop polling a HW bit hugetlb: fix lockdep splat caused by pmd sharing aoe: adjust ref of head for compound page tails microblaze: fix clone syscall mm: save soft-dirty bits on file pages mm: save soft-dirty bits on swapped pages memcg: don't initialize kmem-cache destroying work for root caches	2013-08-14 10:04:43 -07:00
Ingo Molnar	6f1d657668	Merge branch 'timers/nohz-v3' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into timers/nohz Pull nohz improvements from Frederic Weisbecker: " It mostly contains fixes and full dynticks off-case optimizations. I believe that distros want to enable this feature so it seems important to optimize the case where the "nohz_full=" parameter is empty. ie: I'm trying to remove any performance regression that comes with NO_HZ_FULL=y when the feature is not used. This patchset improves the current situation a lot (off-case appears to be around 11% faster with hackbench, although I guess it may vary depending on the configuration but it should be significantly faster in any case) now there is still some work to do: I can still observe a remaining loss of 1.6% throughput seen with hackbench compared to CONFIG_NO_HZ_FULL=n. " Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-08-14 17:58:56 +02:00
Frederic Weisbecker	d13508f944	nohz: Optimize full dynticks's sched hooks with static keys Scheduler IPIs and task context switches are serious fast paths. Let's use static keys to hide as much as we can of the off-case impact of the full dynticks APIs that are called on these sites. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Kevin Hilman <khilman@linaro.org>	2013-08-14 17:14:58 +02:00
Frederic Weisbecker	460775df46	nohz: Optimize full dynticks state checks with static keys These APIs are frequently accessed and priority is given to optimize the full dynticks off-case in order to let distros enable this feature without suffering from significant performance regressions. Let's inline these APIs and optimize them with static keys. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Kevin Hilman <khilman@linaro.org>	2013-08-14 17:14:57 +02:00
Frederic Weisbecker	73867dcd07	nohz: Rename a few state variables Rename the full dynticks's cpumask and cpumask state variables to some more exportable names. These will be used later from global headers to optimize the main full dynticks APIs in conjunction with static keys. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Kevin Hilman <khilman@linaro.org>	2013-08-14 17:14:57 +02:00
Frederic Weisbecker	af2350bd12	vtime: Always debug check snapshot source _before_ updating it The vtime delta update performed by get_vtime_delta() always checks that the source of the snapshot is valid. Meanwhile the snapshot updaters that rely on get_vtime_delta() also set the new snapshot origin. But some of them do this right before the call to get_vtime_delta(), making its debug check useless. This is easily fixable by moving the snapshot origin update after the call to get_vtime_delta(). The order doesn't matter there. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Kevin Hilman <khilman@linaro.org>	2013-08-14 17:14:56 +02:00
Frederic Weisbecker	b854fafa4e	vtime: Always scale generic vtime accounting results The cputime accounting in full dynticks can be a subtle mixup of CPUs using tick based accounting and others using generic vtime. As long as the tick can have a share on producing these stats, we want to scale the result against CFS precise accounting as the tick can miss some task hiding between the periodic interrupt. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Kevin Hilman <khilman@linaro.org>	2013-08-14 17:14:55 +02:00
Frederic Weisbecker	b049340613	vtime: Optimize full dynticks accounting off case with static keys If no CPU is in the full dynticks range, we can avoid the full dynticks cputime accounting through generic vtime along with its overhead and use the traditional tick based accounting instead. Let's do this and nop the off case with static keys. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Kevin Hilman <khilman@linaro.org>	2013-08-14 17:14:54 +02:00
Frederic Weisbecker	54461562c9	vtime: Fix racy cputime delta update get_vtime_delta() must be called under the task vtime_seqlock with the code that does the cputime accounting flush. Otherwise the cputime reader can be fooled and run into a race where it sees the snapshot update but misses the cputime flush. As a result it can report a cputime that is way too short. Fix vtime_account_user() that wasn't complying with that rule. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Kevin Hilman <khilman@linaro.org>	2013-08-14 17:14:50 +02:00
Frederic Weisbecker	7621d1f8bc	vtime: Remove a few unneeded generic vtime state checks Some generic vtime APIs check if the vtime accounting is enabled on the local CPU before doing their work. Some of these are not needed because all their callers already take care of that. Let's remove the checks on these. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Kevin Hilman <khilman@linaro.org>	2013-08-14 17:14:49 +02:00
Frederic Weisbecker	1b6a259aa5	context_tracking: User/kernel boundary cross trace events This can be useful to track all kernel/user round trips. And it's also helpful to debug the context tracking subsystem. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Kevin Hilman <khilman@linaro.org>	2013-08-14 17:14:48 +02:00
Frederic Weisbecker	73d424f9af	context_tracking: Optimize context switch off case with static keys No need for syscall slowpath if no CPU is full dynticks, rather nop this in this case. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Kevin Hilman <khilman@linaro.org>	2013-08-14 17:14:47 +02:00
Frederic Weisbecker	48d6a816a8	context_tracking: Optimize guest APIs off case with static key Optimize guest entry/exit APIs with static keys. This minimizes the overhead for those who enable CONFIG_NO_HZ_FULL without always using it. Having no range passed to nohz_full= should result in the probes' overhead being minimized. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Kevin Hilman <khilman@linaro.org>	2013-08-14 17:14:46 +02:00
Frederic Weisbecker	ad65782fba	context_tracking: Optimize main APIs off case with static key Optimize user and exception entry/exit APIs with static keys. This minimizes the overhead for those who enable CONFIG_NO_HZ_FULL without always using it. Having no range passed to nohz_full= should result in the probes being nopped (at least we hope so...). If this proves not to be enough in the long term, we'll need to bring an exception slow path by re-routing the exception handlers. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Kevin Hilman <khilman@linaro.org>	2013-08-14 17:14:45 +02:00

16560 Commits