Commit Graph

21 Commits

Author SHA1 Message Date
Dave Jiang
c0d1217202 drivers/edac: add new nmi rescan
Provides a way for NMI reported errors on x86 to notify the EDAC
subsystem pending ECC errors by writing to a software state variable.

Here's the reworked patch. I added an EDAC stub to the kernel so we can
have variables that are in the kernel even if EDAC is a module. I also
implemented the idea of using the chip driver to select error detection
mode via module parameter and eliminate the kernel compile option.
Please review/test. Thx!

Also, I only made changes to some of the chipset drivers since I am
unfamiliar with the other ones. We can add similar changes as we go.

Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:53 -07:00
Mike Chan
84db003f24 [PATCH] EDAC: Fix in e752x mc driver
This fix/change returns the offset into the page for the ce/ue error, instead
of just 0.  The e752x dram controller reads 34:6 of the linear address with
the error.

Signed-off-by: Mike Chan <mikechan@google.com>
Signed-off-by: doug thompson <norsk5@xmission.com>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:32 -08:00
Brian Pomerantz
9962fd017b [PATCH] EDAC: e752x byte access fix
The reading of the DRA registers should be a byte at a time (one register at a
time) instead of 4 bytes at a time (four registers).  Reading a dword at a
time retrieves erroneous information from all but the first register.  A
change was made to read in each register in a loop prior to using the data in
those registers.

Signed-off-by: Brian Pomerantz <bapper@mvista.com>
Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Doug Thompson <norsk5@xmission.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Andi Kleen <ak@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:32 -08:00
Brian Pomerantz
dfb2a76378 [PATCH] EDAC: e752x bit mask fix
The fatal vs.  non-fatal mask for the sysbus FERR status is incorrect
according to the E7520 datasheet.  This patch corrects the mask to correctly
handle fatal and non-fatal errors.

Signed-off-by: Brian Pomerantz <bapper@mvista.com>
Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Doug Thompson <norsk5@xmission.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Andi Kleen <ak@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:32 -08:00
Doug Thompson
929a40ec32 [PATCH] EDAC: fix module names quoted in sysfs
Fix the quoted module name in the sysfs for EDAC modules and reported by several
people.

Instead of  ../_edac_e752x_/   now the following will be presented, like other
modules:   ../edac_e752x/

Signed-off-by: Doug Thompson <norsk5@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-01 09:55:58 -07:00
Linus Torvalds
22a3e233ca Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial:
  Remove obsolete #include <linux/config.h>
  remove obsolete swsusp_encrypt
  arch/arm26/Kconfig typos
  Documentation/IPMI typos
  Kconfig: Typos in net/sched/Kconfig
  v9fs: do not include linux/version.h
  Documentation/DocBook/mtdnand.tmpl: typo fixes
  typo fixes: specfic -> specific
  typo fixes in Documentation/networking/pktgen.txt
  typo fixes: occuring -> occurring
  typo fixes: infomation -> information
  typo fixes: disadvantadge -> disadvantage
  typo fixes: aquire -> acquire
  typo fixes: mecanism -> mechanism
  typo fixes: bandwith -> bandwidth
  fix a typo in the RTC_CLASS help text
  smb is no longer maintained

Manually merged trivial conflict in arch/um/kernel/vmlinux.lds.S
2006-06-30 15:39:30 -07:00
Doug Thompson
1318952514 [PATCH] EDAC: probe1 cleanup 1-of-2
- Add lower-level functions that handle various parts of the initialization
  done by the xxx_probe1() functions.  Some of the xxx_probe1() functions are
  much too long and complicated (see "Chapter 5: Functions" in
  Documentation/CodingStyle).

- Cleanup of probe1() functions in EDAC

Signed-off-by: Doug Thompson <norsk5@xmission.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30 11:25:39 -07:00
Doug Thompson
2d7bbb91c8 [PATCH] EDAC: mc numbers refactor 1-of-2
Remove add_mc_to_global_list().  In next patch, this function will be
reimplemented with different semantics.

1 Reimplement add_mc_to_global_list() with semantics that allow the caller to
  determine the ID number for a mem_ctl_info structure.  Then modify
  edac_mc_add_mc() so that the caller specifies the ID number for the new
  mem_ctl_info structure.  Platform-specific code should be able to assign the
  ID numbers in a platform-specific manner.  For instance, on Opteron it makes
  sense to have the ID of the mem_ctl_info structure match the ID of the node
  that the memory controller belongs to.

2 Modify callers of edac_mc_add_mc() so they use the new semantics.

Signed-off-by: Doug Thompson <norsk5@xmission.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30 11:25:39 -07:00
Doug Thompson
37f04581ab [PATCH] EDAC: PCI device to DEVICE cleanup
Change MC drivers from using CVS revision strings for their version number,
Now each driver has its own local string.

Remove some PCI dependencies from the core EDAC module.  Made the code 'struct
device' centric instead of 'struct pci_dev' Most of the code changes here are
from a patch by Dave Jiang.  It may be best to eventually move the
PCI-specific code into a separate source file.

Signed-off-by: Doug Thompson <norsk5@xmission.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30 11:25:39 -07:00
Jörn Engel
6ab3d5624e Remove obsolete #include <linux/config.h>
Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 19:25:36 +02:00
mark gross
96941026a5 [PATCH] EDAC Coexistence with BIOS
Address the issue of EDAC/BIOS coexistence for the e752x chip-sets.

We have found a problem where the BIOS will start the system with the error
registers (dev0:fun1) hidden and assuming it has exclusive access to them.
The edac driver violates this assumption.

The workaround this patch offers is to honor the hidden-ness as an
indication that it is not safe to use those registers.

Signed-off-by: Mark Gross <mark.gross@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-03 20:05:41 -07:00
Dave Peterson
e009356f73 [PATCH] EDAC: use sysbus_message in e752x code
Patch from Dave Jiang <djiang@mvista.com>: Fix EDAC e752x driver so it
outputs sysbus-specific error message when sysbus error detected.

Signed-off-by: David S. Peterson <dsp@llnl.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:08 -08:00
Dave Peterson
e7ecd89102 [PATCH] EDAC: formatting cleanup
Cosmetic indentation/formatting cleanup for EDAC code.  Make sure we
are using tabs rather than spaces to indent, etc.

Signed-off-by: David S. Peterson <dsp@llnl.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:08 -08:00
Dave Peterson
18dbc337af [PATCH] EDAC: protect memory controller list
- Fix code so we always hold mem_ctls_mutex while we are stepping
  through the list of mem_ctl_info structures.  Otherwise bad things
  may happen if one task is stepping through the list while another
  task is modifying it.  We may eventually want to use reference
  counting to manage the mem_ctl_info structures.  In the meantime we
  may as well fix this bug.

- Don't disable interrupts while we are walking the list of
  mem_ctl_info structures in check_mc_devices().  This is unnecessary.

Signed-off-by: David S. Peterson <dsp@llnl.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:07 -08:00
Dave Peterson
749ede5744 [PATCH] EDAC: cleanup code for clearing initial errors
Fix xxx_probe1() functions so they call xxx_get_error_info() functions
to clear initial errors.  This is simpler and cleaner than duplicating
the low-level code for accessing PCI config space.

Signed-off-by: David S. Peterson <dsp@llnl.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:07 -08:00
Dave Peterson
3847bccce8 [PATCH] EDAC: e752x cleanup
- Add ctl_dev field to "struct e752x_dev_info".  Then we can eliminate
  ugly switch statement from e752x_probe1().

- Remove code from e752x_probe1() that clears initial PCI bus parity
  errors.  The core EDAC module already does this.

Signed-off-by: David S. Peterson <dsp@llnl.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:06 -08:00
Dave Peterson
680cbbbb0e [PATCH] EDAC: name cleanup
Perform the following name substitutions on all source files:

    sed 's/BS_MOD_STR/EDAC_MOD_STR/g'
    sed 's/bs_thread_info/edac_thread_info/g'
    sed 's/bs_thread/edac_thread/g'
    sed 's/bs_xstr/edac_xstr/g'
    sed 's/bs_str/edac_str/g'

The names that start with BS_ or bs_ are artifacts of when the code
was called "bluesmoke".

Signed-off-by: David S. Peterson <dsp@llnl.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:06 -08:00
Dave Peterson
537fba2892 [PATCH] EDAC: printk cleanup
This implements the following idea:

On Monday 30 January 2006 19:22, Eric W. Biederman wrote:
> One piece missing from this conversation is the issue that we need errors
> in a uniform format.  That is why edac_mc has helper functions.
>
> However there will always be errors that don't fit any particular model.
> Could we add a edac_printk(dev, );  That is similar to dev_printk but
> prints out an EDAC header and the device on which the error was found?
> Letting the rest of the string be user specified.
>
> For actual control that interface may be to blunt, but at least for people
> looking in the logs it allows all of the errors to be detected and
> harvested.

Signed-off-by: David S. Peterson <dsp@llnl.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:06 -08:00
Randy Dunlap
0d38b049fe [PATCH] edac: use C99 initializers (sparse warnings)
drivers/edac/e752x_edac.c:1042:7: warning: obsolete struct initializer, use C99 syntax

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-03 08:32:06 -08:00
Alan Cox
da9bb1d27b [PATCH] EDAC: core EDAC support code
This is a subset of the bluesmoke project core code, stripped of the NMI work
which isn't ready to merge and some of the "interesting" proc functionality
that needs reworking or just has no place in kernel.  It requires no core
kernel changes except the added scrub functions already posted.

The goal is to merge further functionality only after the core code is
accepted and proven in the base kernel, and only at the point the upstream
extras are really ready to merge.

From: doug thompson <norsk5@xmission.com>

  This converts EDAC to sysfs and is the final chunk neccessary before EDAC
  has a stable user space API and can be considered for submission into the
  base kernel.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: doug thompson <norsk5@xmission.com>
Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-18 19:20:31 -08:00
Alan Cox
806c35f505 [PATCH] EDAC: drivers for AMD 76x and Intel E750x, E752x
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-18 19:20:31 -08:00