28 Commits

Author SHA1 Message Date
Tobin C. Harding
c73dff595f leaking_addresses: check if file name contains address
Sometimes files may be created by using output from printk.  As the scan
traverses the directory tree we should parse each path name and check if
it is leaking an address.

Add check for leaking address on each path name.

Suggested-by: Tycho Andersen <tycho@tycho.ws>
Acked-by: Tycho Andersen <tycho@tycho.ws>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
2018-04-07 08:50:34 +10:00
Tobin C. Harding
2306a67745 leaking_addresses: explicitly name variable used in regex
Currently sub routine may_leak_address() is checking regex against Perl
special variable $_ which is _fortunately_ being set correctly in a loop
before this sub routine is called.  We already have declared a variable
to hold this value '$line' we should use it.

Use $line in regex match instead of implicit $_

Signed-off-by: Tobin C. Harding <me@tobin.cc>
2018-04-07 08:50:34 +10:00
Tobin C. Harding
3482737449 leaking_addresses: remove version number
We have git now, we don't need a version number.  This was originally
added because leaking_addresses.pl shamelessly (and mindlessly) copied
checkpatch.pl

Remove version number from script.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
2018-04-07 08:50:34 +10:00
Tobin C. Harding
2ad7429392 leaking_addresses: skip '/proc/1/syscall'
The pointers listed in /proc/1/syscall are user pointers, and negative
syscall args will show up like kernel addresses.

For example

/proc/31808/syscall: 0 0x3 0x55b107a38180 0x2000 0xffffffffffffffb0 \
0x55b107a302d0 0x55b107a38180 0x7fffa313b8e8 0x7ff098560d11

Skip parsing /proc/1/syscall

Suggested-by: Tycho Andersen <tycho@tycho.ws>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
2018-04-07 08:50:34 +10:00
Tobin C. Harding
472c9e1085 leaking_addresses: skip all /proc/PID except /proc/1
When the system is idle it is likely that most files under /proc/PID
will be identical for various processes.  Scanning _all_ the PIDs under
/proc is unnecessary and implies that we are thoroughly scanning /proc.
This is _not_ the case because there may be ways userspace can trigger
creation of /proc files that leak addresses but were not present during
a scan.  For these two reasons we should exclude all PID directories
under /proc except '1/'

Exclude all /proc/PID except /proc/1.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
2018-04-07 08:50:34 +10:00
Tobin C. Harding
5e4bac34ed leaking_addresses: cache architecture name
Currently we are repeatedly calling `uname -m`.  This is causing the
script to take a long time to run (more than 10 seconds to parse
/proc/kallsyms).  We can use Perl state variables to cache the result of
the first call to `uname -m`.  With this change in place the script
scans the whole kernel in under a minute.

Cache machine architecture in state variable.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
2018-04-07 08:50:34 +10:00
Tobin C. Harding
b401f56f33 leaking_addresses: simplify path skipping
Currently script has multiple configuration arrays.  This is confusing,
evident by the fact that a bunch of the entries are in the wrong place.
We can simplify the code by just having a single array for absolute
paths to skip and a single array for file names to skip wherever they
appear in the scanned directory tree.  There are also currently multiple
subroutines to handle the different arrays, we can reduce these to a
single subroutine also.

Simplify the path skipping code.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
2018-04-07 08:50:34 +10:00
Tobin C. Harding
e2858caddc leaking_addresses: do not parse binary files
Currently script parses binary files.  Since we are scanning for
readable kernel addresses there is no need to parse binary files.  We
can use Perl to check if file is binary and skip parsing it if so.

Do not parse binary files.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
2018-04-07 08:50:34 +10:00
Tobin C. Harding
1410fe4eea leaking_addresses: add 32-bit support
Currently script only supports x86_64 and ppc64.  It would be nice to be
able to scan 32-bit machines also.  We can add support for 32-bit
architectures by modifying how we check for false positives, taking
advantage of the page offset used by the kernel, and using the correct
regular expression.

Support for 32-bit machines is enabled by the observation that the kernel
addresses on 32-bit machines are larger [in value] than the page offset.
We can use this to filter false positives when scanning the kernel for
leaking addresses.

Programmatic determination of the running architecture is not
immediately obvious (current 32-bit machines return various strings from
`uname -m`).  We therefore provide a flag to enable scanning of 32-bit
kernels.  Also we can check the kernel config file for the offset and if
not found default to 0xc0000000.  A command line option to parse in the
page offset is also provided.  We do automatically detect architecture
if running on ix86.

Add support for 32-bit kernels.  Add a command line option for page
offset.

Suggested-by: Kaiwan N Billimoria <kaiwan.billimoria@gmail.com>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
2018-04-07 08:50:34 +10:00
Tobin C. Harding
5eb0da0568 leaking_addresses: add is_arch() wrapper subroutine
Currently there is duplicate code when checking the architecture type.
We can remove the duplication by implementing a wrapper function
is_arch().

Implement and use wrapper function is_arch().

Signed-off-by: Tobin C. Harding <me@tobin.cc>
2018-04-07 08:50:34 +10:00
Tobin C. Harding
6efb745828 leaking_addresses: use system command to get arch
Currently script uses Perl to get the machine architecture. This can be
erroneous since Perl uses the architecture of the machine that Perl was
compiled on not the architecture of the running machine. We should use
the systems `uname` command instead.

Use `uname -m` instead of Perl to get the machine architecture.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
2018-04-07 08:50:34 +10:00
Tobin C. Harding
2f042c93a1 leaking_addresses: add support for 5 page table levels
Currently script only supports 4 page table levels because of the way
the kernel address regular expression is crafted. We can do better than
this. Using previously added support for kernel configuration options we
can get the number of page table levels defined by
CONFIG_PGTABLE_LEVELS. Using this value a correct regular expression can
be crafted. This only supports 5 page tables on x86_64.

Add support for 5 page table levels on x86_64.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
2018-04-07 08:50:34 +10:00
Tobin C. Harding
f9d2a42dac leaking_addresses: add support for kernel config file
Features that rely on the ability to get kernel configuration options
are ready to be implemented in script. In preparation for this we can
add support for kernel config options as a separate patch to ease
review.

Add support for locating and parsing kernel configuration file.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
2018-04-07 08:50:34 +10:00
Tobin C. Harding
87e3758856 leaking_addresses: add range check for vsyscall memory
Currently script checks only first and last address in the vsyscall
memory range. We can do better than this. When checking for false
positives against $match, we can convert $match to a hexadecimal value
then check if it lies within the range of vsyscall addresses.

Check whole range of vsyscall addresses when checking for false
positive.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
2018-04-07 08:50:34 +10:00
Tobin C. Harding
15d60a35b8 leaking_addresses: indent dependant options
A number of the command line options to script are dependant on the
option --input-raw being set. If we indent these options it makes
explicit this dependency.

Indent options dependant on --input-raw.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
2018-04-07 08:50:34 +10:00
Tobin C. Harding
6145de836a leaking_addresses: remove command examples
Currently help output includes command examples. These were cute when we
first started development of this script but are unnecessary.

Remove command examples.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
2018-04-07 08:50:34 +10:00
Tobin C. Harding
20cdfb5fc4 leaking_addresses: remove mention of kptr_restrict
leaking_addresses.pl can be run with kptr_restrict==0 now, we don't need
the comment about setting kptr_restrict any more.

Remove comment suggesting setting kptr_restrict.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
2018-04-07 08:50:34 +10:00
Tobin C. Harding
6d23dd9bbb leaking_addresses: fix typo function not called
Currently code uses a check against an undefined variable because the
variable is a sub routine name and is not evaluated.

Evaluate subroutine; add parenthesis to sub routine name.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
2018-04-07 08:50:34 +10:00
Tobin C. Harding
a11949ec20 leaking_addresses: add SigIgn to false positives
Signal masks are false positives, we already check for SigBlk and SigCgt
but we missed SigIgn.

Add SigIgn to false positive check.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
2017-11-14 09:29:27 +11:00
Tobin C. Harding
dd98c252ae leaking_addresses: add timeout on file read
Currently script can stall if we read certain files (like
/proc/kmsg). While we have a mechanism to skip these files once they are
discovered it would be nice to not stall on as yet undiscovered files of
this kind.

Set a timer before each file is parsed, warn user if timer expires.

Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
2017-11-14 09:29:27 +11:00
Tobin C. Harding
62139c1242 leaking_addresses: add support for ppc64
Currently script is targeted at x86_64. We can support other
architectures by using the correct regular expressions for each
architecture.

Add the infrastructure to support multiple architectures. Add support
for ppc64.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
2017-11-14 09:29:27 +11:00
Tobin C. Harding
d09bd8da88 leaking_addresses: add summary reporting options
Currently script just dumps all results found. Potentially, this risks
losing single results among multiple duplicate results. We need some
way of restricting duplicates to assist users of the script. It would
also be nice if we got a report instead of raw results.

Duplicates can be defined in various ways, instead of trying to find a
single perfect solution we can present the user with various options to
display the output. Doing so will typically lead to users wanting to
view the output multiple times. Currently we scan the kernel each time,
this is slow and unnecessary. We can expedite the process by writing the
results to file for subsequent viewing.

Add command line options to enable summary reporting, including options
to write to and read from file.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
2017-11-14 09:29:27 +11:00
Tobin C. Harding
1c1e3be0bf leaking_addresses: add to exclude files/paths list
There are a couple more files that cause the script to stall.

/sys/firmware/devicetree and its symlink /proc/device-tree, reported by
Michael Ellerman.

usbmon should be skipped were ever it appears. Reported by Kees Cook

Add files to be excluded from parsing.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
2017-11-14 09:29:27 +11:00
Tobin C. Harding
a284733e26 leaking_addresses: fix comment string typo
Fix typo in comment string.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
2017-11-14 09:29:27 +11:00
Tobin C. Harding
ecd39dbd27 leaking_addresses: remove command line options
Currently script accepts files to skip. This was added to make running
the script faster (for repeat runs). We can remove this functionality in
preparation for adding sub commands (scan and format) to the script.

Remove command line options.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
2017-11-14 09:29:27 +11:00
Tobin C. Harding
fa31a58202 leaking_addresses: remove dead/unused code
debug_arrays is not called. Also, %seen hash is not used. We should
remove unused code.

Remove dead code.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
2017-11-14 09:29:27 +11:00
Tobin C. Harding
7e5758f7f7 leaking_addresses: use tabs instead of spaces
Current code uses spaces instead of tabs in places.

Use tabs instead of spaces.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
2017-11-14 09:29:27 +11:00
Tobin C. Harding
136fc5c41f scripts: add leaking_addresses.pl
Currently we are leaking addresses from the kernel to user space. This
script is an attempt to find some of those leakages. Script parses
`dmesg` output and /proc and /sys files for hex strings that look like
kernel addresses.

Only works for 64 bit kernels, the reason being that kernel addresses on
64 bit kernels have 'ffff' as the leading bit pattern making greping
possible. On 32 kernels we don't have this luxury.

Scripts is _slightly_ smarter than a straight grep, we check for false
positives (all 0's or all 1's, and vsyscall start/finish addresses).

[ I think there is a lot of room for improvement here, but it's already
  useful, so I'm merging it as-is. The whole "hash %p format" series is
  expected to go into 4.15, but will not fix %x users, and will not
  incentivize people to look at what they are leaking.     - Linus ]

Signed-off-by: Tobin C. Harding <me@tobin.cc>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-06 11:46:42 -08:00