linux/fs/reiserfs
Greg Kroah-Hartman b24413180f License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02 11:10:55 +01:00
..
2017-04-19 14:21:23 +02:00

[LICENSING]

ReiserFS is hereby licensed under the GNU General
Public License version 2.

Source code files that contain the phrase "licensing governed by
reiserfs/README" are "governed files" throughout this file.  Governed
files are licensed under the GPL.  The portions of them owned by Hans
Reiser, or authorized to be licensed by him, have been in the past,
and likely will be in the future, licensed to other parties under
other licenses.  If you add your code to governed files, and don't
want it to be owned by Hans Reiser, put your copyright label on that
code so the poor blight and his customers can keep things straight.
All portions of governed files not labeled otherwise are owned by Hans
Reiser, and by adding your code to it, widely distributing it to
others or sending us a patch, and leaving the sentence in stating that
licensing is governed by the statement in this file, you accept this.
It will be a kindness if you identify whether Hans Reiser is allowed
to license code labeled as owned by you on your behalf other than
under the GPL, because he wants to know if it is okay to do so and put
a check in the mail to you (for non-trivial improvements) when he
makes his next sale.  He makes no guarantees as to the amount if any,
though he feels motivated to motivate contributors, and you can surely
discuss this with him before or after contributing.  You have the
right to decline to allow him to license your code contribution other
than under the GPL.

Further licensing options are available for commercial and/or other
interests directly from Hans Reiser: hans@reiser.to.  If you interpret
the GPL as not allowing those additional licensing options, you read
it wrongly, and Richard Stallman agrees with me, when carefully read
you can see that those restrictions on additional terms do not apply
to the owner of the copyright, and my interpretation of this shall
govern for this license.

Finally, nothing in this license shall be interpreted to allow you to
fail to fairly credit me, or to remove my credits, without my
permission, unless you are an end user not redistributing to others.
If you have doubts about how to properly do that, or about what is
fair, ask.  (Last I spoke with him Richard was contemplating how best
to address the fair crediting issue in the next GPL version.)

[END LICENSING]

Reiserfs is a file system based on balanced tree algorithms, which is
described at https://reiser4.wiki.kernel.org/index.php/Main_Page 

Stop reading here.  Go there, then return.

Send bug reports to yura@namesys.botik.ru.

mkreiserfs and other utilities are in reiserfs/utils, or wherever your
Linux provider put them.  There is some disagreement about how useful
it is for users to get their fsck and mkreiserfs out of sync with the
version of reiserfs that is in their kernel, with many important
distributors wanting them out of sync.:-) Please try to remember to
recompile and reinstall fsck and mkreiserfs with every update of
reiserfs, this is a common source of confusion.  Note that some of the
utilities cannot be compiled without accessing the balancing code
which is in the kernel code, and relocating the utilities may require
you to specify where that code can be found.

Yes, if you update your reiserfs kernel module you do have to
recompile your kernel, most of the time.  The errors you get will be
quite cryptic if your forget to do so.

Real users, as opposed to folks who want to hack and then understand
what went wrong, will want REISERFS_CHECK off.

Hideous Commercial Pitch: Spread your development costs across other OS
vendors.  Select from the best in the world, not the best in your
building, by buying from third party OS component suppliers.  Leverage
the software component development power of the internet.  Be the most
aggressive in taking advantage of the commercial possibilities of
decentralized internet development, and add value through your branded
integration that you sell as an operating system.  Let your competitors
be the ones to compete against the entire internet by themselves.  Be
hip, get with the new economic trend, before your competitors do.  Send
email to hans@reiser.to.

To understand the code, after reading the website, start reading the
code by reading reiserfs_fs.h first.

Hans Reiser was the project initiator, primary architect, source of all
funding for the first 5.5 years, and one of the programmers.  He owns
the copyright.

Vladimir Saveljev was one of the programmers, and he worked long hours
writing the cleanest code.  He always made the effort to be the best he
could be, and to make his code the best that it could be.  What resulted
was quite remarkable. I don't think that money can ever motivate someone
to work the way he did, he is one of the most selfless men I know.

Yura helps with benchmarking, coding hashes, and block pre-allocation
code.

Anatoly Pinchuk is a former member of our team who worked closely with
Vladimir throughout the project's development.  He wrote a quite
substantial portion of the total code.  He realized that there was a
space problem with packing tails of files for files larger than a node
that start on a node aligned boundary (there are reasons to want to node
align files), and he invented and implemented indirect items and
unformatted nodes as the solution.

Konstantin Shvachko, with the help of the Russian version of a VC,
tried to put me in a position where I was forced into giving control
of the project to him.  (Fortunately, as the person paying the money
for all salaries from my dayjob I owned all copyrights, and you can't
really force takeovers of sole proprietorships.)  This was something
curious, because he never really understood the value of our project,
why we should do what we do, or why innovation was possible in
general, but he was sure that he ought to be controlling it.  Every
innovation had to be forced past him while he was with us.  He added
two years to the time required to complete reiserfs, and was a net
loss for me.  Mikhail Gilula was a brilliant innovator who also left
in a destructive way that erased the value of his contributions, and
that he was shown much generosity just makes it more painful.

Grigory Zaigralin was an extremely effective system administrator for
our group.

Igor Krasheninnikov was wonderful at hardware procurement, repair, and
network installation.

Jeremy Fitzhardinge wrote the teahash.c code, and he gives credit to a
textbook he got the algorithm from in the code.  Note that his analysis
of how we could use the hashing code in making 32 bit NFS cookies work
was probably more important than the actual algorithm.  Colin Plumb also
contributed to it.

Chris Mason dived right into our code, and in just a few months produced
the journaling code that dramatically increased the value of ReiserFS.
He is just an amazing programmer.

Igor Zagorovsky is writing much of the new item handler and extent code
for our next major release.

Alexander Zarochentcev (sometimes known as zam, or sasha), wrote the
resizer, and is hard at work on implementing allocate on flush.  SGI
implemented allocate on flush before us for XFS, and generously took
the time to convince me we should do it also.  They are great people,
and a great company.

Yuri Shevchuk and Nikita Danilov are doing squid cache optimization.

Vitaly Fertman is doing fsck.

Jeff Mahoney, of SuSE, contributed a few cleanup fixes, most notably
the endian safe patches which allow ReiserFS to run on any platform
supported by the Linux kernel.

SuSE, IntegratedLinux.com, Ecila, MP3.com, bigstorage.com, and the
Alpha PC Company made it possible for me to not have a day job
anymore, and to dramatically increase our staffing.  Ecila funded
hypertext feature development, MP3.com funded journaling, SuSE funded
core development, IntegratedLinux.com funded squid web cache
appliances, bigstorage.com funded HSM, and the alpha PC company funded
the alpha port.  Many of these tasks were helped by sponsors other
than the ones just named.  SuSE has helped in much more than just
funding....