bear%code-bear.com f6d5800f98 Updated info files so they read in the third person
fixed typos and grammer
changed layout to be a bit more readable
Moved release notes into a ReleaseNotes file
2005-10-08 18:57:16 +00:00

297 lines
16 KiB
Plaintext

To install Tinderbox2 you will need some information about your
existing computer systems and some idea about what your goals are.
Here is a list of topics to help get you started, some of these
ideas may not be appropriate for your environment.
Web server configuration
------------------------
The web server is what delivers the Tinderbox2 information to the world.
Web server configuration is a bit of an art and you will need to
understand the policies which are used to administer your web server.
* You will need to decide the directory where Tinderbox2 should write
the static HTML pages. This will depend on how your web server is
configured. The default location is based on the RedHat 7.1
(apache-1.3.19-5) installation and is: /var/www/html/tinderbox2.
You will also need to know what the URL browsers will need to use
to find this directory. Since Tinderbox2 generates static web
pages, it is possible to run Tinderbox2 and not run a web server.
One way this could be done is if you have a network file system
and all users have browsers which can read from the HTML directories.
In this case all URL's should begin with "file:/" instead of the
usual "http://".
* Project level administration is done via cgi scripts. These scripts
allow administrators to set the message of the day, and the state
of the tree (open, closed, restricted). Also all users can post
notices to the web pages via a cgi script.
CGI programs are often restricted to a portion of the file system
which is disjoint from the HTML files. You will need to figure out
where the CGI programs will go. Tinderbox2 takes its defaults from
RedHat 7.1 and uses /var/www/cgi-bin/tinderbox2. You will also need
to know what the URL browsers will need to use to find this directory.
* CGI scripts will run as an un-authenticated user on your system.
You will need to decide which user will run the Tinderbox2 CGI
scripts. The same user id must be used for running the scripts as
for Tinderbox2 mail delivery.
The Tinderbox2 configuration files will define this user id and,
as a security precaution, check that it is running as the required
id. It is suggested that this id not be a privileged id (higher
ids are better, please make this number be greater than 10 and
bigger than 100 is recommended). Smaller ids are often assumed to
have more privileges on a Unix box then larger ids. It is not a
good idea for an un-authenticated user to have any privileges so a
large id is recommended.
It is also recommended that you not use the id 'nobody' as this id
is over used and it would be better to partition the un-authenticated
user into separate ids in case of security problems.
RedHat runs all its CGI scripts as the user 'apache', this is an
acceptable user. I would prefer to have a separate user to run the
Tinderbox2 CGI scripts but this would require recompiling apache to
enable suEXEC, and it is more effort then most groups can afford.
* Tinderbox2 Files. There are other Tinderbox2 files which need to
be placed on the web server. These include libraries and non-cgi
programs. You will need to decide where to place these files.
Most users put them in /home/tinderbox2.
* Tinderbox2 Data. Tinderbox2 stores its data in the file system.
For security it is often a good idea to keep this data out of the
HTML and CGI directories so that malicious users can not directly
access this data. The compressed build logs can grow quite large,
so it is recommended to put the data on a file system with room.
The default is to put them in the directory /home/tinderbox2/data.
Mail
----
Many of the Tinderbox2 modules (Bug Ticket, Build, CVS) receive their
data via mail.
* The mail system on your web server machine must be configured to
deliver the mail into the Tinderbox2 mail processing programs.
You should spend some time understanding how your mail delivery
system can be configured to allow user mail to be delivered into a
program and how to set the user id under which this delivery occurs.
If you do not wish to configure your mail delivery program then you
can use fetchmail to pull the mail out of a mail box and push it into
the programs on a periodic basis. See the install page for details
on different aspects about mailing systems.
Production Version Control
-------------------------
One of the biggest responsibilities which a "build master" has is the
requirement that all code should be reproducible. That is that at
any point in the future, even more than one year later, the current
binaries should be able to be rebuilt byte for byte from sources.
This requirement can be broken down as follows:
* The build machine must be reproducible.
You must be able to get back the same build machine configuration
you had at any point in the past. This means that all OS libraries,
all header files, all compilers, all build tools (make, grep, sed)
must have some mechanism to roll back. It is common to use a backup
of the build machine to reconstruct it. Most OS will give you a
list of the software packages which are installed on the machine and
their version numbers. Maintaining a the list of software packages
which are installed on the machine and check it into version control.
This allows you to compare the state of the build machine at any two
points in time.
It is considered a best practice to limit the amount of software which
is available on the build machine. A build machine with too much
installed will only make it difficult to reproduce older builds should
the need arise. A build machine should not have installed any web
servers or graphical window managers on your build machine. It should
be clear that the build machine should not be the same machine where
the Tinderbox2 server runs.
* The build process must be reproducible. That is all the steps which
are used to create the application must be reproducible.
* Build Interface - You must be able to run exactly the same build
process in the future including all commands with command line
arguments, all environmental variables. I recommend that the entire
build process be viewed as something outside of the build master
control. Developers are responsible for ensuring that there is a
simple build master interface to construct all the software products
which go into a build. Typically there is a Makefile in a standard
place where the build master can run something like "make all; make
install;" and be guaranteed that this will build the product.
The build interface should be viewed as something which never changes
and are part of the build machine, like the OS and are changed only
rarely. It is hard enough to track all the parts of the build process
which are epected to change, let alone trying to track complex build
procedures.
The build procedures should have a standard interface. By keeping the
build instructions in one Makefile which is checked into the same
version control system as the sources; it is easy to recreate any
previous build even if the commands used to build the software
fluctuate rapidly between releases. There must be a simple interface
to construct the software which will hide all the complexity of the
actual construction.
* Build Environment - The Makefile will code all the build commands
and all the environmental variables (PATH, UMASK, LD_LIBRARY_PATH,
CLASSPATH) needed to build the software though it may rely on some
well defined command line arguments (PREFIX, CCFLAGS, JAVA_LIBS) to
make these prematurely. These command line arguments should not
change between versions of the software but should be a fixed set of
build parameters.
The parameters may be needed to specify where some files are found
on the build machine (Ideally the build machine is set up the same
as developers machines so these directories can be hard-coded into
the Makefile but often there is a need for some directories to be
specified at build time) or where files are to be created/installed
on the build machine (typically a subdirectory of /var/tmp but there
may be several builds running at once and each will need a different
directory) or what kind of build is being created.
Each part of the build which needs a particular environmental variable
set or a special header file in some path should have tests which
ensure that the build environment is valid. The build scripts could be
installed on the build machine and started by running:
"/etc/rc.d/init.d/build start". This would ensure that the build script
does not rely on any build environmental variables which are set by
logging into the build account and are thus not tracked and versioned.
* Environmental safety issues. If the build environment can not be used
to build the software then a human readable error message should be
generated. The Makefile can often run various checks on the environmen
variables before they construct the code. They check that all required
environment variables are set, that the required libraries are found,
that directories which must be disjoint (build and install directories)
do not overlap.
This test suite becomes a build regression test and as additional possible
build problems are discovered, new tests can be added to the Makefile.
It is a good habit to explicit set all environment variables so that there
is no doubt as to their expected values. It is important for the QA group
to only use Builds which were created by an automated process so that you
are sure that there are no undocumented steps in either the test builds or
the released build.
* Track the Build numbers. Given a clean install of your product you
should have all the information necessary to reproduce the executable
from sources. If a customer shows you the application binaries you
must be able to get the source code which build the application,
reconstruct the build machine which created the application and
possibly rerun the build exactly the same way as the application was
created before, this may include making some minor source code changes
before the build is run.
One method is to keep a file which contains:
- The product release name
- The sources 'as of date'. (Always checkout sources using
a known datetime or revision # (cvs -D or svn -R) so that
exactly the same sources can be recovered knowing only the
'date/time' which was used to check them out.
- The branch name
- The module name.
This can be stored as a file in the product (encrypted if necessary)
or may be stored in some secure build master database where the data
can be looked up by release name. A good practice is to keep all data
necessary to reproduce a build in the build output and delivered as
part of the product. This means that you can generate as many builds as
needed automatically and not need to keep track of any of them. When
the QA team deems that a certain build is 'important', by making a
particular build the official released copy then you can take a look at
its contents and tag/branch the code at the sources which were used to
build it.
* Build Prefix: It is a good idea to familiarize yourself with the
Makefile conventions regarding the make variable PREFIX. It is
easiest to understand if you think about what RedHat does when they
build their distribution of RPM's but this will apply in many
different systems including the Andrew File System (AFS) and most
packaging systems.
This variable is used during the build process
"make all PREFIX=/home/apache" to tell the package where it will be
installed (examples include /usr, /usr/local, /home/apache). Reading
a few RedHat Spec files to see how this works in practice. The
application may need to hard-code this value into its object code.
When the application is installed it must not be installed into its
proper place on the build machine. The package that is being
constructed could cause the build machine to stop working
correctly if it is a buggy version of a system library or major OS
application. Instead the Makefile will install "make install
PREFIX=/var/tmp/build-root/home/apache" the package into some other
directory with a similar tree structure to its final destination.
The packaging system will then move the files into the correct place
during an installation step on the target machine. The installation
step only moves files and sets permissions. The Makefile is not
supposed to use the installation directories to hard code values into
the application since the application will never be run from this
installation directory.
The hard part of the build, including any PREFIX magic, is in the
build section. Notice the clear separation between build/target
machine, installation on the build machine, installation on the target
machine, construction of the application binaries and installation of
the application binaries.
This is one of the reasons why building an application on a build
machine is different from the way in which developers build their code
on their personal development machines. This PREFIX issue will arise
when you try and build the Tinderbox2 system and also when you
construct the Makefiles for your own application. Since the build
machine is not the target machine it can not be assumed that files
will always be in the same places on both (for example Perl.)
* Application Architecture. The build process should mimic the
architecture of the code. It should be a final test that the code
was coded to the same specifications that it was designed. It is a
common problem for code to turn into spaghetti with each piece of
code using functions and creating dependencies on every other piece
of code.
For example it is probably a mistake for code in the database
abstraction layer to be implemented in terms of code in the HTML
generation layer. These two libraries should probably be independent
of each other, though they both might depend on a common string library.
The code architecture should limit the dependency graph between code
modules. The Build Master must enforce the restrictions on information
flow between components. Thus no libraries should be in the path unless
the architecture allows this module to depend on those libraries.
The architecture must not have circular dependencies. Circular
dependencies not only make upgrading individual libraries difficult
but also make testing components nearly impossible. That is it should
be possible to build some set of libraries L0 which depend on no
libraries and then build some other set of libraries L1 which depend
only on L0 libraries then build L2 which depend only on the L0 and L1
libraries. This "build chain" will prevent circular dependencies and
help keep your code testable and the dependencies understandable.
More information about why this is a good practice is available in
"Large-Scale C++ Software Design" (Addison-Wesley Professional
Computing Series) by John Lakos
You should enforce the convention that developers are not allowed to
overload standard system libraries. Always put standard libraries
in the path before any library our company develops. Build the
application in stages to ensure that parts of the application which
are not intended to depend on other code will not have other header
files on the build machine at the time that they are constructed.
Build dependencies between modules which are expected are explicitly
controlled with build scripts and version numbers.