gecko-dev/webtools/web-sniffer/READ-ME.txt

99 lines
3.1 KiB
Plaintext

SniffURI
by Erik van der Poel <erik@vanderpoel.org>
originally created in 1998
http://sniffuri.org/
Introduction
This is a set of tools to work with the protocols underlying the Web.
Description of Tools
view.cgi
This is an HTML form that allows the user to enter a URL. The CGI then
fetches the object associated with that URL, and presents it to the
user in a colorful way. For example, HTTP headers are shown, HTML
documents are parsed and colored, and non-ASCII characters are shown in
hex. Links are turned into live links, that can be clicked to see the
source of that URL, allowing the user to "browse" source.
robot
Originally written to see how many documents actually include the HTTP
and HTML charsets, this tool has developed into a more general robot
that collects various statistics, including HTML tag statistics, DNS
lookup timing, etc. This robot does not adhere to the standard robot
rules, so please exercise caution if you use this.
proxy
This is an HTTP proxy that sits between the user's browser and another
HTTP proxy. It captures all of the HTTP traffic between the browser and
the Internet, and presents it to the user in the same colorful way as
the above-mentioned view.cgi.
grab
Allows the user to "grab" a whole Web site, or everything under a
particular directory. This is useful if you want to grab a bunch of
related HTML files, e.g. the whole CSS2 spec.
link
Allows the user to recursively check for bad links in a Web site or
under a particular directory.
Description of Files
addurl.c, addurl.h: adds URLs to a list
all.h: header that includes all the required headers
buf.c, buf.h: I/O routines
cgiview.c: the view.cgi tool
config*: autoconf-related files
dns.c: experimental DNS toy
doRun: used with robot
file.c, file.h: the file: URL
ftp.c: experimental FTP toy
grab.c: the "grab" tool
hash.c, hash.h: incomplete hash table routines
HISTORY: brief history of this project
html.c, html.h: HTML parser
http.c, http.h: simple HTTP implementation
index.html: used with view.cgi tool
INSTALL: see this file for build instructions
link.c: the "link" tool
main.h: very simple callbacks, could be more object-oriented
Makefile, Makefile.in: to build everything
mime.c, mime.h: MIME Content-Type parser
net.c, net.h: low-level Internet APIs
pop.c: experimental POP toy
proxy.c: the "proxy" tool
robot.c: the "robot" tool
run: used with robot
test: directory used in testing
thread.c, thread.h: for threading in the robot and locks elsewhere
TODO: notes to myself
url.c, url.h, urltest.c: implementation of absolute and relative URLs
utils.c, utils.h: some little utility routines
view.c, view.h: presents stuff to the user
Description of Code
The code is extremely quick-and-dirty. It could be a lot more elegant,
e.g. C++, object-oriented, extensible, etc.
The point of this exercise was not to design and write a program well,
but to create some useful tools and to learn about Internet protocols.