diff --git a/ChangeLog b/ChangeLog
index 75d7082b..62f609f6 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+Wed Oct 24 14:34:25 CEST 2001 Daniel Veillard
+
+ * doc/site.xsl doc/*.html doc/Makefile.am: now autogenerate
+ the web site from the main HTML document.
+
Tue Oct 23 14:32:04 CEST 2001 Daniel Veillard
* parser.c: fixed an erroneous validation bug when PE refs
diff --git a/doc/DOM.html b/doc/DOM.html
new file mode 100644
index 00000000..b4269174
--- /dev/null
+++ b/doc/DOM.html
@@ -0,0 +1,71 @@
+
+
+
+
+
+DOM Principles
+
+
+
+
+
+The XML C library for Gnome
+DOM Principles
+
+
+
+DOM stands for the Document
+Object Model; this is an API for accessing XML or HTML structured
+documents. Native support for DOM in Gnome is on the way (module gnome-dom),
+and will be based on gnome-xml. This will be a far cleaner interface to
+manipulate XML files within Gnome since it won't expose the internal
+structure.
+The current DOM implementation on top of libxml is the gdome2 Gnome module. It
+is a full DOM interface, thanks to Paolo Casarini; check the Gdome2 homepage for more
+information.
+Daniel Veillard
+
+
diff --git a/doc/Makefile.am b/doc/Makefile.am
index b995b0ee..799a2a76 100644
--- a/doc/Makefile.am
+++ b/doc/Makefile.am
@@ -12,11 +12,18 @@ DOC_SOURCE_DIR=..
HTML_DIR=@HTML_DIR@
TARGET_DIR=$(HTML_DIR)/$(DOC_MODULE)/html
+PAGES= architecture.html bugs.html contribs.html docs.html DOM.html \
+ downloads.html entities.html example.html help.html index.html \
+ interface.html intro.html library.html namespaces.html news.html \
+ tree.html valid.html XML.html XSLT.html
man_MANS = xmlcatalog.1
-# htmldir = $(prefix)/html
-# html_DATA = gnome-dev-info.html
+all: $(PAGES)
+
+$(PAGES): xml.html site.xsl
+ @(if [ -x /usr/bin/xsltproc ] ; then \
+ /usr/bin/xsltproc --html site.xsl xml.html > index.html ; fi );
scan:
gtkdoc-scan --module=libxml --source-dir=$(DOC_SOURCE_DIR) --ignore-headers="acconfig.h config.h xmlwin32version.h win32config.h trio.h strio.h triop.h"
@@ -52,6 +59,6 @@ install-data-local:
-(cd $(DESTDIR); gtkdoc-fixxref --module=libxml --html-dir=$(HTML_DIR))
dist-hook:
- (cd $(srcdir) ; tar cvf - *.1 *.html *.gif html/*.html html/*.sgml) | (cd $(distdir); tar xf -)
+ (cd $(srcdir) ; tar cvf - *.1 site.xsl *.html *.gif html/*.html html/*.sgml) | (cd $(distdir); tar xf -)
.PHONY : html sgml templates scan
diff --git a/doc/XML.html b/doc/XML.html
new file mode 100644
index 00000000..97fc381b
--- /dev/null
+++ b/doc/XML.html
@@ -0,0 +1,90 @@
+
+
+
+
+
+XML
+
+
+
+
+
+The XML C library for Gnome
+XML
+
+
+
+XML is a standard for
+markup-based structured documents. Here is an example XML
+document:
+<?xml version="1.0"?>
+<EXAMPLE prop1="gnome is great" prop2="&amp; linux too">
+ <head>
+ <title>Welcome to Gnome</title>
+ </head>
+ <chapter>
+ <title>The Linux adventure</title>
+ <p>bla bla bla ...</p>
+ <image href="linus.gif"/>
+ <p>...</p>
+ </chapter>
+</EXAMPLE>
+The first line specifies that it's an XML document and gives useful
+information about its encoding. Then the document is a text format whose
+structure is specified by tags between brackets. Each tag opened has
+to be closed. XML is pedantic about this. However, if a tag is empty
+(no content), a single tag can serve as both the opening and closing tag if
+it ends with /> rather than with > . Note
+that, for example, the image tag has no content (just an attribute) and is
+closed by ending the tag with /> .
+XML can be applied successfully to a wide range of uses, from long-term
+structured document maintenance (where it follows the steps of SGML) to
+simple data encoding mechanisms like configuration file formatting (glade),
+spreadsheets (gnumeric), or even shorter lived documents such as WebDAV where
+it is used to encode remote calls between a client and a server.
+Daniel Veillard
+
+
diff --git a/doc/XSLT.html b/doc/XSLT.html
new file mode 100644
index 00000000..294384aa
--- /dev/null
+++ b/doc/XSLT.html
@@ -0,0 +1,72 @@
+
+
+
+
+
+XSLT
+
+
+
+
+
+The XML C library for Gnome
+XSLT
+
+
+ Check the separate libxslt page
+
+
+XSL Transformations (XSLT) is a
+language for transforming XML documents into other XML documents (or
+HTML/textual output).
+A separate library called libxslt is being built on top of libxml2. This
+module "libxslt" can be found in the Gnome CVS base too.
+You can check the features
+supported and the progress in the Changelog.
+
+Daniel Veillard
+
+
diff --git a/doc/architecture.html b/doc/architecture.html
new file mode 100644
index 00000000..0a31a4ec
--- /dev/null
+++ b/doc/architecture.html
@@ -0,0 +1,80 @@
+
+
+
+
+
+An overview of libxml architecture
+
+
+
+
+
+The XML C library for Gnome
+An overview of libxml architecture
+
+
+ Libxml is made of multiple components; some of them are optional, and most
+of the block interfaces are public. The main components are:
+
+- an Input/Output layer
+- FTP and HTTP client layers (optional)
+- an Internationalization layer managing the encodings support
+- a URI module
+- the XML parser and its basic SAX interface
+- an HTML parser using the same SAX interface (optional)
+- a SAX tree module to build an in-memory DOM representation
+- a tree module to manipulate the DOM representation
+- a validation module using the DOM representation (optional)
+- an XPath module for global lookup in a DOM representation
+ (optional)
+- a debug module (optional)
+
+Graphically this gives the following:
+
+
+ Daniel Veillard
+
+
diff --git a/doc/bugs.html b/doc/bugs.html
new file mode 100644
index 00000000..ca7fe707
--- /dev/null
+++ b/doc/bugs.html
@@ -0,0 +1,101 @@
+
+
+
+
+
+Reporting bugs and getting help
+
+
+
+
+
+The XML C library for Gnome
+Reporting bugs and getting help
+
+
+ Well, bugs or missing features are always possible, and I will make a
+point of fixing them in a timely fashion. The best way to report a bug is to
+use the Gnome
+bug tracking database (make sure to use the "libxml" module name). I look
+at reports there regularly and it's good to have a reminder when a bug is
+still open. Check the instructions on
+reporting bugs and be sure to specify that the bug is for the package
+libxml.
+There is also a mailing list xml@gnome.org for libxml, with an on-line archive (old). To subscribe to this list,
+please visit the associated Web page and
+follow the instructions. Do not send code; I won't debug it
+(but patches are really appreciated!).
+Check the following before
+posting:
+
+- read the FAQ
+
+- make sure you are using a recent
+ version, and that the problem still shows up in it
+- check the list
+ archives to see if the problem was already reported; in that case
+ there is probably a fix available. Similarly, check the registered
+ open bugs
+
+- make sure you can reproduce the bug with xmllint or one of the test
+ programs found in the source distribution
+- please send the command showing the error as well as the input (as an
+ attachment)
+
+Then send the bug report, with the information needed to reproduce it, to the xml@gnome.org list; if it's really libxml
+related I will approve it. Please do not send me mail directly: it makes
+things much harder to track, and in some cases I'm not the best person to
+answer a given question. Ask the list instead.
+Of course, bugs reported with a suggested patch for fixing them will
+probably be processed faster.
+If you're looking for help, a quick look at the list archive may actually
+provide the answer; I usually send source samples when answering libxml usage
+questions. The auto-generated
+documentation is not as polished as I would like (I need to learn more
+about Docbook), but it's a good starting point.
+Daniel Veillard
+
+
diff --git a/doc/contribs.html b/doc/contribs.html
new file mode 100644
index 00000000..bace2108
--- /dev/null
+++ b/doc/contribs.html
@@ -0,0 +1,107 @@
+
+
+
+
+
+Contributions
+
+
+
+
+
+The XML C library for Gnome
+Contributions
+
+
+
+
diff --git a/doc/docs.html b/doc/docs.html
new file mode 100644
index 00000000..085cfacd
--- /dev/null
+++ b/doc/docs.html
@@ -0,0 +1,88 @@
+
+
+
+
+
+Documentation
+
+
+
+
+
+The XML C library for Gnome
+Documentation
+
+
+
+
diff --git a/doc/downloads.html b/doc/downloads.html
new file mode 100644
index 00000000..98e8d8c7
--- /dev/null
+++ b/doc/downloads.html
@@ -0,0 +1,88 @@
+
+
+
+
+
+Downloads
+
+
+
+
+
+The XML C library for Gnome
+Downloads
+
+
+
+
diff --git a/doc/entities.html b/doc/entities.html
new file mode 100644
index 00000000..f5ee99d6
--- /dev/null
+++ b/doc/entities.html
@@ -0,0 +1,127 @@
+
+
+
+
+
+Entities or no entities
+
+
+
+
+
+The XML C library for Gnome
+Entities or no entities
+
+
+ Entities in principle are similar to simple C macros. An entity defines an
+abbreviation for a given string that you can reuse many times throughout the
+content of your document. Entities are especially useful when a given string
+may occur frequently within a document, or to confine the change needed to a
+document to a restricted area in the internal subset of the document (at the
+beginning). Example:
+1 <?xml version="1.0"?>
+2 <!DOCTYPE EXAMPLE SYSTEM "example.dtd" [
+3 <!ENTITY xml "Extensible Markup Language">
+4 ]>
+5 <EXAMPLE>
+6 &xml;
+7 </EXAMPLE>
+Line 3 declares the xml entity. Line 6 uses the xml entity, by prefixing
+its name with '&' and following it by ';' without any spaces added. There
+are 5 predefined entities in libxml allowing you to escape characters with
+predefined meaning in some parts of the xml document content:
+&lt; for the character '<', &gt;
+for the character '>', &apos; for the character ''',
+&quot; for the character '"', and
+&amp; for the character '&'.
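As a rough illustration of how the five predefined entities (&lt;, &gt;, &apos;, &quot; and &amp;) stand in for their characters, here is a minimal sketch of an escaping helper in plain C. It does not call libxml; the name demo_escape_predefined is hypothetical and only shows the kind of substitution that has to happen when text is written out as XML.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not a libxml function): returns a newly allocated
 * copy of `in` in which the five characters with a predefined entity are
 * replaced by the matching entity reference.
 * The caller must free() the result. Returns NULL if allocation fails.
 */
char *demo_escape_predefined(const char *in) {
    size_t len = 0;
    const char *p;
    char *out, *q;

    /* First pass: compute the size of the escaped string. */
    for (p = in; *p != '\0'; p++) {
        switch (*p) {
            case '<': case '>':  len += 4; break;  /* &lt; &gt; */
            case '&':            len += 5; break;  /* &amp; */
            case '\'': case '"': len += 6; break;  /* &apos; &quot; */
            default:             len += 1; break;
        }
    }
    out = malloc(len + 1);
    if (out == NULL)
        return NULL;

    /* Second pass: copy the input, substituting entity references. */
    q = out;
    for (p = in; *p != '\0'; p++) {
        const char *rep = NULL;
        switch (*p) {
            case '<':  rep = "&lt;";   break;
            case '>':  rep = "&gt;";   break;
            case '&':  rep = "&amp;";  break;
            case '\'': rep = "&apos;"; break;
            case '"':  rep = "&quot;"; break;
            default:   break;
        }
        if (rep != NULL) {
            size_t n = strlen(rep);
            memcpy(q, rep, n);
            q += n;
        } else {
            *q++ = *p;
        }
    }
    *q = '\0';
    return out;
}
```

The two-pass layout is only one possible design; it avoids reallocating while copying, at the cost of scanning the input twice.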
+One of the problems related to entities is that you may want the parser to
+substitute an entity's content so that you can see the replacement text in
+your application. Or you may prefer to keep entity references as such in the
+content to be able to save the document back without losing this usually
+precious information (if the user went through the pain of explicitly
+defining entities, he may have a rather negative attitude if you blindly
+substitute them at save time). The xmlSubstituteEntitiesDefault()
+function allows you to check and change the behaviour, which is to not
+substitute entities by default.
+Here is the DOM tree built by libxml for the previous document in the
+default case:
+/gnome/src/gnome-xml -> ./xmllint --debug test/ent1
+DOCUMENT
+version=1.0
+ ELEMENT EXAMPLE
+ TEXT
+ content=
+ ENTITY_REF
+ INTERNAL_GENERAL_ENTITY xml
+ content=Extensible Markup Language
+ TEXT
+ content=
+And here is the result when substituting entities:
+/gnome/src/gnome-xml -> ./xmllint --debug --noent test/ent1
+DOCUMENT
+version=1.0
+ ELEMENT EXAMPLE
+ TEXT
+ content= Extensible Markup Language
+So, entities or no entities? Basically, it depends on your use case. I
+suggest that you keep the non-substituting default behaviour and avoid using
+entities in your XML document or data if you are not willing to handle the
+entity reference elements in the DOM tree.
+Note that at save time libxml enforces the conversion of the predefined
+entities where necessary to prevent well-formedness problems, and will also
+transparently replace those with chars (i.e. it will not generate entity
+reference elements in the DOM tree or call the reference() SAX callback when
+finding them in the input).
+
+WARNING: handling entities
+on top of the libxml SAX interface is difficult!!! If you plan to use
+non-predefined entities in your documents, then the learning curve to handle
+them using the SAX API may be long. If you plan to use complex documents, I
+strongly suggest you consider using the DOM interface instead and let libxml
+deal with the complexity rather than trying to do it yourself.
+Daniel Veillard
+
+
diff --git a/doc/example.html b/doc/example.html
new file mode 100644
index 00000000..db361109
--- /dev/null
+++ b/doc/example.html
@@ -0,0 +1,249 @@
+
+
+
+
+
+A real example
+
+
+
+
+
+The XML C library for Gnome
+A real example
+
+
+ Here is a real size example, where the actual content of the application
+data is not kept in the DOM tree but uses internal structures. It is based on
+a proposal to keep a database of jobs related to Gnome, with an XML based
+storage structure. Here is an XML encoded jobs
+base:
+<?xml version="1.0"?>
+<gjob:Helping xmlns:gjob="http://www.gnome.org/some-location">
+ <gjob:Jobs>
+
+ <gjob:Job>
+ <gjob:Project ID="3"/>
+ <gjob:Application>GBackup</gjob:Application>
+ <gjob:Category>Development</gjob:Category>
+
+ <gjob:Update>
+ <gjob:Status>Open</gjob:Status>
+ <gjob:Modified>Mon, 07 Jun 1999 20:27:45 -0400 MET DST</gjob:Modified>
+ <gjob:Salary>USD 0.00</gjob:Salary>
+ </gjob:Update>
+
+ <gjob:Developers>
+ <gjob:Developer>
+ </gjob:Developer>
+ </gjob:Developers>
+
+ <gjob:Contact>
+ <gjob:Person>Nathan Clemons</gjob:Person>
+ <gjob:Email>nathan@windsofstorm.net</gjob:Email>
+ <gjob:Company>
+ </gjob:Company>
+ <gjob:Organisation>
+ </gjob:Organisation>
+ <gjob:Webpage>
+ </gjob:Webpage>
+ <gjob:Snailmail>
+ </gjob:Snailmail>
+ <gjob:Phone>
+ </gjob:Phone>
+ </gjob:Contact>
+
+ <gjob:Requirements>
+ The program should be released as free software, under the GPL.
+ </gjob:Requirements>
+
+ <gjob:Skills>
+ </gjob:Skills>
+
+ <gjob:Details>
+ A GNOME based system that will allow a superuser to configure
+ compressed and uncompressed files and/or file systems to be backed
+ up with a supported media in the system. This should be able to
+ perform via find commands generating a list of files that are passed
+ to tar, dd, cpio, cp, gzip, etc., to be directed to the tape machine
+ or via operations performed on the filesystem itself. Email
+ notification and GUI status display very important.
+ </gjob:Details>
+
+ </gjob:Job>
+
+ </gjob:Jobs>
+</gjob:Helping>
+While loading the XML file into an internal DOM tree is a matter of
+calling only a couple of functions, browsing the tree to gather the data and
+generate the internal structures is harder, and more error prone.
+The suggested principle is to be tolerant with respect to the input
+structure. For example, the ordering of the attributes is not significant,
+the XML specification is clear about it. It's also usually a good idea not to
+depend on the order of the children of a given node, unless it really makes
+things harder. Here is some code to parse the information for a person:
+/*
+ * A person record
+ */
+typedef struct person {
+ char *name;
+ char *email;
+ char *company;
+ char *organisation;
+ char *smail;
+ char *webPage;
+ char *phone;
+} person, *personPtr;
+
+/*
+ * And the code needed to parse it
+ */
+personPtr parsePerson(xmlDocPtr doc, xmlNsPtr ns, xmlNodePtr cur) {
+ personPtr ret = NULL;
+
+DEBUG("parsePerson\n");
+ /*
+ * allocate the struct
+ */
+ ret = (personPtr) malloc(sizeof(person));
+ if (ret == NULL) {
+ fprintf(stderr,"out of memory\n");
+ return(NULL);
+ }
+ memset(ret, 0, sizeof(person));
+
+ /* We don't care what the top level element name is */
+ cur = cur->xmlChildrenNode;
+ while (cur != NULL) {
+ if ((!strcmp(cur->name, "Person")) && (cur->ns == ns))
+ ret->name = xmlNodeListGetString(doc, cur->xmlChildrenNode, 1);
+ if ((!strcmp(cur->name, "Email")) && (cur->ns == ns))
+ ret->email = xmlNodeListGetString(doc, cur->xmlChildrenNode, 1);
+ cur = cur->next;
+ }
+
+ return(ret);
+}
+Here are a couple of things to notice:
+
+- Usually a recursive parsing style is the most convenient one: XML data
+ is by nature subject to repetitive constructs and usually exhibits highly
+ structured patterns.
+- Note the two arguments of type xmlDocPtr and xmlNsPtr,
+ i.e. the pointer to the global XML document and the namespace reserved to
+ the application. Document-wide information is needed, for example, to
+ decode entities, and it's good coding practice to define a namespace for
+ your application's set of data and test that the elements and attributes
+ you're analyzing actually pertain to your application space. This is
+ done by a simple equality test (cur->ns == ns).
+- To retrieve text and attribute values, you can use the function
+ xmlNodeListGetString to gather all the text and entity reference
+ nodes generated by the DOM output and produce a single text string.
+
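The tolerant, order-independent style recommended above can be sketched without libxml. The demo_node and demo_person types below are hypothetical stand-ins for the real tree nodes and the person record; the only point is that children are matched by name, in whatever order they appear, and unknown children are skipped rather than treated as errors.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified node: a name, its text content, a sibling link. */
typedef struct demo_node {
    const char *name;
    const char *text;
    struct demo_node *next;
} demo_node;

/* Hypothetical, reduced person record (pointers borrowed from the nodes). */
typedef struct demo_person {
    const char *name;
    const char *email;
} demo_person;

/*
 * Fill `out` from a list of sibling nodes, matching children by name only.
 * The order of the children does not matter, and children with names we
 * do not know are tolerated and ignored.
 */
void demo_parse_person(const demo_node *cur, demo_person *out) {
    out->name = NULL;
    out->email = NULL;
    for (; cur != NULL; cur = cur->next) {
        if (strcmp(cur->name, "Person") == 0)
            out->name = cur->text;
        else if (strcmp(cur->name, "Email") == 0)
            out->email = cur->text;
    }
}
```

A field that is absent from the input simply stays NULL, so the caller decides whether a missing value is an error.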
+Here is another piece of code used to parse another level of the
+structure:
+#include <libxml/tree.h>
+/*
+ * a Description for a Job
+ */
+typedef struct job {
+ char *projectID;
+ char *application;
+ char *category;
+ personPtr contact;
+ int nbDevelopers;
+ personPtr developers[100]; /* using dynamic alloc is left as an exercise */
+} job, *jobPtr;
+
+/*
+ * And the code needed to parse it
+ */
+jobPtr parseJob(xmlDocPtr doc, xmlNsPtr ns, xmlNodePtr cur) {
+ jobPtr ret = NULL;
+
+DEBUG("parseJob\n");
+ /*
+ * allocate the struct
+ */
+ ret = (jobPtr) malloc(sizeof(job));
+ if (ret == NULL) {
+ fprintf(stderr,"out of memory\n");
+ return(NULL);
+ }
+ memset(ret, 0, sizeof(job));
+
+ /* We don't care what the top level element name is */
+ cur = cur->xmlChildrenNode;
+ while (cur != NULL) {
+
+ if ((!strcmp(cur->name, "Project")) && (cur->ns == ns)) {
+ ret->projectID = xmlGetProp(cur, "ID");
+ if (ret->projectID == NULL) {
+ fprintf(stderr, "Project has no ID\n");
+ }
+ }
+ if ((!strcmp(cur->name, "Application")) && (cur->ns == ns))
+ ret->application = xmlNodeListGetString(doc, cur->xmlChildrenNode, 1);
+ if ((!strcmp(cur->name, "Category")) && (cur->ns == ns))
+ ret->category = xmlNodeListGetString(doc, cur->xmlChildrenNode, 1);
+ if ((!strcmp(cur->name, "Contact")) && (cur->ns == ns))
+ ret->contact = parsePerson(doc, ns, cur);
+ cur = cur->next;
+ }
+
+ return(ret);
+}
+Once you are used to it, writing this kind of code is quite simple, but
+boring. Ultimately, it could be possible to write stubbers taking either C
+data structure definitions, a set of XML examples or an XML DTD and producing
+the code needed to import and export the content between C data and XML
+storage. This is left as an exercise to the reader :-)
+Feel free to use the code for the full C
+parsing example as a template; it is also available with a Makefile in the
+Gnome CVS base under gnome-xml/example.
+Daniel Veillard
+
+
diff --git a/doc/help.html b/doc/help.html
new file mode 100644
index 00000000..61b98696
--- /dev/null
+++ b/doc/help.html
@@ -0,0 +1,78 @@
+
+
+
+
+
+How to help
+
+
+
+
+
+The XML C library for Gnome
+How to help
+
+
+ You can help the project in various ways. The best thing to do first is to
+subscribe to the mailing list as explained before, and check the archives and the Gnome bug
+database. Then:
+
+- provide patches when you find problems
+- provide the diffs when you port libxml to a new platform. They may not
+ be integrated in all cases but help pinpoint portability problems
+- provide documentation fixes (either as patches to the code comments or
+ as HTML diffs)
+- provide new documentation pieces (translations, examples, etc.)
+- check the TODO file and try to close one of the items
+- take one of the points raised in the archive or the bug database and
+ provide a fix. Get in touch with me
+ beforehand to avoid synchronization problems and check that the suggested
+ fix will fit in nicely :-)
+
+Daniel Veillard
+
+
diff --git a/doc/index.html b/doc/index.html
new file mode 100644
index 00000000..c4c230a2
--- /dev/null
+++ b/doc/index.html
@@ -0,0 +1,95 @@
+
+
+
+
+
+The XML C library for Gnome
+
+
+
+
+
+The XML C library for Gnome
+libxml
+
+
+
+
diff --git a/doc/interface.html b/doc/interface.html
new file mode 100644
index 00000000..6d1a0e66
--- /dev/null
+++ b/doc/interface.html
@@ -0,0 +1,115 @@
+
+
+
+
+
+The SAX interface
+
+
+
+
+
+The XML C library for Gnome
+The SAX interface
+
+
+ Sometimes the DOM tree output is just too large to fit reasonably into
+memory. In that case (and if you don't expect to save back the XML document
+loaded using libxml), it's better to use the SAX interface of libxml. SAX is
+a callback-based interface to the parser. Before parsing,
+the application layer registers a customized set of callbacks which are
+called by the library as it progresses through the XML input.
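To make the callback idea concrete, here is a small self-contained sketch in plain C. It is not the libxml SAX API: demo_handler, demo_drive and the counting callbacks are hypothetical names, and the driver is a toy that only understands bare tags and text. It shows the shape of the interaction, with the application registering callbacks and the driver calling them as it walks the input.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback table, loosely modelled on the SAX idea. */
typedef struct demo_handler {
    void (*start_element)(void *user, const char *name, size_t len);
    void (*end_element)(void *user, const char *name, size_t len);
    void (*characters)(void *user, const char *text, size_t len);
} demo_handler;

/*
 * Toy driver: walks a string made of simple tags (<a>, </a>) and text,
 * calling the registered callbacks as it goes. It handles no attributes,
 * comments, entities or errors; it is not an XML parser.
 */
void demo_drive(const demo_handler *h, void *user, const char *in) {
    while (*in != '\0') {
        if (*in == '<') {
            const char *end = strchr(in, '>');
            if (end == NULL)
                return;  /* unterminated tag: stop */
            if (in[1] == '/') {
                if (h->end_element != NULL)
                    h->end_element(user, in + 2, (size_t)(end - (in + 2)));
            } else if (h->start_element != NULL) {
                h->start_element(user, in + 1, (size_t)(end - (in + 1)));
            }
            in = end + 1;
        } else {
            size_t n = strcspn(in, "<");  /* text runs up to the next tag */
            if (h->characters != NULL)
                h->characters(user, in, n);
            in += n;
        }
    }
}

/* Example application state: count elements and bytes of text. */
typedef struct demo_counts {
    int elements;
    size_t text_bytes;
} demo_counts;

void demo_count_start(void *user, const char *name, size_t len) {
    (void)name;
    (void)len;
    ((demo_counts *)user)->elements++;
}

void demo_count_chars(void *user, const char *text, size_t len) {
    (void)text;
    ((demo_counts *)user)->text_bytes += len;
}
```

Nothing is kept in memory beyond what the callbacks choose to store in their user data, which is the property that makes this style attractive for very large inputs.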
+To get more detailed step-by-step guidance on using the SAX interface of
+libxml, see the nice
+documentation written by James
+Henstridge.
+You can debug the SAX behaviour by using the testSAX
+program located in the gnome-xml module (it's usually not shipped in the
+binary packages of libxml, but you can find it in the tar source
+distribution). Here is the sequence of callbacks that would be reported by
+testSAX when parsing the example XML document shown earlier:
+SAX.setDocumentLocator()
+SAX.startDocument()
+SAX.getEntity(amp)
+SAX.startElement(EXAMPLE, prop1='gnome is great', prop2='& linux too')
+SAX.characters( , 3)
+SAX.startElement(head)
+SAX.characters( , 4)
+SAX.startElement(title)
+SAX.characters(Welcome to Gnome, 16)
+SAX.endElement(title)
+SAX.characters( , 3)
+SAX.endElement(head)
+SAX.characters( , 3)
+SAX.startElement(chapter)
+SAX.characters( , 4)
+SAX.startElement(title)
+SAX.characters(The Linux adventure, 19)
+SAX.endElement(title)
+SAX.characters( , 4)
+SAX.startElement(p)
+SAX.characters(bla bla bla ..., 15)
+SAX.endElement(p)
+SAX.characters( , 4)
+SAX.startElement(image, href='linus.gif')
+SAX.endElement(image)
+SAX.characters( , 4)
+SAX.startElement(p)
+SAX.characters(..., 3)
+SAX.endElement(p)
+SAX.characters( , 3)
+SAX.endElement(chapter)
+SAX.characters( , 1)
+SAX.endElement(EXAMPLE)
+SAX.endDocument()
+Most of the other interfaces of libxml are based on the DOM tree-building
+facility, so nearly everything up to the end of this document presupposes the
+use of the standard DOM tree build. Note that the DOM tree itself is built by
+a set of registered default callbacks, without any specific internal
+interface.
+Daniel Veillard
+
+
diff --git a/doc/intro.html b/doc/intro.html
new file mode 100644
index 00000000..2a343d3b
--- /dev/null
+++ b/doc/intro.html
@@ -0,0 +1,87 @@
+
+
+
+
+
+Introduction
+
+
+
+
+
+The XML C library for Gnome
+Introduction
+
+
+ This document describes libxml, the XML C library developed for the Gnome project. XML is a standard for building tag-based
+structured documents/data.
+Here are some key points about libxml:
+
+- Libxml exports Push and Pull type parser interfaces for both XML and
+ HTML.
+- Libxml can do DTD validation at parse time, using a parsed document
+ instance, or with an arbitrary DTD.
+- Libxml now includes nearly complete XPath, XPointer and XInclude implementations.
+- It is written in plain C, making as few assumptions as possible, and
+ sticking closely to ANSI C/POSIX for easy embedding. Works on
+ Linux/Unix/Windows, ported to a number of other platforms.
+- Basic support for HTTP and FTP clients, allowing applications to fetch
+ remote resources.
+- The design is modular; most of the extensions can be compiled out.
+- The internal document representation is as close as possible to the DOM interfaces.
+- Libxml also has a SAX-like
+ interface; the interface is designed to be compatible with Expat.
+- This library is released under both the W3C
+ IPR and the GNU
+ LGPL. Use either at your convenience; basically this should make
+ everybody happy. If not, drop me a mail.
+
+Warning: unless you are forced to because your application links with a
+Gnome library requiring it, Do Not Use libxml1, use
+libxml2
+Daniel Veillard
+
+
diff --git a/doc/library.html b/doc/library.html
new file mode 100644
index 00000000..289c7f51
--- /dev/null
+++ b/doc/library.html
@@ -0,0 +1,241 @@
+
+
+
+
+
+The XML library interfaces
+
+
+
+
+
+The XML C library for Gnome
+The XML library interfaces
+
+
+ This section is intended to help programmers get bootstrapped
+using the XML library from the C language. It is not intended to be
+extensive. I hope the automatically generated documents will provide the
+completeness required, but as a separate set of documents. The interfaces of
+the XML library are by principle low level; there is nearly zero abstraction.
+Those interested in a higher level API should look at
+DOM.
+The parser interfaces for XML are
+separated from the HTML parser
+interfaces. Let's have a look at how the XML parser can be called:
+
+Usually, the first thing to do is to read an XML input. The parser accepts
+documents either from in-memory strings or from files. The functions are
+defined in "parser.h":
+
+xmlDocPtr xmlParseMemory(char *buffer, int size);
+Parse a null-terminated string containing the document.
+
+
+xmlDocPtr xmlParseFile(const char *filename);
+Parse an XML document contained in a (possibly compressed)
+ file.
+
+The parser returns a pointer to the document structure (or NULL in case of
+failure).
+Invoking the parser: the push method
+In order for the application to keep control when the document is
+being fetched (which is common for GUI based programs) libxml provides a push
+interface, too, as of version 1.8.3. Here are the interface functions:
+xmlParserCtxtPtr xmlCreatePushParserCtxt(xmlSAXHandlerPtr sax,
+ void *user_data,
+ const char *chunk,
+ int size,
+ const char *filename);
+int xmlParseChunk (xmlParserCtxtPtr ctxt,
+ const char *chunk,
+ int size,
+ int terminate);
+and here is a simple example showing how to use the interface:
+    FILE *f;
+    xmlDocPtr doc = NULL;
+
+    f = fopen(filename, "r");
+    if (f != NULL) {
+        int res, size = 1024;
+        char chars[1024];
+        xmlParserCtxtPtr ctxt;
+
+        /* read the first 4 bytes so the parser can detect the encoding */
+        res = fread(chars, 1, 4, f);
+        if (res > 0) {
+            ctxt = xmlCreatePushParserCtxt(NULL, NULL,
+                                           chars, res, filename);
+            /* feed the rest of the file chunk by chunk */
+            while ((res = fread(chars, 1, size, f)) > 0) {
+                xmlParseChunk(ctxt, chars, res, 0);
+            }
+            /* a final call with terminate set to 1 ends the parse */
+            xmlParseChunk(ctxt, chars, 0, 1);
+            doc = ctxt->myDoc;
+            xmlFreeParserCtxt(ctxt);
+        }
+        fclose(f);
+    }
+The HTML parser embedded into libxml also has a push interface; the
+functions are just prefixed by "html" rather than "xml".
+Invoking the parser: the SAX interface
+The tree-building interface makes the parser memory-hungry, first loading
+the document in memory and then building the tree itself. Reading a document
+without building the tree is possible using the SAX interfaces (see SAX.h and
+James
+Henstridge's documentation). Note also that the push interface can be
+limited to SAX: just use the first two arguments of
+xmlCreatePushParserCtxt() .
+
+The other way to get an XML tree in memory is by building it. Basically
+there is a set of functions dedicated to building new elements. (These are
+also described in <libxml/tree.h>.) For example, here is a piece of
+code that produces the XML document used in the previous examples:
+ #include <libxml/tree.h>
+ xmlDocPtr doc;
+ xmlNodePtr tree, subtree;
+
+ doc = xmlNewDoc("1.0");
+ doc->children = xmlNewDocNode(doc, NULL, "EXAMPLE", NULL);
+ xmlSetProp(doc->children, "prop1", "gnome is great");
+ xmlSetProp(doc->children, "prop2", "& linux too");
+ tree = xmlNewChild(doc->children, NULL, "head", NULL);
+ subtree = xmlNewChild(tree, NULL, "title", "Welcome to Gnome");
+ tree = xmlNewChild(doc->children, NULL, "chapter", NULL);
+ subtree = xmlNewChild(tree, NULL, "title", "The Linux adventure");
+ subtree = xmlNewChild(tree, NULL, "p", "bla bla bla ...");
+ subtree = xmlNewChild(tree, NULL, "image", NULL);
+ xmlSetProp(subtree, "href", "linus.gif");
+Not really rocket science ...
+
+Basically by including "tree.h" your
+code has access to the internal structure of all the elements of the tree.
+The names should be somewhat simple like parent,
+children, next, prev,
+properties, etc... For example, still with the previous
+example:
+doc->children->children->children
+points to the title element,
+doc->children->children->next->children->children
+points to the text node containing the chapter title "The Linux
+adventure".
+
+NOTE: XML allows PIs and comments to be
+present before the document root, so doc->children may point
+to a node which is not the document root element; a function
+xmlDocGetRootElement() was added for this purpose.
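The reason a dedicated root lookup is needed can be shown with a small sketch in plain C. It does not use libxml: demo_tnode and demo_root_element are hypothetical names, and the node type is reduced to a kind, a name and a sibling link. The function skips any comments or processing instructions at the top level and returns the first element, which is the behaviour the note above relies on.

```c
#include <stddef.h>

/* Hypothetical node kinds that may appear at the top level of a document. */
typedef enum { DEMO_ELEMENT, DEMO_COMMENT, DEMO_PI } demo_kind;

/* Hypothetical, simplified top-level node. */
typedef struct demo_tnode {
    demo_kind kind;
    const char *name;
    struct demo_tnode *next;
} demo_tnode;

/*
 * Return the first element among the top-level siblings, skipping
 * comments and processing instructions. Returns NULL if there is none.
 */
const demo_tnode *demo_root_element(const demo_tnode *first) {
    for (; first != NULL; first = first->next) {
        if (first->kind == DEMO_ELEMENT)
            return first;
    }
    return NULL;
}
```

Code that assumes the first top-level node is the root element would pick up the comment or PI instead.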
+
+Functions are provided for reading and writing the document content. Here
+is an excerpt from the tree API:
+
+xmlAttrPtr xmlSetProp(xmlNodePtr node, const xmlChar *name, const
+ xmlChar *value);
+This sets (or changes) an attribute carried by an ELEMENT node.
+ The value can be NULL.
+
+
+xmlChar *xmlGetProp(xmlNodePtr node, const xmlChar
+ *name);
+This function returns a pointer to a new copy of the property
+ content. Note that the user must deallocate the result.
+
+Two functions are provided for reading and writing the text associated
+with elements:
+
+xmlNodePtr xmlStringGetNodeList(xmlDocPtr doc, const xmlChar
+ *value);
+This function takes an "external" string and converts it to one
+ text node or possibly to a list of entity and text nodes. All
+ non-predefined entity references like &Gnome; will be stored
+ internally as entity nodes, hence the result of the function may not be
+ a single node.
+
+
+xmlChar *xmlNodeListGetString(xmlDocPtr doc, xmlNodePtr list, int
+ inLine);
+This function is the inverse of
+ xmlStringGetNodeList() . It generates a new string
+ containing the content of the text and entity nodes. Note the extra
+ argument inLine. If this argument is set to 1, the function will expand
+ entity references. For example, instead of returning the &Gnome;
+ XML encoding in the string, it will substitute it with its value (say,
+ "GNU Network Object Model Environment").
+
+
+Basically 3 options are possible:
+
+void xmlDocDumpMemory(xmlDocPtr cur, xmlChar**mem, int
+ *size);
+Returns a buffer into which the document has been saved.
+
+
+extern void xmlDocDump(FILE *f, xmlDocPtr doc);
+Dumps a document to an open file stream.
+
+
+int xmlSaveFile(const char *filename, xmlDocPtr cur);
+Saves the document to a file. In this case, the compression
+ interface is triggered if it has been turned on.
+
+
+The library transparently handles compression when doing file-based
+accesses. The level of compression on saves can be turned on either globally
+or individually for one file:
+
+int xmlGetDocCompressMode (xmlDocPtr doc);
+Gets the document compression ratio (0-9).
+
+
+void xmlSetDocCompressMode (xmlDocPtr doc, int mode);
+Sets the document compression ratio.
+
+
+int xmlGetCompressMode(void);
+Gets the default compression ratio.
+
+
+void xmlSetCompressMode(int mode);
+Sets the default compression ratio.
+
+Daniel Veillard
+
+
diff --git a/doc/namespaces.html b/doc/namespaces.html
new file mode 100644
index 00000000..d975f3fe
--- /dev/null
+++ b/doc/namespaces.html
@@ -0,0 +1,104 @@
+
+
+
+
+
+Namespaces
+
+
+
+
+
+The XML C library for Gnome
+Namespaces
+
+
+ The libxml library implements XML namespaces support by
+recognizing namespace constructs in the input, and does namespace lookup
+automatically when building the DOM tree. A namespace declaration is
+associated with an in-memory structure and all elements or attributes within
+that namespace point to it. Hence testing the namespace is a simple and fast
+equality operation at the user level.
+I suggest that people using libxml use a namespace, and declare it in the
+root element of their document as the default namespace. Then they don't need
+to use the prefix in the content but we will have a basis for future semantic
+refinement and merging of data from different sources. This doesn't increase
+the size of the XML output significantly, but significantly increases its
+value in the long-term. Example:
+<mydoc xmlns="http://mydoc.example.org/schemas/">
+ <elem1>...</elem1>
+ <elem2>...</elem2>
+</mydoc>
+The namespace value has to be an absolute URL, but the URL doesn't have to
+point to any existing resource on the Web. It will bind all the elements and
+attributes with that URL. I suggest using a URL within a domain you control,
+and that the URL should contain some kind of version information if possible.
+For example, "http://www.gnome.org/gnumeric/1.0/" is a good
+namespace scheme.
+Then when you load a file, make sure that a namespace carrying the
+version-independent prefix is installed on the root element of your document,
+and if the version information doesn't match something you know, warn the user
+and be liberal in what you accept as the input. Also do *not* try to base
+namespace checking on the prefix value. <foo:text> may be exactly the
+same as <bar:text> in another document. What really matters is the URI
+associated with the element or the attribute, not the prefix string (which is
+just a shortcut for the full URI). In libxml, elements and attributes have an
+ns field pointing to an xmlNs structure detailing the namespace
+prefix and its URI.
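+The advice above (compare the URI, never the prefix) can be sketched as
+follows. This is only an illustration: demo_ns and demo_same_namespace() are
+hypothetical names made up for this example, not libxml's xmlNs structure or
+any libxml function. Within a single parsed document a pointer comparison is
+enough, since all nodes in a namespace point to the same structure; the
+string comparison below covers records coming from different documents.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a namespace record: a prefix plus the URI it
 * is bound to. This is NOT libxml's actual xmlNs definition. */
typedef struct {
    const char *prefix; /* e.g. "foo"; NULL for the default namespace */
    const char *href;   /* the namespace URI */
} demo_ns;

/* Returns 1 when both records name the same namespace, 0 otherwise.
 * Only the URI is compared; the prefix is deliberately ignored. */
int demo_same_namespace(const demo_ns *a, const demo_ns *b)
{
    if (a == b)
        return 1;                   /* same record, or both absent */
    if (a == NULL || b == NULL)
        return 0;                   /* only one side has a namespace */
    if (a->href == NULL || b->href == NULL)
        return a->href == b->href;  /* equal only if both URIs are missing */
    return strcmp(a->href, b->href) == 0;
}
```

+With this helper, a record with prefix "foo" and one with prefix "bar" are
+reported as the same namespace whenever both carry the same URI.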
+@@Interfaces@@
+@@Examples@@
+Usually people object to using namespaces together with validity checking.
+I will try to make sure that using namespaces won't break validity checking,
+so even if you plan to use or currently are using validation I strongly
+suggest adding namespaces to your document. A default namespace scheme
+xmlns="http://...." should not break validity even on less
+flexible parsers. Using namespaces to mix and differentiate content coming
+from multiple DTDs will certainly break current validation schemes. I will
+try to provide ways to do this, but this may not be portable or
+standardized.
+Daniel Veillard
+ |
|
|
|
+
|
+
+
diff --git a/doc/news.html b/doc/news.html
new file mode 100644
index 00000000..a18b8ef1
--- /dev/null
+++ b/doc/news.html
@@ -0,0 +1,608 @@
+
+
+
+
+
+News
+
+
+
+
+
+ |
+
+The XML C library for Gnome
+News
+ |
|
|
+
+
+ |
+
+CVS only : check the Changelog file
+for a really accurate description
+Items floating around but not actively worked on, get in touch with me if
+you want to test those
+
+- Implementing XSLT, this is done
+ as a separate C library on top of libxml called libxslt
+- Finishing up XPointer and XInclude
+
+- (seems to be working but delayed from release) parsing/import of Docbook
+ SGML docs
+
+2.4.6: Oct 10 2001
+
+- added and updated man pages by John Fleck
+- portability and configure fixes
+- an infinite loop on the HTML parser was removed (William)
+- Windows makefile patches from Igor
+- fixed half a dozen bugs reported for libxml or libxslt
+- updated xmlcatalog to be able to modify SGML super catalogs
+
+2.4.5: Sep 14 2001
+
+- Remove a few annoying bugs in 2.4.4
+- forces the HTML serializer to output decimal charrefs since some
+ versions of Netscape can't handle hexadecimal ones
+
+1.8.16: Sep 14 2001
+- maintenance release of the old libxml1 branch, couple of bug and
+ portability fixes
+2.4.4: Sep 12 2001
+
+- added --convert to xmlcatalog, bug fixes and cleanups of XML
+ Catalog
+- a few bug fixes and some portability changes
+- some documentation cleanups
+
+2.4.3: Aug 23 2001
+
+- XML Catalog support see the doc
+- New NaN/Infinity floating point code
+- A few bug fixes
+
+2.4.2: Aug 15 2001
+
+- adds xmlLineNumbersDefault() to control line number generation
+- lot of bug fixes
+- the Microsoft MSC project files should now be up to date
+- inheritance of namespaces from DTD defaulted attributes
+- fixes a serious potential security bug
+- added a --format option to xmllint
+
+2.4.1: July 24 2001
+
+- possibility to keep line numbers in the tree
+- some computation NaN fixes
+- extension of the XPath API
+- cleanup for alpha and ia64 targets
+- patch to allow saving through HTTP PUT or POST
+
+2.4.0: July 10 2001
+
+- Fixed a few bugs in XPath, validation, and tree handling.
+- Fixed XML Base implementation, added a couple of examples to the
+ regression tests
+- A bit of cleanup
+
+2.3.14: July 5 2001
+
+- fixed some entities problems and reduced memory requirements when
+ substituting them
+- lots of improvements in the XPath queries interpreter, which can be
+ substantially faster
+- Makefiles and configure cleanups
+- Fixes to XPath variable eval, and compare on empty node set
+- HTML tag closing bug fixed
+- Fixed a URI reference computation problem when validating
+
+2.3.13: June 28 2001
+
+- 2.3.12 configure.in was broken as well as the push mode XML parser
+- a few more fixes for compilation on Windows MSC by Yon Derek
+
+1.8.14: June 28 2001
+
+- Zbigniew Chyla gave a patch to use the old XML parser in push mode
+- Small Makefile fix
+
+2.3.12: June 26 2001
+
+- lots of cleanup
+- a couple of validation fixes
+- fixed line number counting
+- fixed serious problems in the XInclude processing
+- added support for UTF8 BOM at beginning of entities
+- fixed some strange gcc optimizer bugs in xpath handling of float, gcc-3.0
+ miscompile uri.c (William), Thomas Leitner provided a fix for the
+ optimizer on Tru64
+- incorporated Yon Derek and Igor Zlatkovic fixes and improvements for
+ compilation on Windows MSC
+- update of libxml-doc.el (Felix Natter)
+- fixed 2 bugs in URI normalization code
+
+2.3.11: June 17 2001
+
+- updates to trio, Makefiles and configure should fix some portability
+ problems (alpha)
+- fixed some HTML serialization problems (pre, script, and block/inline
+ handling), added encoding aware APIs, cleanup of this code
+- added xmlHasNsProp()
+- implemented a specific PI for encoding support in the DocBook SGML
+ parser
+- some XPath fixes (-Infinity, / as a function parameter and namespaces
+ node selection)
+- fixed a performance problem and an error in the validation code
+- fixed XInclude routine to implement the recursive behaviour
+- fixed xmlFreeNode problem when libxml is included statically twice
+- added --version to xmllint for bug reports
+
+2.3.10: June 1 2001
+
+- fixed the SGML catalog support
+- a number of reported bugs got fixed, in XPath, iconv detection,
+ XInclude processing
+- XPath string function should now handle unicode correctly
+
+2.3.9: May 19 2001
+Lots of bugfixes, and added a basic SGML catalog support:
+
+- HTML push bugfix #54891 and another patch from Jonas Borgström
+- some serious speed optimisation again
+- some documentation cleanups
+- trying to get better linking on solaris (-R)
+- XPath API cleanup from Thomas Broyer
+- Validation bug fixed #54631, added a patch from Gary Pennington, fixed
+ xmlValidGetValidElements()
+- Added an INSTALL file
+- Attribute removal added to API: #54433
+- added a basic support for SGML catalogs
+- fixed xmlKeepBlanksDefault(0) API
+- bugfix in xmlNodeGetLang()
+- fixed a small configure portability problem
+- fixed an inversion of SYSTEM and PUBLIC identifier in HTML document
+
+1.8.13: May 14 2001
+- bugfixes release of the old libxml1 branch used by Gnome
+2.3.8: May 3 2001
+
+- Integrated an SGML DocBook parser for the Gnome project
+- Fixed a few things in the HTML parser
+- Fixed some XPath bugs raised by XSLT use, tried to fix the floating
+ point portability issue
+- Speed improvement (8M/s for SAX, 3M/s for DOM, 1.5M/s for
+ DOM+validation using the XML REC as input and a 700MHz celeron).
+- incorporated more Windows cleanup
+- added xmlSaveFormatFile()
+- fixed problems in copying nodes with entities references (gdome)
+- removed some troubles surrounding the new validation module
+
+2.3.7: April 22 2001
+
+- lots of small bug fixes, corrected XPointer
+- Non determinist content model validation support
+- added xmlDocCopyNode for gdome2
+- revamped the way the HTML parser handles end of tags
+- XPath: corrections of namespace support and number formatting
+- Windows: Igor Zlatkovic patches for MSC compilation
+- HTML output fixes from P C Chow and William M. Brack
+- Noticeably improved validation speed for DocBook
+- fixed a big bug with ID declared in external parsed entities
+- portability fixes, update of Trio from Bjorn Reese
+
+2.3.6: April 8 2001
+
+- Code cleanup using extreme gcc compiler warning options, found and
+ cleared half a dozen potential problems
+- the Eazel team found an XML parser bug
+- cleaned up the use of some of the string formatting functions; used the
+ trio library code to provide the ones needed when the platform is missing
+ them
+- xpath: removed a memory leak and fixed the predicate evaluation
+ problem, extended the testsuite and cleaned up the result. XPointer seems
+ broken ...
+
+2.3.5: Mar 23 2001
+
+- Biggest change is separate parsing and evaluation of XPath expressions,
+ there are some new APIs for this too
+- included a number of bug fixes (XML push parser, 51876, notations,
+ 52299)
+- Fixed some portability issues
+
+2.3.4: Mar 10 2001
+
+- Fixed bugs #51860 and #51861
+- Added a global variable xmlDefaultBufferSize to allow default buffer
+ size to be application tunable.
+- Some cleanup in the validation code, still a bug left and this part
+ should probably be rewritten to support ambiguous content model :-\
+- Fix a couple of serious bugs introduced or raised by changes in 2.3.3
+ parser
+- Fixed another bug in xmlNodeGetContent()
+- Bjorn fixed XPath node collection and Number formatting
+- Fixed a loop reported in the HTML parsing
+- blank spaces are reported even if the Dtd content model proves that they
+ are formatting spaces, this is for XML conformance
+
+2.3.3: Mar 1 2001
+
+- small change in XPath for XSLT
+- documentation cleanups
+- fix in validation by Gary Pennington
+- serious parsing performance improvements
+
+2.3.2: Feb 24 2001
+
+- chasing XPath bugs, found a bunch, completed some TODO
+- fixed a Dtd parsing bug
+- fixed a bug in xmlNodeGetContent
+- ID/IDREF support partly rewritten by Gary Pennington
+
+2.3.1: Feb 15 2001
+
+- some XPath and HTML bug fixes for XSLT
+- small extension of the hash table interfaces for DOM gdome2
+ implementation
+- A few bug fixes
+
+2.3.0: Feb 8 2001 (2.2.12 was on 25 Jan but I didn't keep track)
+
+- Lots of XPath bug fixes
+- Add a mode with Dtd lookup but without validation error reporting for
+ XSLT
+- Add support for text node without escaping (XSLT)
+- bug fixes for xmlCheckFilename
+- validation code bug fixes from Gary Pennington
+- Patch from Paul D. Smith correcting URI path normalization
+- Patch to allow simultaneous install of libxml-devel and
+ libxml2-devel
+- the example Makefile is now fixed
+- added HTML to the RPM packages
+- tree copying bugfixes
+- updates to Windows makefiles
+- optimisation patch from Bjorn Reese
+
+2.2.11: Jan 4 2001
+
+- bunch of bug fixes (memory I/O, xpath, ftp/http, ...)
+- added htmlHandleOmittedElem()
+- Applied Bjorn Reese's IPV6 first patch
+- Applied Paul D. Smith patches for validation of XInclude results
+- added XPointer xmlns() new scheme support
+
+2.2.10: Nov 25 2000
+
+- Fix the Windows problems of 2.2.8
+- integrate OpenVMS patches
+- better handling of some nasty HTML input
+- Improved the XPointer implementation
+- integrate a number of provided patches
+
+2.2.9: Nov 25 2000
+
+2.2.8: Nov 13 2000
+
+- First version of XInclude
+ support
+- Patch in conditional section handling
+- updated MS compiler project
+- fixed some XPath problems
+- added an URI escaping function
+- some other bug fixes
+
+2.2.7: Oct 31 2000
+
+- added message redirection
+- XPath improvements (thanks TOM !)
+- xmlIOParseDTD() added
+- various small fixes in the HTML, URI, HTTP and XPointer support
+- some cleanup of the Makefile, autoconf and the distribution content
+
+2.2.6: Oct 25 2000:
+
+- Added a hash table module, migrated a number of internal structures to
+ it
+- Fixed a posteriori validation problems
+- HTTP module cleanups
+- HTML parser improvements (tag errors, script/style handling, attribute
+ normalization)
+- coalescing of adjacent text nodes
+- couple of XPath bug fixes, exported the internal API
+
+2.2.5: Oct 15 2000:
+
+- XPointer implementation and testsuite
+- Lot of XPath fixes, added variable and functions registration, more
+ tests
+- Portability fixes, lots of enhancements toward an easy Windows build
+ and release
+- Late validation fixes
+- Integrated a lot of contributed patches
+- added memory management docs
+- a performance problem when using large buffers seems fixed
+
+2.2.4: Oct 1 2000:
+
+- main XPath problem fixed
+- Integrated portability patches for Windows
+- Serious bug fixes on the URI and HTML code
+
+2.2.3: Sep 17 2000
+
+- bug fixes
+- cleanup of entity handling code
+- overall review of all loops in the parsers, all sprintf usage has been
+ checked too
+- Far better handling of large Dtds. Validating against Docbook XML Dtd
+ works smoothly now.
+
+1.8.10: Sep 6 2000
+- bug fix release for some Gnome projects
+2.2.2: August 12 2000
+
+- mostly bug fixes
+- started adding routines to access xml parser context options
+
+2.2.1: July 21 2000
+
+- a purely bug fix release
+- fixed an encoding support problem when parsing from a memory block
+- fixed a DOCTYPE parsing problem
+- removed a bug in the function allowing to override the memory
+ allocation routines
+
+2.2.0: July 14 2000
+
+- applied a lot of portability fixes
+- better encoding support/cleanup and saving (content is now always
+ encoded in UTF-8)
+- the HTML parser now correctly handles encodings
+- added xmlHasProp()
+- fixed a serious problem with &
+- propagated the fix to FTP client
+- cleanup, bugfixes, etc ...
+- Added a page about libxml Internationalization
+ support
+
+
+1.8.9: July 9 2000
+
+- fixed the spec the RPMs should be better
+- fixed a serious bug in the FTP implementation, released 1.8.9 to solve
+ rpmfind users problem
+
+2.1.1: July 1 2000
+
+- fixes a couple of bugs in the 2.1.0 packaging
+- improvements on the HTML parser
+
+2.1.0 and 1.8.8: June 29 2000
+
+- 1.8.8 is mostly a convenience package for upgrading to libxml2 according to
+ new instructions. It fixes a nasty problem
+ about & charref parsing
+- 2.1.0 also eases the upgrade from libxml v1 to the recent version. It
+ also contains numerous fixes and enhancements:
+
+- added xmlStopParser() to stop parsing
+- greatly improved parsing speed when there are large CDATA blocks
+- includes XPath patches provided by Picdar Technology
+- tried to fix as much as possible DTD validation and namespace
+ related problems
+- output to a given encoding has been added/tested
+- lot of various fixes
+
+
+
+2.0.0: Apr 12 2000
+
+- First public release of libxml2. If you are using libxml, it's a good
+ idea to check the 1.x to 2.x upgrade instructions. NOTE: while initially
+ scheduled for Apr 3 the release occurred only on Apr 12 due to massive
+ workload.
+- The includes are now located under $prefix/include/libxml (instead of
+ $prefix/include/gnome-xml), they also are referenced by
+
#include <libxml/xxx.h>
+instead of
+#include "xxx.h"
+
+- a new URI module for parsing URIs and following strictly RFC 2396
+- the memory allocation routines used by libxml can now be overloaded
+ dynamically by using xmlMemSetup()
+- The previously CVS only tool tester has been renamed
+ xmllint and is now installed as part of the libxml2
+ package
+- The I/O interface has been revamped. There are now ways to plug in
+ specific I/O modules, either at the URI scheme detection level using
+ xmlRegisterInputCallbacks() or by passing I/O functions when creating a
+ parser context using xmlCreateIOParserCtxt()
+- there is a C preprocessor macro LIBXML_VERSION providing the version
+ number of the libxml module in use
+- a number of optional features of libxml can now be excluded at
+ configure time (FTP/HTTP/HTML/XPath/Debug)
+
+2.0.0beta: Mar 14 2000
+
+- This is a first Beta release of libxml version 2
+- It's available only from xmlsoft.org
+ FTP, it's packaged as libxml2-2.0.0beta and available as tar and
+ RPMs
+- This version is now the head in the Gnome CVS base, the old one is
+ available under the tag LIB_XML_1_X
+- This includes a very large set of changes. From a programmatic point of
+ view applications should not have to be modified too much, check the upgrade page
+
+- Some interfaces may change (especially a bit about encoding).
+- the updates include:
+
+- fix I18N support. ISO-Latin-x/UTF-8/UTF-16 (nearly) seems correctly
+ handled now
+- Better handling of entities, especially well formedness checking
+ and proper PEref extensions in external subsets
+- DTD conditional sections
+- Validation now correctly handles entities content
+- change
+ structures to accommodate DOM
+
+
+- Serious progress was made toward compliance, here are the results of the test against the
+ OASIS testsuite (except the japanese tests since I don't support that
+ encoding yet). This URL is rebuilt every couple of hours using the CVS
+ head version.
+
+1.8.7: Mar 6 2000
+
+- This is a bug fix release:
+- It is possible to disable the ignorable blanks heuristic used by
+ libxml-1.x, a new function xmlKeepBlanksDefault(0) will allow this. Note
+ that for adherence to XML spec, this behaviour will be disabled by
+ default in 2.x . The same function will allow to keep compatibility for
+ old code.
+- Blanks in <a> </a> constructs are not ignored anymore,
+ avoiding heuristic is really the Right Way :-\
+- The unchecked use of snprintf which was breaking libxml-1.8.6
+ compilation on some platforms has been fixed
+- nanoftp.c nanohttp.c: Fixed '#' and '?' stripping when processing
+ URIs
+
+1.8.6: Jan 31 2000
+- added a nanoFTP transport module, debugged until the new version of rpmfind can use
+ it without troubles
+1.8.5: Jan 21 2000
+
+- adding APIs to parse a well balanced chunk of XML (production [43] content of the
+ XML spec)
+- fixed a hideous bug in xmlGetProp pointed by Rune.Djurhuus@fast.no
+- Jody Goldberg <jgoldberg@home.com> provided another patch trying
+ to solve the zlib checks problems
+- The current state in gnome CVS base is expected to ship as 1.8.5 with
+ gnumeric soon
+
+1.8.4: Jan 13 2000
+
+- bug fixes, reintroduced xmlNewGlobalNs(), fixed xmlNewNs()
+- all exit() call should have been removed from libxml
+- fixed a problem with INCLUDE_WINSOCK on WIN32 platform
+- added newDocFragment()
+
+1.8.3: Jan 5 2000
+
+- a Push interface for the XML and HTML parsers
+- a shell-like interface to the document tree (try tester --shell :-)
+- lots of bug fixes and improvements added over XMas holidays
+- fixed the DTD parsing code to work with the xhtml DTD
+- added xmlRemoveProp(), xmlRemoveID() and xmlRemoveRef()
+- Fixed bugs in xmlNewNs()
+- External entity loading code has been revamped, now it uses
+ xmlLoadExternalEntity(), some fixes on entities processing were added
+- cleaned up WIN32 includes of socket stuff
+
+1.8.2: Dec 21 1999
+
+- I got another problem with includes and C++, I hope this issue is fixed
+ for good this time
+- Added a few tree modification functions: xmlReplaceNode,
+ xmlAddPrevSibling, xmlAddNextSibling, xmlNodeSetName and
+ xmlDocSetRootElement
+- Tried to improve the HTML output with help from Chris Lahey
+
+
+1.8.1: Dec 18 1999
+
+- various patches to avoid troubles when using libxml with C++ compilers
+ the "namespace" keyword and C escaping in include files
+- a problem in one of the core macros IS_CHAR was corrected
+- fixed a bug introduced in 1.8.0 breaking default namespace processing,
+ and more specifically the Dia application
+- fixed a posteriori validation (validation after parsing, or by using a
+ Dtd not specified in the original document)
+- fixed a bug in
+
+1.8.0: Dec 12 1999
+
+- cleanup, especially memory wise
+- the parser should be more reliable, especially the HTML one, it should
+ not crash, whatever the input !
+- Integrated various patches, especially a speedup improvement for large
+ dataset from Carl Nygard,
+ configure with --with-buffers to enable them.
+- attribute normalization, oops should have been added long ago !
+- attributes defaulted from Dtds should be available, xmlSetProp() now
+ does entities escaping by default.
+
+1.7.4: Oct 25 1999
+
+- Lots of HTML improvement
+- Fixed some errors when saving both XML and HTML
+- More examples, the regression tests should now look clean
+- Fixed a bug with contiguous charref
+
+1.7.3: Sep 29 1999
+
+- portability problems fixed
+- snprintf was used unconditionally, leading to link problems on systems
+ where it's not available, fixed
+
+1.7.1: Sep 24 1999
+
+- The basic type for strings manipulated by libxml has been renamed in
+ 1.7.1 from CHAR to xmlChar. The reason
+ is that CHAR was conflicting with a predefined type on Windows. However
+ on non WIN32 environment, compatibility is provided by the way of a
+ #define .
+- Changed another error : the use of a structure field called errno, and
+ leading to troubles on platforms where it's a macro
+
+1.7.0: sep 23 1999
+
+- Added the ability to fetch remote DTD or parsed entities, see the nanohttp module.
+- Added an errno to report errors by another means than a simple printf
+ like callback
+- Finished ID/IDREF support and checking when validating
+- Serious memory leaks fixed (there is now a memory wrapper module)
+- Improvement of XPath
+ implementation
+- Added an HTML parser front-end
+
+Daniel Veillard
+ |
|
|
|
+
|
+
+
diff --git a/doc/site.xsl b/doc/site.xsl
new file mode 100644
index 00000000..faedf8b0
--- /dev/null
+++ b/doc/site.xsl
@@ -0,0 +1,354 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Main Menu
+
+ |
+
+
+
+
+ |
+
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ intro.html
+
+
+ docs.html
+
+
+ bugs.html
+
+
+ help.html
+
+
+ help.html
+
+
+ downloads.html
+
+
+ news.html
+
+
+ contribs.html
+
+
+ xsltproc2.html
+
+
+ API.html
+
+
+ XSLT.html
+
+
+ XML.html
+
+
+ valid.html
+
+
+ tree.html
+
+
+ library.html
+
+
+ interface.html
+
+
+ example.html
+
+
+ entities.html
+
+
+ architecture.html
+
+
+ namespaces.html
+
+
+ DOM.html
+
+
+ unknown.html
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/doc/tree.html b/doc/tree.html
new file mode 100644
index 00000000..d3f942a8
--- /dev/null
+++ b/doc/tree.html
@@ -0,0 +1,110 @@
+
+
+
+
+
+The tree output
+
+
+
+
+
+ |
+
+The XML C library for Gnome
+The tree output
+ |
|
|
+
+
+ |
+
+ The parser returns a tree built during the document analysis. The value
+returned is an xmlDocPtr (i.e., a pointer to an
+xmlDoc structure). This structure contains information such
+as the file name, the document type, and a children pointer
+which is the root of the document (or more exactly the first child under the
+root which is the document). The tree is made of xmlNodes,
+chained in doubly-linked lists of siblings and with a children<->parent
+relationship. An xmlNode can also carry properties (a chain of xmlAttr
+structures). An attribute may have a value which is a list of TEXT or
+ENTITY_REF nodes.
+Here is an example (erroneous with respect to the XML spec since there
+should be only one ELEMENT under the root):
+
+In the source package there is a small program (not installed by default)
+called xmllint which parses XML files given as argument and
+prints them back as parsed. This is useful for detecting errors both in XML
+code and in the XML parser itself. It has an option --debug
+which prints the actual in-memory structure of the document; here is the
+result with the example given before:
+DOCUMENT
+version=1.0
+standalone=true
+ ELEMENT EXAMPLE
+ ATTRIBUTE prop1
+ TEXT
+ content=gnome is great
+ ATTRIBUTE prop2
+ ENTITY_REF
+ TEXT
+ content= linux too
+ ELEMENT head
+ ELEMENT title
+ TEXT
+ content=Welcome to Gnome
+ ELEMENT chapter
+ ELEMENT title
+ TEXT
+ content=The Linux adventure
+ ELEMENT p
+ TEXT
+ content=bla bla bla ...
+ ELEMENT image
+ ATTRIBUTE href
+ TEXT
+ content=linus.gif
+ ELEMENT p
+ TEXT
+ content=...
+This should be useful for learning the internal representation model.
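+To make the linkage concrete, here is a minimal sketch of a depth-first walk
+over a tree shaped like the one described above. The demo_node type and the
+demo_* functions are hypothetical, simplified stand-ins written for this
+example; they are not libxml's xmlNode or any libxml API, and they only model
+the first child, next sibling and parent pointers.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified node: NOT libxml's xmlNode, only the linkage
 * described in the text (first child, next sibling, parent). */
typedef struct demo_node {
    const char *name;
    struct demo_node *children; /* first child, or NULL */
    struct demo_node *next;     /* next sibling, or NULL */
    struct demo_node *parent;   /* enclosing node, or NULL */
} demo_node;

/* Attach child as the last child of parent. */
void demo_add_child(demo_node *parent, demo_node *child)
{
    demo_node **slot = &parent->children;
    while (*slot != NULL)
        slot = &(*slot)->next;
    *slot = child;
    child->parent = parent;
}

/* Count node, its following siblings, and all of their descendants. */
int demo_count(const demo_node *node)
{
    int total = 0;
    for (; node != NULL; node = node->next)
        total += 1 + demo_count(node->children);
    return total;
}

/* Depth-first dump: one "ELEMENT name" line per node, indented two
 * spaces per nesting level. */
void demo_dump(FILE *out, const demo_node *node, int depth)
{
    for (; node != NULL; node = node->next) {
        fprintf(out, "%*sELEMENT %s\n", depth * 2, "", node->name);
        demo_dump(out, node->children, depth + 1);
    }
}
```

+Walking the children pointer first and the next pointer second visits the
+nodes in document order, which is the order used by the listing above.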
+Daniel Veillard
+ |
|
|
|
+
|
+
+
diff --git a/doc/valid.html b/doc/valid.html
new file mode 100644
index 00000000..fc24a4b8
--- /dev/null
+++ b/doc/valid.html
@@ -0,0 +1,93 @@
+
+
+
+
+
+Validation, or are you afraid of DTDs ?
+
+
+
+
+
+ |
+
+The XML C library for Gnome
+Validation, or are you afraid of DTDs ?
+ |
|
|
+
+
+ |
+
+ Well what is validation and what is a DTD ?
+Validation is the process of checking a document against a set of
+construction rules; a DTD (Document Type Definition) is such
+a set of rules.
+The validation process and building DTDs are the two most difficult parts
+of the XML life cycle. Briefly a DTD defines all the possible elements to be
+found within your document, what is the formal shape of your document tree
+(by defining the allowed content of an element, either text, a regular
+expression for the allowed list of children, or mixed content i.e. both text
+and children). The DTD also defines the allowed attributes for all elements
+and the types of the attributes. For more detailed information, I suggest
+that you read the related parts of the XML specification, the examples found
+under gnome-xml/test/valid/dtd and any of the large number of books available
+on XML. The dia example in gnome-xml/test/valid should be both simple and
+complete enough to allow you to build your own.
+A word of warning, building a good DTD which will fit the needs of your
+application in the long-term is far from trivial; however, the extra level of
+quality it can ensure is well worth the price for some sets of applications
+or if you already have a DTD defined for your application field.
+The validation is not completely finished but in a (very IMHO) usable
+state. Until a real validation interface is defined the way to do it is to
+define and set the xmlDoValidityCheckingDefaultValue
+external variable to 1, this will of course be changed at some point:
+extern int xmlDoValidityCheckingDefaultValue;
+...
+xmlDoValidityCheckingDefaultValue = 1;
+
+ To handle external entities, use the function
+xmlSetExternalEntityLoader(xmlExternalEntityLoader f); to
+link in your HTTP/FTP/Entities database library to the standard libxml
+core.
+@@interfaces@@
+Daniel Veillard
+ |
|
|
|
+
|
+
+
diff --git a/doc/xml.html b/doc/xml.html
index e927025c..19f3de84 100644
--- a/doc/xml.html
+++ b/doc/xml.html
@@ -8,14 +8,9 @@
-
-
The XML C library for Gnome
-libxml, a.k.a. gnome-xml
+libxml, a.k.a. gnome-xml
@@ -29,18 +24,7 @@ alt="Red Hat Logo">
XSLT
The tree output
The SAX interface
- The XML library interfaces
-
-
+ The XML library interfaces
Entities or no entities
Namespaces
Validation
@@ -925,7 +909,7 @@ href="http://cvs.gnome.org/lxr/source/libxslt/FEATURES">features
supported and the progresses on the Changelog
-An overview of libxml architecture
+
Libxml is made of multiple components; some of them are optional, and most
of the block interfaces are public. The main components are:
@@ -1686,8 +1670,8 @@ Gnome CVS base under gnome-xml/example
is now the maintainer of the Windows port, he
provides binaries
- Gary Pennington provides
- Solaris
+ Gary Pennington
+ provides Solaris
binaries
Matt
@@ -1714,6 +1698,6 @@ Gnome CVS base under gnome-xml/example
Daniel Veillard
-$Id: xml.html,v 1.112 2001/10/10 09:45:06 veillard Exp $
+$Id: xml.html,v 1.113 2001/10/19 14:50:57 veillard Exp $