GLib Reference Manual
#include <glib.h>

enum GMarkupError;
#define G_MARKUP_ERROR
enum GMarkupParseFlags;
GMarkupParseContext;
GMarkupParser;
gchar* g_markup_escape_text (const gchar *text, gssize length);
gchar* g_markup_printf_escaped (const char *format, ...);
gchar* g_markup_vprintf_escaped (const char *format, va_list args);
gboolean g_markup_parse_context_end_parse (GMarkupParseContext *context, GError **error);
void g_markup_parse_context_free (GMarkupParseContext *context);
void g_markup_parse_context_get_position (GMarkupParseContext *context, gint *line_number, gint *char_number);
const gchar* g_markup_parse_context_get_element (GMarkupParseContext *context);
const GSList* g_markup_parse_context_get_element_stack (GMarkupParseContext *context);
gpointer g_markup_parse_context_get_user_data (GMarkupParseContext *context);
GMarkupParseContext* g_markup_parse_context_new (const GMarkupParser *parser, GMarkupParseFlags flags, gpointer user_data, GDestroyNotify user_data_dnotify);
gboolean g_markup_parse_context_parse (GMarkupParseContext *context, const gchar *text, gssize text_len, GError **error);
void g_markup_parse_context_push (GMarkupParseContext *context, GMarkupParser *parser, gpointer user_data);
gpointer g_markup_parse_context_pop (GMarkupParseContext *context);
enum GMarkupCollectType;
gboolean g_markup_collect_attributes (const gchar *element_name, const gchar **attribute_names, const gchar **attribute_values, GError **error, GMarkupCollectType first_type, const gchar *first_attr, ...);
The "GMarkup" parser is intended to parse a simple markup format that's a subset of XML. This is a small, efficient, easy-to-use parser. It should not be used if you expect to interoperate with other applications generating full-scale XML. However, it's very useful for application data files, config files, etc. where you know your application will be the only one writing the file. Full-scale XML parsers should be able to parse the subset used by GMarkup, so you can easily migrate to full-scale XML at a later time if the need arises.
GMarkup is not guaranteed to signal an error on all invalid XML; the parser may accept documents that an XML parser would not. However, XML documents which are not well-formed[5] are not considered valid GMarkup documents.
Simplifications to XML include:
Only UTF-8 encoding is allowed.
No user-defined entities.
Processing instructions, comments and the doctype declaration are "passed through" but are not interpreted in any way.
No DTD or validation.
The markup format does support:
Elements
Attributes
5 standard entities: &amp; &lt; &gt; &quot; &apos;
Character references
Sections marked as CDATA
typedef enum
{
  G_MARKUP_ERROR_BAD_UTF8,
  G_MARKUP_ERROR_EMPTY,
  G_MARKUP_ERROR_PARSE,
  /* The following are primarily intended for specific GMarkupParser
   * implementations to set.
   */
  G_MARKUP_ERROR_UNKNOWN_ELEMENT,
  G_MARKUP_ERROR_UNKNOWN_ATTRIBUTE,
  G_MARKUP_ERROR_INVALID_CONTENT,
  G_MARKUP_ERROR_MISSING_ATTRIBUTE
} GMarkupError;
Error codes returned by markup parsing.
| G_MARKUP_ERROR_BAD_UTF8 | text being parsed was not valid UTF-8 |
| G_MARKUP_ERROR_EMPTY | document contained nothing, or only whitespace |
| G_MARKUP_ERROR_PARSE | document was ill-formed |
| G_MARKUP_ERROR_UNKNOWN_ELEMENT | error should be set by GMarkupParser functions; element wasn't known |
| G_MARKUP_ERROR_UNKNOWN_ATTRIBUTE | error should be set by GMarkupParser functions; attribute wasn't known |
| G_MARKUP_ERROR_INVALID_CONTENT | error should be set by GMarkupParser functions; content was invalid |
| G_MARKUP_ERROR_MISSING_ATTRIBUTE | error should be set by GMarkupParser functions; a required attribute was missing |
#define G_MARKUP_ERROR g_markup_error_quark ()
Error domain for markup parsing. Errors in this domain will be from the GMarkupError enumeration. See GError for information on error domains.
typedef enum
{
  G_MARKUP_DO_NOT_USE_THIS_UNSUPPORTED_FLAG = 1 << 0,
  G_MARKUP_TREAT_CDATA_AS_TEXT              = 1 << 1,
  G_MARKUP_PREFIX_ERROR_POSITION            = 1 << 2
} GMarkupParseFlags;
Flags that affect the behaviour of the parser.
| G_MARKUP_DO_NOT_USE_THIS_UNSUPPORTED_FLAG | flag you should not use. |
| G_MARKUP_TREAT_CDATA_AS_TEXT | When this flag is set, CDATA marked sections are not passed literally to the passthrough function of the parser. Instead, the content of the section (without the <![CDATA[ and ]]>) is passed to the text function. This flag was added in GLib 2.12. |
| G_MARKUP_PREFIX_ERROR_POSITION | Normally errors caught by GMarkup itself have line/column information prefixed to them to let the caller know the location of the error. When this flag is set the location information is also prefixed to errors generated by the GMarkupParser implementation functions. |
typedef struct _GMarkupParseContext GMarkupParseContext;
A parse context is used to parse a stream of bytes that you expect to
contain marked-up text. See g_markup_parse_context_new(),
GMarkupParser, and so on for more details.
typedef struct {
  /* Called for open tags <foo bar="baz"> */
  void (*start_element)  (GMarkupParseContext *context,
                          const gchar         *element_name,
                          const gchar        **attribute_names,
                          const gchar        **attribute_values,
                          gpointer             user_data,
                          GError             **error);
  /* Called for close tags </foo> */
  void (*end_element)    (GMarkupParseContext *context,
                          const gchar         *element_name,
                          gpointer             user_data,
                          GError             **error);
  /* Called for character data */
  /* text is not nul-terminated */
  void (*text)           (GMarkupParseContext *context,
                          const gchar         *text,
                          gsize                text_len,  
                          gpointer             user_data,
                          GError             **error);
  /* Called for strings that should be re-saved verbatim in this same
   * position, but are not otherwise interpretable.  At the moment
   * this includes comments and processing instructions.
   */
  /* text is not nul-terminated. */
  void (*passthrough)    (GMarkupParseContext *context,
                          const gchar         *passthrough_text,
                          gsize                text_len,  
                          gpointer             user_data,
                          GError             **error);
  /* Called on error, including one set by other
   * methods in the vtable. The GError should not be freed.
   */
  void (*error)          (GMarkupParseContext *context,
                          GError              *error,
                          gpointer             user_data);
} GMarkupParser;
Any of the fields in GMarkupParser can be NULL, in which case they
will be ignored. Except for the error function, any of these
callbacks can set an error; in particular the
G_MARKUP_ERROR_UNKNOWN_ELEMENT, G_MARKUP_ERROR_UNKNOWN_ATTRIBUTE,
and G_MARKUP_ERROR_INVALID_CONTENT errors are intended to be set 
from these callbacks. If you set an error from a callback,
g_markup_parse_context_parse() will report that error back to its caller.
| start_element : | Callback to invoke when the opening tag of an element is seen. |
| end_element : | Callback to invoke when the closing tag of an element is seen. Note that this is also called for empty tags like <empty/>. |
| text : | Callback to invoke when some text is seen (text is always inside an element). Note that the text of an element may be spread over multiple calls of this function. If the G_MARKUP_TREAT_CDATA_AS_TEXT flag is set, this function is also called for the content of CDATA marked sections. |
| passthrough : | Callback to invoke for comments, processing instructions and doctype declarations; if you're re-writing the parsed document, write the passthrough text back out in the same position. If the G_MARKUP_TREAT_CDATA_AS_TEXT flag is not set, this function is also called for CDATA marked sections. |
| error : | Callback to invoke when an error occurs. |
gchar* g_markup_escape_text (const gchar *text, gssize length);
Escapes text so that the markup parser will parse it verbatim. Less than, greater than, ampersand, etc. are replaced with the corresponding entities. This function would typically be used when writing out a file to be parsed with the markup parser.
Note that this function doesn't protect whitespace and line endings from being processed according to the XML rules for normalization of line endings and attribute values.
| text : | some valid UTF-8 text |
| length : | length of text in bytes, or -1 if the text is nul-terminated |
| Returns : | a newly allocated string with the escaped text |
gchar* g_markup_printf_escaped (const char *format, ...);
Formats arguments according to format, escaping
all string and character arguments in the fashion
of g_markup_escape_text(). This is useful when you
want to insert literal strings into XML-style markup
output, without having to worry that the strings
might themselves contain markup.
const char *store = "Fortnum & Mason";
const char *item = "Tea";
char *output;
 
output = g_markup_printf_escaped ("<purchase>"
                                  "<store>%s</store>"
                                  "<item>%s</item>"
                                  "</purchase>",
                                  store, item);
| format : | printf() style format string |
| ... : | the arguments to insert in the format string |
| Returns : | newly allocated result from formatting operation. Free with g_free(). |
Since 2.4
gchar* g_markup_vprintf_escaped (const char *format, va_list args);
Formats the data in args according to format, escaping
all string and character arguments in the fashion
of g_markup_escape_text(). See g_markup_printf_escaped().
| format : | printf() style format string |
| args : | variable argument list, similar to vprintf() |
| Returns : | newly allocated result from formatting operation. Free with g_free(). |
Since 2.4
gboolean g_markup_parse_context_end_parse (GMarkupParseContext *context, GError **error);
Signals to the GMarkupParseContext that all data has been
fed into the parse context with g_markup_parse_context_parse().
This function reports an error if the document isn't complete,
for example if elements are still open.
| context : | a GMarkupParseContext |
| error : | return location for a GError |
| Returns : | TRUE on success, FALSE if an error was set |
void g_markup_parse_context_free (GMarkupParseContext *context);
Frees a GMarkupParseContext. Can't be called from inside one of the GMarkupParser functions. Can't be called while a subparser is pushed.
| context : | a GMarkupParseContext |
void g_markup_parse_context_get_position (GMarkupParseContext *context, gint *line_number, gint *char_number);
Retrieves the current line number and the number of the character on that line. Intended for use in error messages; there are no strict semantics for what constitutes the "current" line number other than "the best number we could come up with for error messages."
| context : | a GMarkupParseContext |
| line_number : | return location for a line number, or NULL |
| char_number : | return location for a char-on-line number, or NULL |
const gchar* g_markup_parse_context_get_element (GMarkupParseContext *context);
Retrieves the name of the currently open element.
If called from the start_element or end_element handlers this will
give the element_name as passed to those functions. For the parent
elements, see g_markup_parse_context_get_element_stack().
| context : | a GMarkupParseContext |
| Returns : | the name of the currently open element, or NULL |
Since 2.2
const GSList* g_markup_parse_context_get_element_stack (GMarkupParseContext *context);
Retrieves the element stack from the internal state of the parser.
The returned GSList is a list of strings where the first item is
the currently open tag (as would be returned by
g_markup_parse_context_get_element()) and the next item is its
immediate parent.
This function is intended to be used in the start_element and
end_element handlers where g_markup_parse_context_get_element()
would merely return the name of the element that is being
processed.
| context : | a GMarkupParseContext |
| Returns : | the element stack, which must not be modified |
Since 2.16
gpointer g_markup_parse_context_get_user_data (GMarkupParseContext *context);
Returns the user_data associated with context.  This will either
be the user_data that was provided to g_markup_parse_context_new()
or to the most recent call of g_markup_parse_context_push().
| context : | a GMarkupParseContext |
| Returns : | the provided user_data. The returned data belongs to the markup context and will be freed when g_markup_parse_context_free() is called. |
Since 2.18
GMarkupParseContext* g_markup_parse_context_new (const GMarkupParser *parser, GMarkupParseFlags flags, gpointer user_data, GDestroyNotify user_data_dnotify);
Creates a new parse context. A parse context is used to parse marked-up documents. You can feed any number of documents into a context, as long as no errors occur; once an error occurs, the parse context can't continue to parse text (you have to free it and create a new parse context).
| parser : | a GMarkupParser |
| flags : | one or more GMarkupParseFlags |
| user_data : | user data to pass to GMarkupParser functions |
| user_data_dnotify : | user data destroy notifier called when the parse context is freed |
| Returns : | a new GMarkupParseContext |
gboolean g_markup_parse_context_parse (GMarkupParseContext *context, const gchar *text, gssize text_len, GError **error);
Feed some data to the GMarkupParseContext. The data need not be valid UTF-8; an error will be signaled if it's invalid. The data need not be an entire document; you can feed a document into the parser incrementally, via multiple calls to this function. Typically, as you receive data from a network connection or file, you feed each received chunk of data into this function, aborting the process if an error occurs. Once an error is reported, no further data may be fed to the GMarkupParseContext; all errors are fatal.
| context : | a GMarkupParseContext |
| text : | chunk of text to parse |
| text_len : | length of text in bytes |
| error : | return location for a GError |
| Returns : | FALSE if an error occurred, TRUE on success |
void g_markup_parse_context_push (GMarkupParseContext *context, GMarkupParser *parser, gpointer user_data);
Temporarily redirects markup data to a sub-parser.
This function may only be called from the start_element handler of
a GMarkupParser.  It must be matched with a corresponding call to
g_markup_parse_context_pop() in the matching end_element handler
(except in the case that the parser aborts due to an error).
All tags, text and other data between the matching tags is
redirected to the subparser given by parser.  user_data is used
as the user_data for that parser.  user_data is also passed to the
error callback in the event that an error occurs.  This includes
errors that occur in subparsers of the subparser.
The end tag matching the start tag for which this call was made is
handled by the previous parser (which is given its own user_data)
which is why g_markup_parse_context_pop() is provided to allow "one
last access" to the user_data provided to this function.  In the
case of error, the user_data provided here is passed directly to
the error callback of the subparser and g_markup_parse_context_pop()
should not be called.  In either case, if user_data was allocated
then it ought to be freed from both of these locations.
This function is not intended to be directly called by users interested in invoking subparsers. Instead, it is intended to be used by the subparsers themselves to implement a higher-level interface.
As an example, see the following implementation of a simple parser that counts the number of tags encountered.
typedef struct
{
  gint tag_count;
} CounterData;
static void
counter_start_element (GMarkupParseContext  *context,
                       const gchar          *element_name,
                       const gchar         **attribute_names,
                       const gchar         **attribute_values,
                       gpointer              user_data,
                       GError              **error)
{
  CounterData *data = user_data;
  data->tag_count++;
}
static void
counter_error (GMarkupParseContext *context,
               GError              *error,
               gpointer             user_data)
{
  CounterData *data = user_data;
  g_slice_free (CounterData, data);
}
static GMarkupParser counter_subparser =
{
  counter_start_element,
  NULL,
  NULL,
  NULL,
  counter_error
};
In order to allow this parser to be easily used as a subparser, the following interface is provided:
void
start_counting (GMarkupParseContext *context)
{
  CounterData *data = g_slice_new (CounterData);
  data->tag_count = 0;
  g_markup_parse_context_push (context, &counter_subparser, data);
}
gint
end_counting (GMarkupParseContext *context)
{
  CounterData *data = g_markup_parse_context_pop (context);
  int result;
  result = data->tag_count;
  g_slice_free (CounterData, data);
  return result;
}
The subparser would then be used as follows:
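static void
start_element (GMarkupParseContext  *context,
               const gchar          *element_name,
               const gchar         **attribute_names,
               const gchar         **attribute_values,
               gpointer              user_data,
               GError              **error)
{
  if (strcmp (element_name, "count-these") == 0)
    start_counting (context);
  /* else, handle other tags... */
}
static void
end_element (GMarkupParseContext  *context,
             const gchar          *element_name,
             gpointer              user_data,
             GError              **error)
{
  if (strcmp (element_name, "count-these") == 0)
    g_print ("Counted %d tags\n", end_counting (context));
  /* else, handle other tags... */
}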
| context : | a GMarkupParseContext |
| parser : | a GMarkupParser |
| user_data : | user data to pass to GMarkupParser functions |
Since 2.18
gpointer g_markup_parse_context_pop (GMarkupParseContext *context);
Completes the process of a temporary sub-parser redirection.
This function exists to collect the user_data allocated by a
matching call to g_markup_parse_context_push().  It must be called
in the end_element handler corresponding to the start_element
handler during which g_markup_parse_context_push() was called.  You
must not call this function from the error callback -- the
user_data is provided directly to the callback in that case.
This function is not intended to be directly called by users interested in invoking subparsers. Instead, it is intended to be used by the subparsers themselves to implement a higher-level interface.
| context : | a GMarkupParseContext |
| Returns : | the user_data passed to g_markup_parse_context_push(). |
Since 2.18
typedef enum
{
  G_MARKUP_COLLECT_INVALID,
  G_MARKUP_COLLECT_STRING,
  G_MARKUP_COLLECT_STRDUP,
  G_MARKUP_COLLECT_BOOLEAN,
  G_MARKUP_COLLECT_TRISTATE,
  G_MARKUP_COLLECT_OPTIONAL = (1 << 16)
} GMarkupCollectType;
A mixed enumerated type and flags field.  You must specify one type
(string, strdup, boolean, tristate).  Additionally, you may
optionally bitwise OR the type with the flag
G_MARKUP_COLLECT_OPTIONAL.
It is likely that this enum will be extended in the future to support other types.
| G_MARKUP_COLLECT_INVALID | used to terminate the list of attributes to collect. |
| G_MARKUP_COLLECT_STRING | collect the string pointer directly from the attribute_values[] array. Expects a parameter of type (const char **). If G_MARKUP_COLLECT_OPTIONAL is specified and the attribute isn't present then the pointer will be set to NULL. |
| G_MARKUP_COLLECT_STRDUP | as with G_MARKUP_COLLECT_STRING, but expects a parameter of type (char **) and g_strdup()s the returned pointer. The pointer must be freed with g_free(). |
| G_MARKUP_COLLECT_BOOLEAN | expects a parameter of type (gboolean *) and parses the attribute value as a boolean. Sets FALSE if the attribute isn't present. Valid boolean values consist of (case insensitive) "false", "f", "no", "n", "0" and "true", "t", "yes", "y", "1". |
| G_MARKUP_COLLECT_TRISTATE | as with G_MARKUP_COLLECT_BOOLEAN, but in the case of a missing attribute a value is set that compares equal to neither FALSE nor TRUE. G_MARKUP_COLLECT_OPTIONAL is implied. |
| G_MARKUP_COLLECT_OPTIONAL | can be bitwise ORed with the other fields. If present, allows the attribute not to appear. A default value is set depending on what value type is used. |
gboolean g_markup_collect_attributes (const gchar *element_name, const gchar **attribute_names, const gchar **attribute_values, GError **error, GMarkupCollectType first_type, const gchar *first_attr, ...);
Collects the attributes of the element from the data passed to the GMarkupParser start_element function, dealing with common error conditions and supporting boolean values.
This utility function is not required to write a parser but can save a lot of typing.
The element_name, attribute_names,
attribute_values and error parameters passed
to the start_element callback should be passed
unmodified to this function.
Following these arguments is a list of
"supported" attributes to collect.  It is an
error to specify multiple attributes with the
same name.  If any attribute not in the list
appears in the attribute_names array then an
unknown attribute error will result.
The GMarkupCollectType field allows specifying the type of collection to perform and if a given attribute must appear or is optional.
The attribute name is simply the name of the attribute to collect.
The pointer should be of the appropriate type
(see the descriptions under
GMarkupCollectType) and may be NULL in case a
particular attribute is to be allowed but
ignored.
This function deals with issuing errors for missing attributes 
(of type G_MARKUP_ERROR_MISSING_ATTRIBUTE), unknown attributes 
(of type G_MARKUP_ERROR_UNKNOWN_ATTRIBUTE) and duplicate 
attributes (of type G_MARKUP_ERROR_INVALID_CONTENT) as well 
as parse errors for boolean-valued attributes (again of type
G_MARKUP_ERROR_INVALID_CONTENT). In all of these cases FALSE 
will be returned and error will be set as appropriate.
| element_name : | the current tag name |
| attribute_names : | the attribute names |
| attribute_values : | the attribute values |
| error : | a pointer to a GError or NULL |
| first_type : | the GMarkupCollectType of the first attribute |
| first_attr : | the name of the first attribute |
| ... : | a pointer to the storage location of the first attribute (or NULL), followed by more types, names and pointers, ending with G_MARKUP_COLLECT_INVALID. |
| Returns : | TRUE if successful |
Since 2.16