2 Parsing VRML97 and VRML-encoded X3D

OpenVRML includes parsers for reading VRML97 and VRML-encoded X3D. Prior to version 0.17.0, users wanting to use OpenVRML to load VRML or X3D had to load the file into an openvrml::browser and traverse the runtime's scene hierarchy. While this approach is still possible, OpenVRML now provides parsers that can be used separately from openvrml::browser. Using these parsers directly is more efficient since the OpenVRML runtime machinery doesn't need to be started. And it many cases, it will also be easier to implement. These parsers are implemented using the Spirit parsing framework and have the following features:

2.1 Using Spirit parsers

Spirit is a framework of C++ templates that facilitates parser creation directly in C++ using a BNF-like syntax. Using Spirit, you can create simple parsers for fragments of syntax and combine them to create sophisticated parsers for complete grammars. Such grammar parsers are typically packaged in a class (or class template) called a “grammar capsule”. openvrml::vrml97_grammar and openvrml::x3d_vrml_grammar are such grammar capsules.

Grammar capsules are a particular form of Spirit parser. A Spirit parser is a protocol (or “concept”); that is, it is a set of lexical requirements to which a type must conform. Those requirements are detailed in the Spirit documentation; for our purposes, what's important is that Spirit parsers are things that can be used with Spirit's parse function template:

 namespace boost {
   namespace spirit {
     namespace classic {
       template <typename IteratorT, typename ParserT, typename SkipT>
       parse_info<IteratorT>
       parse(const IteratorT &       first,
             const IteratorT &       last,
             const parser<ParserT> & p,
             const parser<SkipT> &   skip);
     }
   }
 }

In a pattern that should be familiar to users of the C++ standard library, parse accepts a pair of iterators that denote the range of data to operate on. This function is used as follows:

 using boost::spirit::classic::parse;

 std::vector<char> my_data;
 my_skip_parser my_skip;
 my_grammar my_g;

 if (!parse(my_data.begin(), my_data.end(), my_g, my_skip).full) {
     return EXIT_FAILURE;
 }

In the above example, we are checking the full member of the parse_info<IteratorT> struct that parse returns; full is true in the case of a “full match”, and false otherwise.

The other part of this example which we've glossed over until now is the “skip parser”, my_skip. Spirit uses the skip parser to identify text that should be discarded prior to parsing: typically, whitespace and comments. openvrml::vrml97_skip_grammar is a skip parser appropriate for use with VRML97 and Classic VRML X3D.

Note:
X3D 3.2 introduces a block comment syntax to the Classic VRML encoding. OpenVRML does not yet include a skip parser capable of parsing these comments.

2.2 position_iterator

Substituting a few elements of the example, we're well on our way to something that actually parses VRML:

 using boost::spirit::classic::parse;

 std::string my_data = "#VRML 2.0 utf8\n"
                       "Group {}\n";
 openvrml::vrml97_skip_grammar my_skip;
 openvrml::vrml97_grammar<> my_g;

 if (!parse(my_data.begin(), my_data.end(), my_g, my_skip).full) {
     return EXIT_FAILURE;
 }

There's just one more critical thing missing from the above example. This code is using the default error handler. (Error handlers are covered in detail in 2.5 Error handling.) The default error handler requires that the iterators used for parsing be position_iterators.

The position_iterator keeps track of positional information; and the default error handler uses this information to emit error messages that include the file name and line and column numbers. The position_iterator works as an adapter for an existing iterator type:

 # include <openvrml/vrml97_grammar.h>

 int main()
 {
     using boost::spirit::classic::parse;
     using boost::spirit::classic::position_iterator;

     std::string my_data = "#VRML 2.0 utf8\n"
                           "Group {}\n";
     openvrml::vrml97_skip_grammar my_skip;
     openvrml::vrml97_grammar<> my_g;

     typedef position_iterator<std::string::iterator> iterator_t;

     iterator_t first(my_data.begin(), my_data.end(), "vrmlstring"), last;

     if (!parse(first, last, my_g, my_skip).full) {
         return EXIT_FAILURE;
     }
     return EXIT_SUCCESS;
 }

The beginning position_iterator takes three arguments: the first two are the beginning and ending underlying iterators; the last is a string that will be used as the file name. Since we aren't parsing an actual file here, "vrmlstring" is used for the third argument.

The ending position_iterator is simply a default-constructed one.

2.3 Parsing streams

The previous examples demonstrated parsing text in a std::vector or a std::string; and while such an approach is sometimes useful, more commonly one is interested in parsing the contents of a stream.

Due to Spirit's requirements to support backtracking, it is not possible simply to hand the parse function an input iterator like std::istreambuf_iterator. Instead, Spirit provides an iterator adapter called multi_pass that can work with such iterators.

 # include <openvrml/vrml97_grammar.h>

 int main()
 {
     using std::istreambuf_iterator;
     using boost::spirit::classic::parse;
     using boost::spirit::classic::position_iterator;
     using boost::spirit::classic::multi_pass;
     using boost::spirit::classic::make_multi_pass;

     typedef multi_pass<istreambuf_iterator<char> > multi_pass_iterator_t;

     multi_pass_iterator_t
         in_begin(make_multi_pass(istreambuf_iterator<char>(std::cin))),
         in_end(make_multi_pass(istreambuf_iterator<char>()));

     typedef position_iterator<multi_pass_iterator_t> iterator_t;

     iterator_t first(in_begin, in_end, "<stdin>"), last;

     openvrml::vrml97_skip_grammar my_skip;
     openvrml::vrml97_grammar<> my_g;

     if (!parse(first, last, my_g, my_skip).full) {
         return EXIT_FAILURE;
     }
     return EXIT_SUCCESS;
 }

Here, we wrap an istreambuf_iterator in a multi_pass iterator, and then wrap that in a position_iterator. Once we have the position_iterators, we pass them to parse exactly as before.

The above example, when compiled, will parse VRML97 from standard input.

2.4 Semantic actions

Semantic actions are how the parsing process delivers data to user code.

openvrml::vrml97_grammar and openvrml::x3d_vrml_grammar are in fact class templates; though in the examples up to now we've simply relied on the default values for their template arguments. These templates take two arguments: Actions and ErrorHandler. The second of these arguments is addressed in the next section, 2.5 Error handling. The first is an object the bundles the semantic actions to be executed during the parse.

Actions is a struct that contains semantic actions in the form of function objects. The default value of the Actions template parameter is openvrml::null_vrml97_parse_actions for openvrml::vrml97_grammar, and openvrml::null_x3d_vrml_parse_actions for openvrml::x3d_vrml_grammar. These “null” parse actions structs provide a no-op function for each semantic action invoked by the respective grammar. You can inherit these in your own semantic actions struct to avoid having to define a function object for every action yourself.

The following example expands on our existing one to add a semantic action that simply prints the node types of encountered nodes to the standard output.

 # include <iostream>
 # include <openvrml/vrml97_grammar.h>

 struct actions : openvrml::null_vrml97_parse_actions {
     struct on_node_start_t {
         void operator()(const std::string & node_name_id,
                         const std::string & node_type_id) const
         {
             std::cout << node_type_id << std::endl;
         }
     } on_node_start;
 };

 int main()
 {
     using std::istreambuf_iterator;
     using boost::spirit::classic::parse;
     using boost::spirit::classic::position_iterator;
     using boost::spirit::classic::multi_pass;
     using boost::spirit::classic::make_multi_pass;

     typedef multi_pass<istreambuf_iterator<char> > multi_pass_iterator_t;

     multi_pass_iterator_t
         in_begin(make_multi_pass(istreambuf_iterator<char>(std::cin))),
         in_end(make_multi_pass(istreambuf_iterator<char>()));

     typedef position_iterator<multi_pass_iterator_t> iterator_t;

     iterator_t first(in_begin, in_end, "<stdin>"), last;

     openvrml::vrml97_skip_grammar my_skip;
     actions my_actions;
     openvrml::vrml97_grammar<actions> my_g(my_actions);

     if (!parse(first, last, my_g, my_skip).full) {
         return EXIT_FAILURE;
     }
     return EXIT_SUCCESS;
 }

The pretty_print.cpp example program provides a more thorough demonstration of the semantic action facility.

2.5 Error handling

By default, the parsers emit error and warning messages to stderr. Where messages are printed (if they are printed at all) and how the parsers respond to errors is controlled using a Spirit error handler.

As mentioned in the previous section, the second template argument to the grammar class templates is an ErrorHandler. The ErrorHandler is a function object of the form:

 struct error_handler {
     template <typename ScannerT, typename ErrorT>
     boost::spirit::error_status<> operator()(ScannerT, ErrorT) const
     {
     ...
     }
 };

openvrml::vrml97_grammar defaults to using openvrml::vrml97_parse_error_handler; and openvrml::x3d_vrml_grammar defaults to using openvrml::x3d_vrml_parse_error_handler. For most common cases, these types will be quite sufficient. Their constructors take an std::ostream; so, to emit output somewhere other than stderr, one need only construct an instance of one of these error handlers with the desired std::ostream and pass that to the grammar capsule constructor.

2.6 Further information

The objective of this portion of OpenVRML's manual has been to provide a high-level view of parsing with the provided Spirit grammars. There is a great deal of depth and flexibility built into the Spirit framework; and while exploring that is beyond the scope of this manual, the Spirit User's Guide is an excellent resource.