The CEDAR priorities for HepML are:
- To develop a standard XML format for the parameters of Monte Carlo event generators. This should allow the state of a generator to be configured, catalogued and reproduced. See GeneratorSchema for more details.
- To develop an XML format for I/O of HepData experimental data records. See HepDataSchema for more details.
In developing these we have made some UtilitySchemas, which may be used by both the HepDataSchema and the GeneratorSchema. There is some duplication here with schemas developed under the LCG HepML namespace, and we will work to bring these into line in a common set of schemas.
In addition to a format definition, the schema provides optional elementary data validation, e.g.
- Enumeration constraints applied to the value of an attribute, for example, units quoted for a measurement of Energy should be GeV, MeV, eV, etc.
- Uniqueness constraints; for example, a <code>generatorSettings</code> element should contain only one parameter of a given type with a given name.
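The two kinds of constraint above can be illustrated with a minimal Python sketch. It does not use the HepML schemas themselves; it only mimics the checks that such a schema would express. Apart from `generatorSettings`, every element and attribute name here (`parameter`, `type`, `name`, `value`, `energy`, `units`) and the sample parameter values are assumptions made for illustration, and the set of allowed units is just the subset named in the text.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical fragment: only generatorSettings comes from the text;
# the other element/attribute names are assumed for this illustration.
DOC = """
<generatorSettings>
  <parameter type="real" name="PARP(67)" value="4.0"/>
  <parameter type="real" name="PARP(82)" value="2.0"/>
  <energy units="GeV">1960</energy>
</generatorSettings>
"""

# Illustrative subset of the "GeV, MeV, eV, etc." enumeration.
ALLOWED_ENERGY_UNITS = {"GeV", "MeV", "eV"}


def check(xml_text):
    """Return a list of constraint violations (empty if there are none)."""
    root = ET.fromstring(xml_text)
    problems = []

    # Enumeration constraint: the units attribute must be an allowed value.
    for energy in root.iter("energy"):
        units = energy.get("units")
        if units not in ALLOWED_ENERGY_UNITS:
            problems.append(f"energy units {units!r} not allowed")

    # Uniqueness constraint: at most one parameter per (type, name) pair.
    counts = Counter(
        (p.get("type"), p.get("name")) for p in root.iter("parameter")
    )
    for key, n in counts.items():
        if n > 1:
            problems.append(f"parameter {key} appears {n} times")

    return problems
```

Calling `check(DOC)` on the fragment above returns an empty list; changing the units to an unlisted value, or giving both parameters the same type and name, produces one reported problem each. In a real schema these rules would be declared in the schema and enforced by a validating parser rather than hand-coded.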
The whole schema set will be developed in tandem with other HepML schemas, such as those for Les Houches accord information exchange between generators, generator event records and other utilities. The central documentation for this is on these CERN wiki pages.
The following diagram illustrates the current structure of the CEDAR HepML schema and how it is used to define, for example, generator-specific schemas (diagram not yet updated for merger with LCG).
The top-level schema hepml.xsd defines separate namespaces under which independent development can take place. It is planned to create a central namespace where stable and widely agreed schemas can eventually be placed.
The HepML schema can be used on its own to validate a HepML document, or it can be used to derive restricted schemas. Each restricted schema is a narrower version of the HepML schema and is used to validate the particular type of HepML document it was designed for.
The Pythia generator is always run with one random number seed. This is specified using one instance of the "seed" element in the Pythia HepML output file (see, for example, pythia6.xml). The Herwig generator can be run with up to two random seeds. This is specified using two instances of the "seed" element in the Herwig HepML output file (see herwig.xml). The top-level HepML schema (hepml.xsd) specifies that the output HepML file should contain a maximum of two "seed" elements. Therefore, with respect to the "seed" element, both the Pythia and Herwig output files will validate against the top-level HepML schema.
However, it may be desirable to validate the Pythia output file against a schema which explicitly requires that there is only one instance of the "seed" element. This is done by defining a new schema (for example pythia6.xsd) which inherits all the definitions from the top-level HepML schema (hepml.xsd) but in addition restricts the "seed" element to just one occurrence. In the same way, a Herwig-specific schema (for example fherwig.xsd) can be created to restrict the "seed" element to at most two occurrences.
A Pythia HepML file will validate against the Pythia schema but not the Herwig schema. Similarly, a Herwig HepML output file will validate against the Herwig schema but not the Pythia schema. However, both will validate against the HepML schema.
Any HepML document will always validate against the top-level HepML schema, since the restricted schemas contain only restrictions on elements which are already defined in the top-level HepML schema. Any extensions are made by adding to the current subschemas or by defining a new subschema.
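The seed example can be sketched in Python as a plain occurrence-count check. This is not schema validation and does not read the actual .xsd files; it only models the number of "seed" elements each schema is described as allowing. The lower bound of zero for hepml.xsd and fherwig.xsd, the reading of "just one" as exactly one for pythia6.xsd, and the contents of the two sample documents are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# (min, max) number of <seed> elements per schema, following the text.
# The text gives only upper bounds for hepml.xsd and fherwig.xsd,
# so a lower bound of 0 is assumed here.
SEED_LIMITS = {
    "hepml.xsd": (0, 2),    # top-level schema: at most two seeds
    "pythia6.xsd": (1, 1),  # restricted: assumed to mean exactly one seed
    "fherwig.xsd": (0, 2),  # restricted: up to two seeds
}

# Hypothetical minimal documents; real HepML files contain much more.
PYTHIA_DOC = "<hepml><seed>12345</seed></hepml>"
HERWIG_DOC = "<hepml><seed>111</seed><seed>222</seed></hepml>"


def seed_count_ok(xml_text, schema_name):
    """Check only the seed-count rule of the named schema."""
    root = ET.fromstring(xml_text)
    n_seeds = sum(1 for _ in root.iter("seed"))
    lowest, highest = SEED_LIMITS[schema_name]
    return lowest <= n_seeds <= highest
```

With these assumptions, both sample documents satisfy the hepml.xsd limit, while the two-seed Herwig document fails the pythia6.xsd limit. The sketch models the seed count only, so it does not capture any other generator-specific restrictions that a real restricted schema may impose.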
A brief discussion of the various schemas and subschemas is now given.
HepML top-level schema
Here is the XML schema defining the top-level hepml element and its possible contents:
See the discussion about reactions and particles in HepML at
XML Schema Validation
One quick way to validate an XML document against a schema is to download the Xerces2 Java-based XML parser and run the dom.Writer class from the command line. There is also a C++-based Xerces XML parser.
A simpler, and equally quick, way is to use a good XML editor such as Oxygen.
HepML in action
The links below show where HepML is currently used in the HEP community:
- HZSteer: The generator parameters for Herwig and Pythia can be written in HepML, using the routines HZHRWXML and HZPYTXML
- Jetweb: Generator information is read and processed using HepML. The relevant classes are in cedar.jetweb.xml (they also do AIDA I/O for histograms).
Working on the HepML project
If you have suggestions for enhancing the HepML project, or are interested in working on the project itself, please feel free to contact the HepML project team at hepml@….
CEDAR HepML and LCG HepML Namespaces
Currently these two namespaces are independent under a top-level schema. As things progress, the LCG and CEDAR developers will attempt to make use of commonalities and remove duplication as and when appropriate.