Conditional processing (profiling)

Conditional processing, also known as profiling, is the filtering or flagging of information based on processing-time criteria.

DITA aims to implement conditional processing in a semantically meaningful way. Rather than allowing arbitrary values, meaningful only to the original author, to accumulate over time in a general-purpose processing attribute, DITA encourages authors to record metadata in specific metadata attributes on content. Any number of processes, including filtering, flagging, search, and indexing, can then use these metadata values; they are not limited to filtering.

Conditional processing attributes

For a topic or topicref, the audience, platform, and product metadata can be expressed either with attributes on the topic or topicref element or with elements within the topic prolog or topicmeta element. The metadata elements are more expressive, but the values mean the same thing in both places, so the two forms can be used in coordination: for example, the prolog elements can fully define the audiences for a topic, and then metadata attributes can be used within the content to identify parts that apply to only some of those audiences.

audience
The values from the enumerated attributes of the audience metadata element have the same meaning when used in the audience attribute of a content element. For instance, the "user" value has the same meaning whether it appears in the type attribute of the audience element for a topic or in the audience attribute of a content element. This principle applies to the type, job, and experiencelevel attributes of the audience element.

The values in the audience attribute may also be used to reference a more complete description of an audience in an audience element. To refer to the same audience from an audience attribute, use the value of the name attribute of that audience element.

The audience attribute takes a blank-delimited list of values, which may or may not match the name value of any audience elements.

platform
The platform might be the operating system, hardware, or other environment. This attribute is equivalent to the platform element for the topic metadata.

The platform attribute takes a blank-delimited list of values, which may or may not match the content of a platform element in the prolog.

product
The product or component name, version, brand, or internal code or number. This attribute is equivalent to the prodinfo element for the topic metadata.

The product attribute takes a blank-delimited list of values, which may or may not match the value of the prodname element in the prolog.

rev
The identifier for the revision level. For example, if a paragraph was changed or added during revision 1.1, the rev attribute might contain the value "1.1".

otherprops
A catchall for metadata qualification values about the content. This attribute is equivalent to the othermeta element for the topic metadata.

The attribute takes a blank-delimited list of values, which may or may not match the values of othermeta elements in the prolog.

For example, a simple otherprops value list: <codeblock otherprops="java cpp">

The attribute can also take labelled groups of values, but this syntax is deprecated in DITA 1.1 in favor of attribute specialization. The labelled group syntax is similar to the generalized attribute syntax and may cause confusion for processors. A labelled group consists of a string label, followed by an open parenthesis, one or more blank-delimited values, and a close parenthesis.

The simple format is sufficient when an information set requires only one metadata axis beyond the base metadata attributes of product, platform, and audience. The labelled group format, like attribute specialization, allows two or more additional metadata axes.

For example, a complex otherprops value list: <codeblock otherprops="proglang(java cpp) commentformat(javadoc html)">
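The two otherprops formats can be read with a small tokenizer. The following Python sketch is illustrative only and is not part of DITA: the function name parse_otherprops and the default axis name "otherprops" used for ungrouped values are assumptions made for this example.

```python
import re

# Matches either a labelled group such as "proglang(java cpp)" or a bare token.
_TOKEN = re.compile(
    r"(?P<label>[^\s()]+)\((?P<values>[^()]*)\)|(?P<bare>[^\s()]+)"
)


def parse_otherprops(value, default_axis="otherprops"):
    """Return a dict mapping each metadata axis to its list of values.

    Bare tokens are filed under default_axis (a hypothetical name chosen
    for this sketch); labelled groups are filed under their own label.
    """
    axes = {}
    for match in _TOKEN.finditer(value):
        if match.group("label") is not None:
            values = match.group("values").split()
            axes.setdefault(match.group("label"), []).extend(values)
        else:
            axes.setdefault(default_axis, []).append(match.group("bare"))
    return axes


simple = parse_otherprops("java cpp")
grouped = parse_otherprops("proglang(java cpp) commentformat(javadoc html)")
print(simple)
print(grouped)
```

Under these assumptions the simple list yields a single axis, while the labelled groups yield one axis per label, which is the sense in which the grouped format provides additional metadata axes.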

props
A generic attribute for conditional processing values. In DITA 1.1, the props attribute can be specialized to create new conditional processing attributes.

Using metadata attributes

Each attribute takes zero or more space-delimited string values. For example, you can use the product attribute to identify that an element applies to two particular products.

Figure 1. Example source
<p audience="administrator">Set the configuration options:
 <ul>
  <li product="extendedprod">Set foo to bar</li>
  <li product="basicprod extendedprod">Set your blink rate</li>
  <li>Do some other stuff</li>
  <li platform="Linux">Do a special thing for Linux</li>
 </ul>
</p>
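As an illustration of how a process might read these space-delimited values, the following Python sketch parses the example source with the standard xml.etree.ElementTree module and lists the conditions on each element. The helper name conditions and the tuple CONDITIONAL_ATTRIBUTES are hypothetical names for this example, not part of any DITA processor.

```python
import xml.etree.ElementTree as ET

SOURCE = """<p audience="administrator">Set the configuration options:
 <ul>
  <li product="extendedprod">Set foo to bar</li>
  <li product="basicprod extendedprod">Set your blink rate</li>
  <li>Do some other stuff</li>
  <li platform="Linux">Do a special thing for Linux</li>
 </ul>
</p>"""

# The attributes described in this section.
CONDITIONAL_ATTRIBUTES = (
    "audience", "platform", "product", "rev", "otherprops", "props",
)


def conditions(element):
    """Return {attribute: [values]} for the attributes set on element."""
    return {
        name: element.get(name).split()
        for name in CONDITIONAL_ATTRIBUTES
        if element.get(name) is not None
    }


root = ET.fromstring(SOURCE)
for element in root.iter():
    print(element.tag, conditions(element))
```

An element with none of these attributes, such as the third list item, yields an empty dictionary and is therefore unconditional.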

Processing metadata attributes

At processing time, you specify the values you want to exclude and the values you want to flag using a conditional processing profile (described in the DITA Language Specification). For example, a publisher producing information for a mixed audience using the basic product could choose to flag information that applies to administrators, and exclude information that applies to the extended product, and express those choices in a conditional processing profile like this:
<prop att="audience" val="administrator" action="flag">
  <startflag>ADMIN</startflag>
</prop>
<prop att="product"  val="extendedprod"  action="exclude"/>

At output time, the paragraph is flagged and the first list item is excluded, since it applies only to extendedprod. The second list item is still included: although it applies to extendedprod, it also applies to basicprod, which was not excluded.

The result should look something like:
ADMIN Set the configuration options:
  • Set your blink rate
  • Do some other stuff
  • Do a special thing for Linux
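A process needs the profile in a form it can look up quickly. The following Python sketch, offered as one possible approach, loads the prop elements shown above into two dictionaries. The function name load_profile is hypothetical, and wrapping the fragment in a val root element is an assumption made so that it parses as a single XML document.

```python
import xml.etree.ElementTree as ET

# The two prop elements from the profile above, wrapped in a <val> root
# (an assumption) so that the fragment parses as one XML document.
PROFILE = """<val>
  <prop att="audience" val="administrator" action="flag">
    <startflag>ADMIN</startflag>
  </prop>
  <prop att="product" val="extendedprod" action="exclude"/>
</val>"""


def load_profile(xml_text):
    """Return ({(att, val): action}, {(att, val): flag text})."""
    actions, flag_text = {}, {}
    for prop in ET.fromstring(xml_text).iter("prop"):
        key = (prop.get("att"), prop.get("val"))
        actions[key] = prop.get("action")
        start = prop.find("startflag")
        if start is not None and start.text:
            flag_text[key] = start.text.strip()
    return actions, flag_text


actions, flag_text = load_profile(PROFILE)
print(actions)
print(flag_text)
```

Keying both dictionaries on the attribute and value pair keeps a value such as "administrator" on one attribute distinct from the same string on another.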

Filtering logic

When deciding whether to exclude a particular element, a process should evaluate each attribute, and then evaluate the set of attributes:
  • If all the values in an attribute have been set to "exclude", the attribute evaluates to "exclude".
  • If any of the attributes evaluates to "exclude", the element is excluded.
For example, if a paragraph applies to three products and the publisher has chosen to exclude all of them, the process should exclude the paragraph, even if the paragraph applies to an audience or platform that is not being excluded. But if the paragraph also applies to an additional product that has not been excluded, its content is still relevant for the intended output and should be preserved.
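The two rules above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names and the product values prodA through prodD are hypothetical, and treating an attribute with no values as not excluded is an assumption of this sketch.

```python
def attribute_excluded(values, excluded_values):
    """An attribute evaluates to "exclude" only if every value is excluded.

    Assumption of this sketch: an attribute with no values is not excluded.
    """
    return bool(values) and all(v in excluded_values for v in values)


def element_excluded(conditions, excluded):
    """Apply the per-attribute rule, then the any-attribute rule.

    conditions: {attribute: [values]} as found on the element.
    excluded:   {attribute: set of values the publisher chose to exclude}.
    """
    return any(
        attribute_excluded(values, excluded.get(attribute, set()))
        for attribute, values in conditions.items()
    )


excluded = {"product": {"prodA", "prodB", "prodC"}}

# All three products are excluded, so the paragraph is excluded
# regardless of its audience.
all_excluded = {"product": ["prodA", "prodB", "prodC"], "audience": ["novice"]}
# A fourth product that was not excluded keeps the paragraph in the output.
one_kept = {"product": ["prodA", "prodB", "prodC", "prodD"]}

print(element_excluded(all_excluded, excluded))  # True
print(element_excluded(one_kept, excluded))      # False
```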

Flagging logic

When deciding whether to flag a particular element, a process should evaluate each value. Wherever a value that has been set as flagged appears in its attribute (for example, audience="administrator"), the process should add the flag. When multiple flags apply to a single element, multiple flags should be output, typically in the order they are encountered.

Flagging can be done using text (for example, bold text against a colored background) or using images. When the same element evaluates as both flagged and filtered (for example, flagged because of an audience attribute value and filtered because of its product attribute values), the element should be filtered out.
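The flagging rules, including the precedence of filtering over flagging, can be sketched in Python as follows. The function name evaluate, the flag on platform="Linux", and the flag text "LINUX" are hypothetical additions for illustration; only the administrator flag and the extendedprod exclusion come from the earlier example.

```python
def evaluate(conditions, actions, flag_text):
    """Return None if the element is filtered out, else its list of flags.

    conditions: list of (attribute, [values]) pairs in document order.
    actions:    {(attribute, value): "flag" or "exclude"}.
    flag_text:  {(attribute, value): text to output as the flag}.
    """
    flags = []
    for attribute, values in conditions:
        # Filtering takes precedence: an attribute whose values are all
        # excluded removes the element, whatever flags were collected.
        if values and all(
            actions.get((attribute, v)) == "exclude" for v in values
        ):
            return None
        for value in values:
            if actions.get((attribute, value)) == "flag":
                flags.append(flag_text.get((attribute, value), value))
    return flags


actions = {
    ("audience", "administrator"): "flag",
    ("platform", "Linux"): "flag",  # hypothetical second flag
    ("product", "extendedprod"): "exclude",
}
flag_text = {
    ("audience", "administrator"): "ADMIN",
    ("platform", "Linux"): "LINUX",  # hypothetical flag text
}

both_flags = evaluate(
    [("audience", ["administrator"]), ("platform", ["Linux"])],
    actions, flag_text,
)
filtered = evaluate(
    [("audience", ["administrator"]), ("product", ["extendedprod"])],
    actions, flag_text,
)
print(both_flags)  # ['ADMIN', 'LINUX']
print(filtered)    # None
```

Returning None for a filtered element, rather than an empty list, keeps "excluded" distinct from "included with no flags".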