15.1    Character Set

(1)   The only characters allowed outside of comments are the graphic_characters and format_effectors.

Syntax

character ::= graphic_character | format_effector 
            | other_control_character
 
graphic_character ::= identifier_letter | digit | space_character 
                    | special_character
 

Semantics

(2)   The character repertoire for the text of an AADL specification consists of the characters of the Basic Multilingual Plane (BMP) of the ISO 10646 Universal Multiple-Octet Coded Character Set, plus a set of format_effectors and, in comments only, a set of other_control_characters. The coded representation for these characters is implementation defined; it need not be a representation defined within ISO 10646-1.

(3)   The description of the language definition in this standard uses the graphic symbols defined for Row 00: Basic Latin and Row 00: Latin-1 Supplement of the ISO 10646 BMP. These correspond to the graphic symbols of ISO 8859-1 (Latin-1). No graphic symbols are used in this standard for characters outside of Row 00 of the BMP. The actual set of graphic symbols used by an implementation for the visual representation of the text of an AADL specification is not specified.

(4)   The categories of characters are defined as follows:

identifier_letter

upper_case_identifier_letter | lower_case_identifier_letter

upper_case_identifier_letter

Any character of Row 00 of ISO 10646 BMP whose name begins “Latin Capital Letter”.

lower_case_identifier_letter

Any character of Row 00 of ISO 10646 BMP whose name begins “Latin Small Letter”.

digit

One of the characters 0, 1, 2, 3, 4, 5, 6, 7, 8, or 9.

space_character

The character of ISO 10646 BMP named “Space”.

special_character

Any character of the ISO 10646 BMP that is not reserved for a control function, and is not the space_character, an identifier_letter, or a digit.

format_effector

The control functions of ISO 6429 called character tabulation (HT), line tabulation (VT), carriage return (CR), line feed (LF), and form feed (FF).

other_control_character

Any control character, other than a format_effector, that is allowed in a comment; the set of other_control_characters allowed in comments is implementation defined.
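The category definitions above can be sketched as a small classifier. This is an illustrative sketch only, not part of the standard: the function name `classify` and the constant `FORMAT_EFFECTORS` are hypothetical, and the sketch assumes that the character names reported by Python's `unicodedata` module match the ISO 10646 names referred to in the definitions.

```python
import unicodedata

# Format effectors named in this section: HT, LF, VT, FF, CR.
FORMAT_EFFECTORS = {0x09, 0x0A, 0x0B, 0x0C, 0x0D}


def classify(ch):
    """Return the category name for one character, or None if it lies
    outside the repertoire (FFFE, FFFF, or beyond the BMP)."""
    cp = ord(ch)
    if cp >= 0xFFFE:
        return None
    if cp in FORMAT_EFFECTORS:
        return "format_effector"
    if cp <= 0x1F or 0x7F <= cp <= 0x9F:
        # Whether a particular control character is accepted in comments
        # is implementation defined; this sketch only names the category.
        return "other_control_character"
    if cp == 0x20:
        return "space_character"
    if ch in "0123456789":
        return "digit"
    if cp <= 0xFF:  # identifier letters are restricted to Row 00
        name = unicodedata.name(ch, "")
        if name.startswith("LATIN CAPITAL LETTER"):
            return "upper_case_identifier_letter"
        if name.startswith("LATIN SMALL LETTER"):
            return "lower_case_identifier_letter"
    # Every remaining BMP position is a special_character.
    return "special_character"
```

Under these assumptions, a Latin letter outside Row 00 (for example U+0100) is classified as a special_character rather than an identifier_letter, because the letter definitions are limited to Row 00.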

(5)   The following names are used when referring to certain special_characters:

Symbol   Name                  Symbol   Name
"        quotation mark        :        colon
#        number sign           ;        semicolon
=        equals sign           (        left parenthesis
)        right parenthesis     _        underline
+        plus sign             [        left square bracket
,        comma                 ]        right square bracket
-        minus                 {        left curly bracket
.        dot                   }        right curly bracket

Implementation Permissions

(6)   In a nonstandard mode, the implementation may support a different character repertoire; in particular, the set of symbols that are considered identifier_letters can be extended or changed to conform to local conventions.

NOTES:

Every code position of ISO 10646 BMP that is not reserved for a control function is defined to be a graphic_character by this standard. This includes all code positions other than 0000 - 001F, 007F - 009F, and FFFE - FFFF.
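The ranges in this note translate directly into a range test. The following is a minimal sketch; the function name `is_graphic_code_position` is hypothetical and the code takes the three excluded ranges from the note at face value.

```python
def is_graphic_code_position(cp):
    """True if BMP code position cp is a graphic_character per the note:
    every position except 0000-001F, 007F-009F, and FFFE-FFFF."""
    if not 0 <= cp <= 0xFFFF:
        return False  # outside the BMP, so outside the repertoire
    excluded = cp <= 0x1F or 0x7F <= cp <= 0x9F or cp >= 0xFFFE
    return not excluded


# 32 + 33 + 2 = 67 positions are excluded, leaving 65536 - 67 = 65469.
graphic_count = sum(is_graphic_code_position(cp) for cp in range(0x10000))
```

Note that the format_effectors (HT, LF, VT, FF, CR) fall inside the excluded range 0000-001F: they are permitted outside comments, but they are not graphic_characters.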