Bugs in YAML specification
Bugs in the Type Library
!!float: The regular expression also matches a single dot ".", which is not a valid floating-point number.
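The problem is easy to reproduce. The sketch below uses the base-10 alternative of the !!float pattern as I recall it from the YAML 1.1 type repository; the exact pattern is an assumption and should be checked against the type library. The names FLOAT_BASE10 and looks_like_float are only for illustration.

```python
import re

# Assumed base-10 alternative of the !!float pattern (YAML 1.1 type repository).
# Verify against the type library before relying on it.
FLOAT_BASE10 = re.compile(r"[-+]?([0-9][0-9_]*)?\.[0-9.]*([eE][-+][0-9]+)?")


def looks_like_float(text):
    """Return True if the whole string matches the assumed pattern."""
    return FLOAT_BASE10.fullmatch(text) is not None


if __name__ == "__main__":
    # The integer part is optional and the fraction may be empty,
    # so a lone "." (and even "...") is accepted.
    for candidate in ["1.5", "685_230.15", ".", "..."]:
        print(repr(candidate), looks_like_float(candidate))
```

Requiring at least one digit on one side of the dot would probably be enough to reject ".".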
Bugs in Examples
Example 2.17. Hex: decimal instead of hex format
hexesc: "\x13\x10 is \r\n"
Correct:
hexesc: "\x0D\x0A is \r\n"
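The mix-up is between decimal and hexadecimal code points: carriage return and line feed are 13 and 10 in decimal, which is 0D and 0A in hex. A small Python check, where hex_escape is a hypothetical helper written only for this illustration:

```python
# Carriage return and line feed as code points.
CR, LF = ord("\r"), ord("\n")


def hex_escape(code_point):
    """Format a code point below 256 as a \\xNN escape with two hex digits."""
    return "\\x{:02X}".format(code_point)


if __name__ == "__main__":
    print(CR, LF)                          # decimal values
    print(hex_escape(CR), hex_escape(LF))  # hex escapes for the same characters
```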
Example 2.19. Integers: invalid comma in
decimal: +12,345
Correct:
decimal: +12_345
Example 2.20. Floating Point: invalid comma in
fixed: 1,230.15
Correct:
fixed: 1_230.15
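Both corrections can be checked against the type library patterns. The sketch below assumes the base-10 !!int and !!float patterns as I recall them from the YAML 1.1 type repository; the patterns and the helper names resolves_as and to_number are assumptions made for illustration.

```python
import re

# Assumed base-10 patterns from the YAML 1.1 type repository.
INT_BASE10 = re.compile(r"[-+]?(0|[1-9][0-9_]*)")
FLOAT_BASE10 = re.compile(r"[-+]?([0-9][0-9_]*)?\.[0-9.]*([eE][-+][0-9]+)?")


def resolves_as(pattern, text):
    """Return True if the whole scalar matches the given pattern."""
    return pattern.fullmatch(text) is not None


def to_number(text):
    """Convert a matched scalar to a Python number, dropping "_" separators."""
    cleaned = text.replace("_", "")
    return float(cleaned) if "." in cleaned else int(cleaned)


if __name__ == "__main__":
    for pattern, scalar in [
        (INT_BASE10, "+12,345"),
        (INT_BASE10, "+12_345"),
        (FLOAT_BASE10, "1,230.15"),
        (FLOAT_BASE10, "1_230.15"),
    ]:
        print(scalar, resolves_as(pattern, scalar))
```

Under these patterns the comma forms stay plain strings, while the underscore forms resolve to numbers.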
Example 10.7. Flow Mapping Keys: missing comma in
{
?° : value # Empty key
and
simple key : value
Example 9.33. Final Empty Lines: extra !!seq [ in
--- !!seq [ !!str "folded line\n\
The same bug in
- Example 9.25. Literal Scalar,
- Example 9.29. Folded Scalar,
- Example 9.30. Folded Lines,
- Example 9.31. Spaced Lines,
- Example 9.32. Empty Separation Lines.
Example 8.1. Node Properties: a simple key should be limited to a single line:
!!str &a1 "foo" : !!str bar
Example 8.15. Completely Empty Block Nodes: Either remove seq: from the left side, or add it to the right side.
The same example: extra comma in
: bar,
Example 5.3. Block Structure Indicators:
? sky : blue ? sea : green
is interpreted as
!!map {
? !!str "sky" : !!str "blue",
? !!str "sea" : !!str "green",
}
Correct:
!!map {
? !!str "sky" : !!str "blue",
? !!map { ? !!str "sea" : !!str "green" } : !!null "",
}
Example 6.7. Empty Lines: invalid indicators in
!!seq {
Example 7.8. Tag Handles: missing comma in
!<tag:yaml.org,2002:str> "string"
Example 8.1. Node Properties: different case of anchor and alias:
? &A1 !!str "foo"
and
: *a1
Anchors are case sensitive, right?
Example 8.9. Flow Content: the value of the "sequence" key is an invalid sequence:
? !!str "sequence" : !!seq [
? !!str "entry",
: !!map {
? !!str "key" : !!str "value"
} ],
Correct:
? !!str "sequence" : !!seq [
!!str "entry",
!!map {
? !!str "key" : !!str "value"
} ],
Example 8.9. Flow Content: extra : in
? !!str "collections": : !!map {
and in
? !!str "mapping": : !!map {
Example 8.13. Completely Empty Flow Nodes: extra comma in
? !!str "", : !!str "bar",
Example 8.15. Completely Empty Block Nodes: extra comma in
? !!str "",
: !!str "bar",
Example 9.15. Document Marker Scalar Content: extra comma in
? !!str "...", : !!str "bar"
Example 9.23. Block Scalar Chomping: should be !!map { instead of !!seq [ in
!!seq [ ? !!str "strip" : !!str "# text",
Example 9.24. Empty Scalar Chomping: should be !!map { instead of !!seq [ in
!!seq [ ? !!str "strip" : !!str "",
Example 10.5. Block Sequence Entry Types: missing comma after ] in
!!str "two",
]
!!map {
Example 10.12. Block Mappings: extra comma in
? !!str "key",
Example 10.15. In-Line Block Mappings: Right side - { instead of [:
%YAML 1.1
---
!!seq {
Example 5.14. Escaped Characters: Right side - should be \x0C in the line
\x22 \x07 \x08 \x1B \0C
Example 10.10. Flow Mapping Key: Value Pairs: The tag for empty scalars should be !!null, not !!str. Compare:
? explicit key3, # Empty value
with
? !!str "explicit key3" : !!str "",
Correct:
? !!str "explicit key3" : !!null "",
The same problem in
- Example 7.10. Documents,
- Example 8.13. Completely Empty Flow Nodes,
- Example 8.15. Completely Empty Block Nodes,
- Example 10.5. Block Sequence Entry Types,
- Example 10.7. Flow Mapping Keys,
- Example 10.9. Flow Mapping Values,
- Example 10.11. Single Pair Mappings,
- Example 10.13. Explicit Block Mapping Entries,
- Example 10.14. Simple Block Mapping Entries.
Example 9.12. Plain Characters: The left side has no comma before "and". Compare
- Up, up and away!
with
!!str "Up, up, and away!",
Correct:
- Up, up, and away!
Not bugs, but might be invalid in the future
Example 9.12. Plain Characters: the ":" character might be prohibited in plain scalars in the flow context:
# Outside flow collection:
- ::std::vector
- Up, up and away!
- -123
# Inside flow collection:
- [ ::std::vector, "Up, up and away!", -123 ]
Empty content might be prohibited in the flow context.
Example 8.12. Flow Nodes in Flow Context:
[
Without properties,
&anchor "Anchored",
!!str 'Tagged',
*anchor, # Alias node
!!str, # Empty plain scalar
]
Example 10.7. Flow Mapping Keys
{
?° : value # Empty key
? explicit
key: value,
simple key : value
[ collection, simple, key ]: value
}
There is a problem when you allow constructions like
{ !!str, }
because URIs may include the "," and ":" characters. Suppose that we use a Perl-specific tag like !!perl/ref/YAML::Parser in
{ !!perl/ref/YAML::Parser, }
How should it be interpreted? Is :: a delimiter or a part of the URI? The above examples can easily be rewritten without empty plain scalars:
{
? ~ : value # Empty key
}
[
*anchor, # Alias node
'', # Empty plain scalar
]
Anchors and Tags eat too much
It's easier to see this problem for Anchors. Problematic rules are:
c-ns-alias-node ::= “*” ns-anchor-name
c-ns-anchor-property ::= “&” ns-anchor-name
ns-anchor-name ::= ns-char+
ns-char matches any non-space character, so an anchor name will eat too much. This is especially bad in flow collections, because the name may swallow the delimiter comma.
The following example clearly shows the ambiguity:
[ &alias, value ]
It can be parsed both as ["", "value"] (an anchor named "alias" on an empty scalar) and as ["value"] (an anchor named "alias," on the scalar value).
Compare it with an example from the spec:
Example 8.12. Flow Nodes in Flow Context:
[
Without properties,
&anchor "Anchored",
!!str 'Tagged',
*anchor, # Alias node
!!str, # Empty plain scalar
]
Solution: restrict ns-anchor-name.
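The two readings of [ &alias, value ] can be shown with a pair of regular expressions. This is only a sketch: \S is my approximation of ns-char, and the restricted pattern (stopping at the flow indicators , [ ] { }) is a hypothetical restriction chosen for illustration, not a rule taken from the spec.

```python
import re

# Greedy reading: ns-anchor-name ::= ns-char+, approximated here by \S+.
GREEDY_ANCHOR = re.compile(r"&(\S+)")
# Hypothetical restricted reading: the name also stops at flow indicators.
RESTRICTED_ANCHOR = re.compile(r"&([^\s,\[\]{}]+)")


def anchor_name(pattern, text):
    """Return the first anchor name found in text, or None."""
    match = pattern.search(text)
    return match.group(1) if match else None


if __name__ == "__main__":
    source = "[ &alias, value ]"
    # The greedy pattern swallows the comma into the anchor name.
    print(repr(anchor_name(GREEDY_ANCHOR, source)))
    # The restricted pattern leaves the comma as a delimiter.
    print(repr(anchor_name(RESTRICTED_ANCHOR, source)))
```

With the greedy pattern the comma becomes part of the name, so the collection has one entry; with the restricted pattern the comma stays a delimiter and the collection has two.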
The ns-uri-char definition allows commas, so Tags have the same problem. Unfortunately, you cannot just forbid commas because of tags like <tag:yaml.org,2002:str>.
A possible solution is to allow commas only between < and >. Another solution is to forbid empty plain scalars in the flow context.
Another solution: require s-separate after Tags, Anchors, and Aliases. It's not really intuitive since it will force you to write [ *alias_, foo] instead of [ *alias, foo].
Now I think the best solution is to restrict ns-anchor-name to nb-plain-char-in+ and to forbid empty scalar content in flow collections. So you should write
[
Without properties,
&anchor "Anchored",
!!str 'Tagged',
*anchor, # Alias node
!!str '', # Empty plain scalar
]
Line break is required before block collections
According to the spec, a document must have at least one leading line break before the real content. It makes simple documents such as Example 2.1 invalid:
- Mark McGwire
- Sammy Sosa
- Ken Griffey
The relevant production rules:
l-implicit-document ::= s-ignored-space*
ns-l+block-node(-1,block-in)
l-document-suffix?
ns-l+block-node(n,c) ::= ns-l+block-in-block(n,c)
| ns-l+flow-in-block(n,c)
ns-l+block-in-block(n,c) ::= ( c-ns-properties(n+1,c) s-separate(n+1,c) )?
c-l+block-content(n,c)
c-l+block-content(n,c) ::= c-l+block-scalar(n)
| c-l-block-collection(>n,c)
c-l-block-collection(n,c) ::= c-l-block-sequence(n,c)
| c-l-block-mapping(n)
c-l-block-sequence(n,c) ::= c-l-comments l-block-seq-entry(n,c)+
The last rule is the one that requires a line break, since c-l-comments always ends with a line break.
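The effect can be illustrated with a deliberately simplified model. The regular expressions below are my own rough stand-ins for c-l-comments and l-block-seq-entry (one-line scalars only, no indentation handling), not the spec's productions, so this only shows the shape of the problem.

```python
import re

# Simplified stand-in for c-l-comments: an optional comment, then a mandatory
# line break, then any number of blank or comment-only lines.
C_L_COMMENTS = r"(?:#[^\n]*)?\n(?:[ \t]*(?:#[^\n]*)?\n)*"
# Simplified stand-in for l-block-seq-entry: "- " plus a one-line scalar.
SEQ_ENTRY = r"- [^\n#]+\n"
# c-l-block-sequence ::= c-l-comments l-block-seq-entry+
BLOCK_SEQUENCE = re.compile(C_L_COMMENTS + "(?:" + SEQ_ENTRY + ")+")


def is_block_sequence(text):
    """Return True if the whole text matches the simplified production."""
    return BLOCK_SEQUENCE.fullmatch(text) is not None


DOC = "- Mark McGwire\n- Sammy Sosa\n- Ken Griffey\n"

if __name__ == "__main__":
    print(is_block_sequence(DOC))         # no leading line break
    print(is_block_sequence("\n" + DOC))  # with a leading line break
```

In this model the document is accepted only when a line break (or a comment line) precedes the first entry, which matches the behaviour described above.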
