Ticket #129 (closed defect: fixed)
Incorrect Unicode BOM generation
| Reported by: | Valentin Nechayev <netchv@…> | Owned by: | xi |
|---|---|---|---|
| Priority: | normal | Component: | pyyaml |
| Severity: | normal | Keywords: | |
| Cc: | | | |
Description
PyYAML 3.07 with Python 2.5 on FreeBSD (package name py25-yaml-3.07_2).
When yaml.dump() emits a stream encoded as utf-16be or utf-16le, it writes a byte-order mark (BOM), but the BOM bytes are wrong. Example:
>>> yaml.dump("xyz", encoding = 'utf-16be')
'\x00\xff\x00\xfe\x00x\x00y\x00z\x00\n\x00.\x00.\x00.\x00\n'
Instead, it should generate:
'\xfe\xff\x00x\x00y\x00z\x00\n\x00.\x00.\x00.\x00\n'
Fix:
--- 01/PyYAML-3.07/lib/yaml/emitter.py 2008-12-29 01:36:32.000000000 +0200
+++ work/PyYAML-3.07/lib/yaml/emitter.py 2009-06-06 16:48:39.000000000 +0300
@@ -787,7 +787,7 @@
     def write_stream_start(self):
         # Write BOM if needed.
         if self.encoding and self.encoding.startswith('utf-16'):
-            self.stream.write(u'\xFF\xFE'.encode(self.encoding))
+            self.stream.write(u'\uFEFF'.encode(self.encoding))
 
     def write_stream_end(self):
         self.flush_stream()
P.S. I guess it should also generate BOMs for utf-32*.
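The difference between the two literals in the patch can be checked without PyYAML. The sketch below is Python 3 rather than the Python 2.5 used in the report, and the helper name `bom_for` is hypothetical, not part of PyYAML. It shows that `'\xFF\xFE'` is two separate code points (U+00FF and U+00FE), each encoded as its own 16-bit unit, while `'\uFEFF'` is the single BOM code point and encodes to the right bytes for either byte order.

```python
import codecs


def bom_for(encoding):
    """Hypothetical helper: encode the BOM code point U+FEFF in `encoding`."""
    return '\uFEFF'.encode(encoding)


# The old literal is two code points, so each becomes a 2-byte unit.
buggy_be = '\xFF\xFE'.encode('utf-16-be')

# The fixed literal is one code point and matches the standard BOM constants.
fixed_be = bom_for('utf-16-be')
fixed_le = bom_for('utf-16-le')

# Rebuild the stream from the report: BOM followed by the dumped text.
expected_stream = ('\uFEFF' + 'xyz\n...\n').encode('utf-16-be')

print(buggy_be)
print(fixed_be == codecs.BOM_UTF16_BE, fixed_le == codecs.BOM_UTF16_LE)
```

Because the encoder picks the byte order, the same `'\uFEFF'` literal serves both utf-16be and utf-16le, which is why the one-line patch needs no per-encoding branch.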
Change History
Thank you for the report and the fix. Fixed in [351].
UTF-32 is not supported by the YAML specification.