Ticket #59 (closed defect: worksforme)

Opened 7 years ago

Last modified 42 hours ago

Load yaml data as utf-8 strings into a dictionary

Reported by: dukebody@… Owned by: xi
Priority: normal Component: pyyaml
Severity: normal Keywords:
Cc:

Description (last modified by xi)

Hello, I'm looking for support, but I don't know if this is the right place to ask.

I'm using PyYAML to load some nested data from a UTF-8 encoded file in Python with:

import yaml
stream = open('data.yaml', 'r')
data = yaml.load(stream)

The data variable becomes a dictionary with unicode values where needed. What I want instead is UTF-8 encoded byte strings, so that I can use this data with another library, Cheetah. If I use the unicode dictionary that PyYAML generates by default, I get a UnicodeDecodeError: the Cheetah strings are in ISO-8859-15, and Python tries to decode them to Unicode using the ASCII codec, so it obviously fails.

Is there any way to get a dictionary of UTF-8 encoded strings?

Change History

comment:1 Changed 7 years ago by xi

  • Status changed from new to assigned
  • Description modified

comment:2 Changed 7 years ago by xi

  • Status changed from assigned to closed
  • Resolution set to worksforme

You need to specify an alternative constructor for the tag !!str. You can do it as follows:

import yaml
def custom_str_constructor(loader, node):
    return loader.construct_scalar(node).encode('utf-8')
yaml.add_constructor(u'tag:yaml.org,2002:str', custom_str_constructor)

Example:

>>> import yaml
>>> yaml.load("Кирилл")
u'\u041a\u0438\u0440\u0438\u043b\u043b'
>>> def custom_str_constructor(loader, node):
...     return loader.construct_scalar(node).encode('utf-8')
... 
>>> yaml.add_constructor(u'tag:yaml.org,2002:str', custom_str_constructor)
>>> yaml.load("Кирилл")
'\xd0\x9a\xd0\xb8\xd1\x80\xd0\xb8\xd0\xbb\xd0\xbb'
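
If registering a constructor globally is not desirable, a different approach (not the one shown above) is to leave PyYAML's defaults alone and encode the loaded data afterwards. The following is only a rough sketch: the helper name encode_utf8 is hypothetical and not part of PyYAML, and the data literal merely stands in for the result of yaml.load(stream). It assumes the loaded structure contains only dictionaries, lists, tuples and scalars.

```python
# Hypothetical helper, not part of PyYAML: recursively encode text to bytes.
text_type = type(u"")  # `unicode` on Python 2, `str` on Python 3


def encode_utf8(value, encoding="utf-8"):
    """Return a copy of `value` with every text string encoded to bytes."""
    if isinstance(value, text_type):
        return value.encode(encoding)
    if isinstance(value, dict):
        # Encode both keys and values, since YAML mapping keys are strings too.
        return dict(
            (encode_utf8(k, encoding), encode_utf8(v, encoding))
            for k, v in value.items()
        )
    if isinstance(value, (list, tuple)):
        return type(value)(encode_utf8(item, encoding) for item in value)
    return value  # numbers, booleans, None, dates pass through unchanged


# Stand-in for `data = yaml.load(stream)` (assumed shape, for illustration).
data = {
    u"name": u"\u041a\u0438\u0440\u0438\u043b\u043b",
    u"tags": [u"a", 1],
    u"nested": {u"ok": True},
}
encoded = encode_utf8(data)
```

Note that dictionary keys are encoded as well, so on Python 3 the result has to be indexed with byte strings such as b"name".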
