Ticket #9 (closed defect: fixed)

Opened 8 years ago

Last modified 8 years ago

Forcing block style

Reported by: edemaine@…
Owned by: xi
Priority: normal
Component: pyyaml
Severity: normal
Keywords:


Is there an easy way to force the emitter to use block style instead of flow style? I have one particular case in mind where it would be particularly desirable: ordered dictionaries. For example:

>>> yaml.load('[hello: world, goodbye: world]')
[{'hello': 'world'}, {'goodbye': 'world'}]
>>> print yaml.dump(_)
- {hello: world}
- {goodbye: world}

In my opinion, the output would look much nicer as

- hello: world
- goodbye: world

Even if you don't agree with this opinion, there should be a way to force block style in all output. I did not see an easy way to do this, even with subclassing. Suggestions?


Change History

comment:1 Changed 8 years ago by xi

  • Status changed from new to assigned

In principle, you are right: there is no easy way to control the output style using the high-level API. PyYAML tries to produce reasonable output, but it certainly cannot satisfy everyone.

I might add a global flag dump(data, flow_output=True|False), but I don't really believe it will fix the problem. What is needed is a way to apply a certain style to a particular object in the object tree.

Your example can be fixed with subclassing:

>>> data = yaml.load('[hello: world, goodbye: world]')
>>> class MyDumper(yaml.Dumper):
...     def represent_mapping(self, tag, mapping, flow_style=False):
...         return yaml.Dumper.represent_mapping(self, tag, mapping, flow_style)
>>> print yaml.dump(data, Dumper=MyDumper)
- hello: world
- goodbye: world

comment:2 Changed 8 years ago by edemaine@…

Ah, that subclassing is not too bad. Thanks for the pointer. I didn't realize that it's the representer's job to decide flow vs. block. So I could also use the add_representer interface (because I'm already using one, for a subclass of dict).
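A minimal sketch of the add_representer route, written for Python 3. The OrderedPairs class and the represent_block_mapping function are hypothetical names standing in for the dict subclass and its representer mentioned above; the only PyYAML calls used are yaml.add_representer, Dumper.represent_mapping and yaml.dump.

```python
import yaml


class OrderedPairs(dict):
    """Hypothetical dict subclass, standing in for the one mentioned above."""


def represent_block_mapping(dumper, data):
    # Emit a plain YAML map and pass flow_style=False explicitly,
    # so the mapping is written in block style.
    return dumper.represent_mapping('tag:yaml.org,2002:map', data, flow_style=False)


# Register the representer for the subclass on the default Dumper.
yaml.add_representer(OrderedPairs, represent_block_mapping)

data = [OrderedPairs(hello='world'), OrderedPairs(goodbye='world')]
text = yaml.dump(data)
print(text)
```

Because flow_style is passed explicitly, the result should not depend on the dumper's default flow setting.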

In general, more documentation on the representer vs. serializer vs. emitter (vs. dumper) would be helpful. I'm very new to the codebase, and it's taking a little while to absorb. (Although, thankfully, this code is nice and short!)

Looking at the code more, it would be nice to have more control over decisions about style (in particular, the various "style" and "flow_style" arguments in the representer) from an external interface. We could imagine, for example, a Style object that, given the object being represented, produces the desired style. Then there could be a FlowStyle subclass that always answers 'flow' for mapping types. Rather than a single function, I imagine the object would expose registrable per-type methods (style_str, style_mapping, etc.), much like what you have with add_representer. The key idea is to add flexibility to specify style at a general level (and plausibly to adapt even to the values being represented and some amount of the context).
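One way this idea might be approximated with what already exists is a Dumper subclass whose style decisions live in separate overridable methods. This is only a sketch under stated assumptions: StyledDumper, style_mapping and style_sequence are hypothetical names, not part of PyYAML, and the length-based rule is invented purely for illustration.

```python
import yaml


class StyledDumper(yaml.Dumper):
    """Hypothetical dumper that separates "pick a style" from "build nodes"."""

    def style_mapping(self, mapping):
        # Illustrative policy: mappings are always block style.
        return False

    def style_sequence(self, sequence):
        # Illustrative policy: short sequences are flow, longer ones block.
        return len(sequence) <= 3

    def represent_mapping(self, tag, mapping, flow_style=None):
        if flow_style is None:
            flow_style = self.style_mapping(mapping)
        return super().represent_mapping(tag, mapping, flow_style)

    def represent_sequence(self, tag, sequence, flow_style=None):
        if flow_style is None:
            flow_style = self.style_sequence(sequence)
        return super().represent_sequence(tag, sequence, flow_style)


data = {'name': 'x', 'points': [1, 2, 3], 'long': [1, 2, 3, 4]}
text = yaml.dump(data, Dumper=StyledDumper)
print(text)
```

A different policy would then only need to override the two style_* methods, leaving node construction untouched.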

It seems that the current style is to represent the innermost lists and dictionaries as flow style, and others as block. (This is just from experimentation; I don't actually see where the decision is being made. It seems it's not in representer...)

This proposal is a little different from "tagging" particular objects with the style you want. I am actually looking more for control over general styling, not specific objects. Essentially I am proposing to separate two tasks currently burdened on the representer (and perhaps other components): picking a style and converting that into nodes.

Incidentally, perhaps it would help if we could unify on a single specification of "style": '|' vs. 'flow' vs. 'block' vs. etc. Currently, some are strings and some are booleans, depending on the object type. It's not too bad, but cleaning this up may make it easier to think about these problems.

comment:3 Changed 8 years ago by xi

  • Status changed from assigned to closed
  • Resolution set to fixed

Fixed in [152].

Representer chooses the style of a collection node as follows:

  1. If the node contains only scalar subnodes, its style is set to 'flow'.
  2. Otherwise it is set to 'block'.
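The rule above can be observed with a small Python 3 sketch. It passes default_flow_style=None explicitly so that the representer's own choice applies (recent PyYAML releases default to False, which would force block style everywhere); the sample data is made up for illustration.

```python
import yaml

# The outer mapping contains a sequence (a non-scalar subnode), so it is block.
# The inner sequence contains only scalars, so it is flow.
nested = {'name': 'x', 'outer': ['a', 'b']}
text = yaml.dump(nested, default_flow_style=None)
print(text)
```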

I've added two flags that override the default style for scalar and collection nodes: default_style and default_flow_style.

>>> print yaml.dump(['foo', 'bar'], default_style='|', default_flow_style=False)
- |-
  foo
- |-
  bar
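Applied to the example from the original report, the collection flag alone might be used as follows. This is a Python 3 sketch; block_text and flow_text are just illustrative variable names.

```python
import yaml

data = [{'hello': 'world'}, {'goodbye': 'world'}]

# default_flow_style=False forces block style for every collection.
block_text = yaml.dump(data, default_flow_style=False)
print(block_text)

# default_flow_style=True forces flow style for every collection.
flow_text = yaml.dump(data, default_flow_style=True)
print(flow_text)
```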

I think it will do for now, although a more sophisticated method may be added in the future.

I'm closing the ticket, but if you have interesting ideas about applying style to nodes, feel free to post them here.

