Ticket #36 (closed defect: invalid)
PyYaml and WebWare Don't Play Nicely...
| Reported by: | stef@… | Owned by: | xi |
|---|---|---|---|
| Priority: | high | Component: | pyyaml |
| Severity: | blocker | Keywords: | pyyaml webware threading |
| Cc: |
Description (last modified by xi)
Hello,
Sorry to ask a stupid question, but is PyYaml thread-safe? I am trying to run PyYaml under WebWare and I get a session identifier. Because WebWare is servlet-based, I find that it sometimes calls 'get session' more than once while processing a servlet. The code that gets run more than once is:
    def __getitem__(self, key):
        if debug:
            print '>> get (%s)' % key
        filename = self.filenameForKey(key)
        self._lock.acquire()
        try:
            try:
                file = open(filename)
            except IOError:
                raise KeyError, key
            try:
                try:
                    item = yaml.load(file)
                finally:
                    file.close()
            except: # session can't be unpickled
                os.remove(filename) # remove session file
                print "Error loading session from disk:", key
                self.application().handleException()
                raise KeyError, key
        finally:
            self._lock.release()
        return item

When I put wrapper print statements around the yaml.load, I notice that the first time it works without fail, the next time I get this:
    >> get (20061016171933-d0e40d4c65b62dffbbe4c3d2b21922a5)
    RIGHT BEFORE YAML.load for <open file '/usr/local/web_work/Sessions/20061016171933-d0e40d4c65b62dffbbe4c3d2b21922a5.ses', mode 'r' at 0xb71798d8> (/usr/local/web_work/Sessions/20061016171933-d0e40d4c65b62dffbbe4c3d2b21922a5.ses)
    [Mon Oct 16 17:20:28 2006] [error] WebKit: Error while executing script /usr/local/web_work/Compass/RebookHotelPage.py
    Traceback (most recent call last):
      File "/usr/local/webware-cvs/WebKit/Application.py", line 436, in dispatchRawRequest
        self.runTransaction(trans)
      File "/usr/local/webware-cvs/WebKit/Application.py", line 487, in runTransaction
        self.runTransactionViaServlet(servlet, trans)
      File "/usr/local/webware-cvs/WebKit/Application.py", line 512, in runTransactionViaServlet
        servlet.runTransaction(trans)
      File "WebKit/Servlet.py", line 41, in runTransaction
      File "/usr/local/webware-cvs/WebKit/Transaction.py", line 108, in awake
        self._servlet.awake(self)
      File "/usr/local/web_work/Compass/RebookHotelPage.py", line 74, in awake
        print " %s - %s " % ( self, self.session() )
      File "WebKit/Page.py", line 151, in session
      File "/usr/local/webware-cvs/WebKit/Transaction.py", line 69, in session
        self._session = self._application.createSessionForTransaction(self)
      File "/usr/local/webware-cvs/WebKit/Application.py", line 312, in createSessionForTransaction
        session = self.session(sessId)
      File "/usr/local/webware-cvs/WebKit/Application.py", line 279, in session
        return self._sessions[sessionId]
      File "/usr/local/webware-cvs/WebKit/SessionFileStore.py", line 59, in __getitem__
        myItem = yaml.load(file)
      File "build/bdist.linux-i686/egg/yaml/__init__.py", line 66, in load
      File "build/bdist.linux-i686/egg/yaml/constructor.py", line 38, in get_data
      File "build/bdist.linux-i686/egg/yaml/constructor.py", line 50, in construct_document
      File "build/bdist.linux-i686/egg/yaml/constructor.py", line 393, in construct_yaml_seq
      File "build/bdist.linux-i686/egg/yaml/constructor.py", line 120, in construct_sequence
      File "build/bdist.linux-i686/egg/yaml/constructor.py", line 96, in construct_object
      File "build/bdist.linux-i686/egg/yaml/constructor.py", line 572, in construct_python_object
      File "build/bdist.linux-i686/egg/yaml/constructor.py", line 551, in make_python_instance
    TypeError: __new__() takes exactly 2 arguments (1 given)
So, from my total layman's perspective, it appears that __init__.py is expected to be called once, yet due to WebWare's threading, I think it gets called more than once. Either way, I could be totally and wholeheartedly wrong. Any ideas?
Priority is 'high' as I am on a strict timeline to understand what's going on, and severity is definitely 'blocker', as without a resolution, WebWare can't use PyYaml. Which is kinda weird, but, at least in my experience :)
Regards and Thanks Stef
Change History
comment:2 Changed 7 years ago by xi
- Status changed from new to assigned
Hmm... It's an interesting question. I believe that yaml.load() and yaml.dump() are thread-safe, but yaml.add_constructor() and yaml.add_representer() are not.
Anyway, this error doesn't look related to threads. Are you sure that the file is not modified between the first and the second calls of yaml.load()? It rather looks like some Python object cannot be constructed from a YAML node.
You may use the following "patch" to detect which node causes the exception:

    import yaml, yaml.constructor

    # Keep a reference to the original method so the wrapper can delegate to it.
    old_make_python_instance = yaml.constructor.Constructor.make_python_instance

    def my_make_python_instance(self, suffix, node, args=None, kwds=None, newobj=False):
        try:
            # Return the constructed instance, otherwise every object loads as None.
            return old_make_python_instance(self, suffix, node, args, kwds, newobj)
        except TypeError:
            # Show which node failed, then re-raise the original error.
            print suffix, node, args, kwds, newobj
            raise

    yaml.constructor.Constructor.make_python_instance = my_make_python_instance
comment:3 Changed 7 years ago by stef@…
Hello again,
Well, it definitely is curiouser and curiouser (said Alice). It transpires that YAML is also serialising other objects stored in the session that are instances of my own classes, which is what I would expect. However, the __new__ in my own classes (say, for example, Service) only takes args, it doesn't take (args, *kwds). This is why the instantiation of them is 'barfing out' with '2 parameters given but only 1 expected in __new__()'.
I have changed my code to merely pass in the ids, as this is unique to my class and even after un-marshalling the objects, I am still instantiating them anyway (hence the __new__ call above ;). By only storing the ids, I can YAML away to my heart's content :)
I am even unsure as to how to fix this for other people in the future. It would require doing some introspection on the class type itself to see what parameters __new__ expects. My Perl OO and Ruby are strong, but less so in Python. Is this even achievable?
Regardless, thanks for your help and for pointing me onto the right track. I leave the bug's status up to you, but I can see this tripping up other people as well. Regards, Stef
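On the introspection question: in current Python 3 this kind of inspection is possible with inspect.signature from the standard library. A small sketch, where Service and required_new_args are hypothetical names used only for illustration and are not part of PyYAML or WebWare:

```python
import inspect


class Service(object):
    """Hypothetical class whose __new__ requires one argument."""

    def __new__(cls, service_id):
        self = super().__new__(cls)
        self.service_id = service_id
        return self


def required_new_args(cls):
    """Return the names of the positional parameters cls.__new__ requires."""
    signature = inspect.signature(cls.__new__)
    # Drop the first parameter, which is the class itself.
    params = list(signature.parameters.values())[1:]
    return [
        p.name
        for p in params
        if p.default is inspect.Parameter.empty
        and p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)
    ]


names = required_new_args(Service)
```

Knowing the parameter names does not tell a loader which values to pass, so this alone would not make such objects loadable.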
comment:4 Changed 7 years ago by xi
- Status changed from assigned to closed
- Resolution set to invalid
When serializing/deserializing Python objects, PyYAML follows the pickle protocol v2 (see http://www.python.org/dev/peps/pep-0307/). You must ensure that your objects support:

    >>> out = pickle.dumps(obj, protocol=2)
    >>> obj = pickle.loads(out)

Most likely, you need to provide a custom __reduce__ function to make your objects work correctly.
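A minimal sketch of what that can look like, using only the standard pickle module. The Service class and its service_id argument are hypothetical stand-ins for the reporter's session objects, and the code is written for current Python 3 rather than the Python 2 of this ticket:

```python
import pickle


class Service(object):
    """Hypothetical session object whose __new__ requires an argument."""

    def __new__(cls, service_id):
        self = super().__new__(cls)
        self.service_id = service_id
        return self


class ReducibleService(Service):
    """Same class, plus a __reduce__ that tells pickle how to rebuild it."""

    def __reduce__(self):
        # Return (callable, args): loading calls ReducibleService(service_id).
        return (ReducibleService, (self.service_id,))


def round_trip(obj):
    """Dump and reload obj with pickle protocol 2."""
    return pickle.loads(pickle.dumps(obj, protocol=2))


def survives_round_trip(obj):
    """Return True if obj can be dumped and loaded again without a TypeError."""
    try:
        round_trip(obj)
    except TypeError:
        # Protocol 2 called cls.__new__(cls) without the required argument.
        return False
    return True


plain_ok = survives_round_trip(Service(7))
reducible_ok = survives_round_trip(ReducibleService(7))
```

Here plain_ok is False because loading calls __new__ with the class only, while reducible_ok is True. PEP 307 also describes __getnewargs__, which can supply the arguments for __new__ without replacing the whole reduction.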
I'm closing the ticket, but feel free to post any questions here or reopen it if you find any object that is (de)serializable with pickle but cannot be loaded/dumped using PyYAML.
