[referencer] about pubmed plugin

Aurélien Naldi aurelien.naldi at gmail.com
Mon Mar 24 12:43:35 EDT 2008


On Mon, Mar 24, 2008 at 5:22 PM, John Spray <jcspray at icculus.org> wrote:
>
>  On Mon, 2008-03-24 at 17:09 +0100, Aurélien Naldi wrote:
>  > Exception: <type 'exceptions.UnicodeDecodeError'>
>  > Explication: 'ascii' codec can't decode byte 0xc3 in position 1:
>  > ordinal not in range(128)
>  >
>  >
>  > The following fixes it for me, I hope it doesn't create any other kind
>  > of problem...
>  >
>  >
>  >       print "DOI ", query, " has PubMed ID ", id
>  >
>  > -     return get_citation_from_pmid (id)
>  > +     return get_citation_from_pmid (id.encode("utf-8"))
>
>  That's pretty strange: the pmid is just a number, so utf-8 and ascii
>  would have the same representation.  Which implies that minidom was
>  giving us something other than either of those.  Your system isn't
>  configured for something crazy like UTF-16 is it?
>
>  The python stuff is potentially rife with encoding stuff that I haven't
>  thought about.  One of the pitfalls of being English: ASCII was enough
>  for us! ;-)

Yes, it is pretty weird, and no, I'm not using UTF-16; I have been using
UTF-8 exclusively for a long time.
The strangest thing is that it fails after urlencode: it printed a
correct URL for me and only failed when downloading it.
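
For reference, the message quoted above is what Python reports when bytes
outside the ASCII range are decoded with the 'ascii' codec. The sketch
below reproduces it with an explicit decode. The sample byte string and
the helper name try_ascii_decode are made up for illustration, and the
sketch does not show where the decode actually happens in the plugin.

```python
# Hypothetical sample: "é" encoded as UTF-8 is the two bytes 0xc3 0xa9.
sample = "a\u00e9".encode("utf-8")


def try_ascii_decode(raw):
    """Return the decoded text, or the error message if decoding fails."""
    try:
        return raw.decode("ascii")
    except UnicodeDecodeError as err:
        return str(err)


message = try_ascii_decode(sample)
print(message)
```

A purely numeric value such as a PubMed ID decodes without error, which
is why the failure is surprising here.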

I am looking at the returned XML file right now and I don't see
anything strange that would explain this, but I did notice something interesting:


		<TermSet>
			<Term>10.1093/bioinformatics/btm547[All Fields]</Term>
			<Field>All Fields</Field>
			<Count>1</Count>
			<Explode>Y</Explode>
		</TermSet>


So I tried appending [doi] to the DOI being searched, and it worked. The
result is the same except for this part, which now says:

		<TermSet>
			<Term>10.1093/bioinformatics/btm547[doi]</Term>
			<Field>doi</Field>
			<Count>1</Count>
			<Explode>Y</Explode>
		</TermSet>
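
A quick way to check which field a query was matched against is to read
the <Field> element from the TermSet. This is only a sketch using
minidom on the fragment quoted above; the helper name termset_field is
hypothetical and not part of the plugin.

```python
from xml.dom import minidom

# The TermSet fragment quoted above, as returned for the [doi] query.
fragment = """<TermSet>
    <Term>10.1093/bioinformatics/btm547[doi]</Term>
    <Field>doi</Field>
    <Count>1</Count>
    <Explode>Y</Explode>
</TermSet>"""


def termset_field(xml_text):
    """Return the text of the first <Field> element in the fragment."""
    dom = minidom.parseString(xml_text)
    field = dom.getElementsByTagName("Field")[0]
    return field.firstChild.data.strip()


print(termset_field(fragment))
```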


I guess it solves your other problem!
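
A minimal sketch of the change, for illustration only: the helper names
build_doi_term and build_esearch_url and the esearch URL are assumptions,
not code taken from the plugin. It appends the [doi] field tag to the DOI
before URL-encoding the search term.

```python
from urllib.parse import urlencode

# Assumed endpoint for NCBI's esearch utility; the plugin may use another URL.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_doi_term(doi):
    """Restrict the search to the DOI field by appending the [doi] tag."""
    return doi + "[doi]"


def build_esearch_url(doi):
    """Build the full search URL with the tagged DOI as the query term."""
    params = {"db": "pubmed", "term": build_doi_term(doi)}
    return ESEARCH_URL + "?" + urlencode(params)


url = build_esearch_url("10.1093/bioinformatics/btm547")
print(url)
```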

PS: I don't get any updates when running "hg update". Did you already apply my
previous patch to the main tree?

Best regards

-- 
Aurélien Naldi

