[referencer] getting metadata from pubmed and some comments

Leonardo Fontenelle leo.fontenelle at gmail.com
Tue Nov 6 18:33:53 EST 2007


I'm not (yet) a regular Referencer user, but I'm very interested in
PubMed integration.

I was following the thread and noticed the "family name, given name"
part. I won't be able to say much about it, but I'd like to suggest
this reading:

http://rishida.net/blog/?p=100

Best regards!

Leonardo Fontenelle
http://leonardof.org

2007/11/6, Aurélien Naldi <aurelien.naldi at gmail.com>:
> Hi,
>
> I'm a computer scientist by training, now working in bioinformatics,
> and as such I deal with tons of biology papers. I have found
> Referencer to be a great tool, and the metadata fetching through CrossRef
> is nice, but I never got more than the family name of the first
> author in the author field. PubMed has much more complete metadata for
> the papers I am currently dealing with, so I would like to know whether
> adding PubMed support to Referencer is possible.
>
> I have just looked at how to get metadata from PubMed. Here is a
> quick introduction:
>
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&retmax=20&retstart=0&term=<!your_search_term!>
>
> Requesting this URL returns a list of matches in this form:
>
>
> <eSearchResult>
>         <Count>1</Count>
>         <RetMax>1</RetMax>
>         <RetStart>0</RetStart>
>         <IdList>
>                 <Id>17581588</Id>
>         </IdList>
>         <TranslationSet>
>         </TranslationSet>
>         <TranslationStack>
>                 <TermSet>
>                         <Term>10.1038/nature05970[All Fields]</Term>
>                         <Field>All Fields</Field>
>                         <Count>1</Count>
>                         <Explode>Y</Explode>
>                 </TermSet>
>                 <OP>GROUP</OP>
>         </TranslationStack>
>         <QueryTranslation>10.1038/nature05970[All Fields]</QueryTranslation>
> </eSearchResult>
>
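
The search step described above can be sketched in Python. This is only a minimal illustration, not Referencer code: the helper names build_esearch_url and extract_pmids are made up for this example, and the sample response is a trimmed copy of the one quoted above. It builds the esearch URL from the quoted parameters and reads the PMIDs out of the IdList.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Base URL taken from the quoted message.
ESEARCH_URL = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, retmax=20, retstart=0):
    """Build the esearch URL for a search term (hypothetical helper)."""
    query = urlencode({
        "db": "pubmed",
        "retmax": retmax,
        "retstart": retstart,
        "term": term,  # urlencode percent-escapes characters such as "/"
    })
    return ESEARCH_URL + "?" + query


def extract_pmids(esearch_xml):
    """Return the PMIDs listed under <IdList> as strings (hypothetical helper)."""
    root = ET.fromstring(esearch_xml)
    return [id_el.text for id_el in root.findall("./IdList/Id")]


# Trimmed copy of the response quoted above.
SAMPLE_ESEARCH = """<eSearchResult>
  <Count>1</Count>
  <RetMax>1</RetMax>
  <RetStart>0</RetStart>
  <IdList>
    <Id>17581588</Id>
  </IdList>
</eSearchResult>"""

if __name__ == "__main__":
    print(build_esearch_url("10.1038/nature05970"))
    print(extract_pmids(SAMPLE_ESEARCH))
```

The sketch does no network access itself; the URL would be fetched with whatever HTTP client the application already uses.
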
>
> The important part is the IdList, which gives the list of PMIDs matching
> the search. To get more data on a particular entry, use this URL:
>
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&retmode=xml&rettype=citation&id=<!PMID!>
>
> The result is a huge XML file with a real list of authors, the abstract,
> and much more.
> Some documentation (which I have not really read yet) is available at
> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.brief
>
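
The fetch step could look roughly like the Python sketch below. The helper names are invented for this example, the author names in the sample are placeholders, and the element layout (Author entries carrying LastName and ForeName) is my assumption about the PubMed XML, trimmed heavily; it should be checked against a real efetch response before relying on it.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Base URL taken from the quoted message.
EFETCH_URL = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(pmid):
    """Build the efetch URL for one PMID (hypothetical helper)."""
    query = urlencode({
        "db": "pubmed",
        "retmode": "xml",
        "rettype": "citation",
        "id": pmid,
    })
    return EFETCH_URL + "?" + query


def extract_authors(efetch_xml):
    """Return (family name, given name) pairs, one per <Author> element.

    Assumes each personal author has <LastName> and optionally <ForeName>;
    entries without a <LastName> are skipped.
    """
    root = ET.fromstring(efetch_xml)
    authors = []
    for author in root.iter("Author"):
        family = author.findtext("LastName")
        given = author.findtext("ForeName")
        if family is not None:
            authors.append((family, given))
    return authors


# Heavily trimmed, made-up sample; the names are placeholders.
SAMPLE_EFETCH = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <Article>
        <AuthorList>
          <Author><LastName>Doe</LastName><ForeName>Jane</ForeName></Author>
          <Author><LastName>Roe</LastName><ForeName>Richard</ForeName></Author>
        </AuthorList>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>"""

if __name__ == "__main__":
    print(build_efetch_url("17581588"))
    for family, given in extract_authors(SAMPLE_EFETCH):
        print("%s, %s" % (family, given))
```

Keeping family and given names as separate values, rather than one joined string, is what would make the "family name, given name" handling discussed in this thread possible later on.
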
> I have not seen an explicit search-by-DOI option, but searching for a DOI
> does work (without the "doi:" prefix).
>
> AFAIK, PDFs unfortunately don't include a PMID, but this gives much
> better results than CrossRef for biology papers...
>
> While I am at it, I have some (naive) questions about your XML format:
> * Is it Referencer-specific?
> * What are its advantages over BibTeX XML (or other similar formats)?
> It seems to deal better with "tags" (bibtexxml has keywords) and
> PDF filenames (bibtexxml has only a relative path, when exported with
> JabRef), and it adds the "manage_target" thing, which I do not use (yet).
> Is there anything else?
> * I see only one "authors" field, which is way too "bibtex-like" for
> my taste. Having a clean separation of authors and being able to split
> family name and given name would be nice.
> Is it possible to extend the format to deal with this?
>
> And a final comment: some of my PDF files did not contain a DOI entry, and
> when adding a whole directory I got one error dialog for each of
> them. It would be much more useful to remember the list of problematic
> files and to show it at the end of the process. Giving them a
> "this thing needs work" tag could be nice too. What do you think about
> this?
>
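
The "collect the problems and report once" idea can be sketched as below. This is not how Referencer works internally (Referencer is not written in Python); import_files, summarize_failures and the extract_doi callback are hypothetical names used only to show the control flow.

```python
def import_files(paths, extract_doi):
    """Split paths into imported (path, doi) pairs and files needing work.

    extract_doi is a caller-supplied function returning a DOI string, or
    None when the file has no DOI (hypothetical interface).
    """
    imported = []
    needs_work = []
    for path in paths:
        doi = extract_doi(path)
        if doi:
            imported.append((path, doi))
        else:
            # Remember the file instead of raising one dialog per failure.
            needs_work.append(path)
    return imported, needs_work


def summarize_failures(needs_work):
    """Build a single message listing every problematic file."""
    if not needs_work:
        return ""
    lines = ["%d file(s) had no DOI and need work:" % len(needs_work)]
    lines.extend("  " + path for path in needs_work)
    return "\n".join(lines)


if __name__ == "__main__":
    # Stand-in for real PDF inspection: a lookup table of known DOIs.
    known = {"a.pdf": "10.1038/nature05970"}
    done, failed = import_files(["a.pdf", "b.pdf", "c.pdf"], known.get)
    print(done)
    print(summarize_failures(failed))
```

The same needs_work list could also drive the suggested "this thing needs work" tag, since it already names every file that was imported without metadata.
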
> Thanks for your work on this nice tool!
>
> Best regards.
>
> --
> Aurélien Naldi
>

