entrez

Since 2008 I've been using a script (entrez.py) to search PubMed from the command line. Yesterday I decided it was time to clean up the script and make it publicly available. The old code turned out to be fairly complicated, so I reimplemented the guts using the Python SOAP library Suds, talking to the Entrez SOAP interface. I haven't been able to find the original script that I started from, but it seems to be related to Pyblio/Query.py in early versions of pybliographer.

$ ./entrez.py -L
# available databases:
pubmed
protein
nuccore
...
$ ./entrez.py -X
database: pubmed
description: PubMed bibliographic record
available fields:
 ALL    All Fields   All terms from all searchable fields
 UID    UID          Unique number assigned to publication
FILT    Filter       Limits the records
TITL    Title        Words in title of publication
...
$ ./entrez.py -X -F AUTH
field AUTH in pubmed:
Description     Author(s) of publication
   FullName     Author
  Hierarchy     N
     IsDate     N
   IsHidden     N
IsNumerical     N
       Name     AUTH
SingleToken     Y
  TermCount     11526565
$ ./entrez.py -v 'king[au]+yang[au]+2010[dp]+monte[tit]'
entrezpy: INFO   run eEsearch on pubmed
entrezpy: INFO   search returned 1 of 1 items
entrezpy: INFO   run eFetch on pubmed
entrezpy: INFO   convert medline XML to BibTeX
@Article{King2010,
  author =       "William T. King and Meihong Su and Guoliang Yang",
  title =        "Monte Carlo simulation of mechanical unfolding of
                 proteins based on a simple two-state model.",
  journal =      "International journal of biological macromolecules",
  ...
  doi =          "10.1016/j.ijbiomac.2009.12.001",
  URL =          "http://www.ncbi.nlm.nih.gov/pubmed/20004685",
  language =     "eng",
}
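
To give a sense of what the last search asks Entrez to do, here is a minimal sketch that builds an equivalent ESearch request URL. This is not how entrez.py works: the script goes through the SOAP interface with Suds, while the sketch uses the plain HTTP E-utilities endpoint. The helper name esearch_url, the retmax default, and the reading of "+" as a separator between terms are my assumptions for illustration.

```python
from urllib.parse import urlencode

# Public E-utilities endpoint (plain HTTP, not the SOAP interface
# that entrez.py uses).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term, db="pubmed", retmax=20):
    """Build (but do not send) an ESearch request URL.

    esearch_url is a hypothetical helper, not part of entrez.py.
    """
    # Assumption: '+' on the command line just separates search terms,
    # so turn it back into a space before URL-encoding.
    query = term.replace("+", " ")
    params = {"db": db, "term": query, "retmax": retmax}
    return ESEARCH_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(esearch_url("king[au]+yang[au]+2010[dp]+monte[tit]"))
```

The field tags in square brackets ([au], [dp], [tit]) are passed through as part of the term; urlencode percent-encodes the brackets and writes the spaces back out as "+".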

The MEDLINE-to-BibTeX conversion uses bibutils and bibclean.
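
A conversion like this can be sketched as a chain of external commands, each reading the previous one's output on stdin. The chain below is my guess at a reasonable order, not a copy of what entrez.py does: med2xml and xml2bib are bibutils programs (PubMed XML to MODS, then MODS to BibTeX), and bibclean tidies the result. The names MEDLINE_TO_BIBTEX and run_pipeline are hypothetical, and the tools are assumed to be on your PATH.

```python
import subprocess

# Assumed command chain (not taken from entrez.py): bibutils' med2xml
# and xml2bib, followed by bibclean to pretty-print the BibTeX.
MEDLINE_TO_BIBTEX = [["med2xml"], ["xml2bib"], ["bibclean"]]


def run_pipeline(commands, data):
    """Feed bytes through each command in turn, stdin to stdout."""
    for argv in commands:
        result = subprocess.run(
            argv,
            input=data,              # previous stage's output
            stdout=subprocess.PIPE,  # capture this stage's output
            check=True,              # raise if a stage fails
        )
        data = result.stdout
    return data
```

With the tools installed, run_pipeline(MEDLINE_TO_BIBTEX, medline_xml) would take the eFetch XML as bytes and return BibTeX as bytes. Passing each stage's output along explicitly keeps the shell out of the picture, so no quoting is needed.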