








Binghe Wang, Series Editor

Computer Applications in Pharmaceutical Research and Development
Edited by Sean Ekins






Copyright © 2006 by John Wiley & Sons, Inc. All rights reserved

Published by John Wiley & Sons, Inc., Hoboken, New Jersey
Published simultaneously in Canada

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in
any form or by any means, electronic, mechanical, photocopying, recording, scanning, or
otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright
Act, without either the prior written permission of the Publisher, or authorization through
payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222
Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470. Requests to the Publisher for permission should be addressed to the
Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030,
(201) 748-6011, fax (201) 748-6008.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their
best efforts in preparing this book, they make no representations or warranties with respect
to the accuracy or completeness of the contents of this book and specifically disclaim any
implied warranties of merchantability or fitness for a particular purpose. No warranty may be
created or extended by sales representatives or written sales materials. The advice and
strategies contained herein may not be suitable for your situation. You should consult with a
professional where appropriate. Neither the publisher nor author shall be liable for any loss
of profit or any other commercial damages, including but not limited to special, incidental,
consequential, or other damages.

For general information on our other products and services or for technical support, please
contact our Customer Care Department within the United States at (800) 762-2974, outside
the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears
in print may not be available in electronic formats. For more information about Wiley
products, visit our web site.

Library of Congress Cataloging-in-Publication Data:

Computer applications in pharmaceutical research and development / [edited by] Sean Ekins.
p. ; cm.—(Wiley series in drug discovery and development)

Includes bibliographical references and index.
ISBN-13: 978-0-471-73779-7 (cloth)
ISBN-10: 0-471-73779-8 (cloth)
1. Pharmacy—Data processing. 2. Pharmacology—Data processing. 3. Pharmaceutical
industry—Data processing. I. Ekins, Sean. II. Series.
[DNLM: 1. Drug Industry. 2. Medical Informatics. 3. Drug Approval—methods.
4. Drug Evaluation—methods. 5. Drug Evaluation, Preclinical—methods.
QV 26.5 C7374 2006]

RS122.2.C66 2006


Printed in the United States of America
10 9 8 7 6 5 4 3 2 1

For Rosalynd

Failures are not something to be avoided. You want them to happen as quickly
as you can so you can make progress rapidly

—Gordon Moore








1. History of Computers in Pharmaceutical Research and
Development: A Narrative
Donald B. Boyd and Max M. Marsh

2. Computers as Data Analysis and Data Management Tools in
Preclinical Development
Weiyong Li and Kenneth Banks

3. Statistical Modeling in Pharmaceutical Research and
Development
Andrea de Gaetano, Simona Panunzi, Benoit Beck, and
Bruno Boulanger

4. Drug Discovery from Historic Herbal Texts
Eric J. Buenz


5. Contextualizing the Impact of Bioinformatics on Preclinical
Drug and Vaccine Discovery
Darren R. Flower

6. Systems Approaches for Pharmaceutical Research and
Development
Sean Ekins and Craig N. Giroux


7. Information Management—Biodata in Life Sciences
Richard K. Scott and Anthony Parsons

8. Chemoinformatics Techniques for Processing Chemical
Structure Databases
Valerie J. Gillet and Peter Willett

9. Electronic Laboratory Notebooks
Alfred Nehme and Robert A. Scoffin

10. Strategies for Using Information Effectively in Early-Stage
Drug Discovery
David J. Wild

11. Improving the Pharmaceutical R&D Process: How
Simulation Can Support Management Decision Making
Andrew Chadwick, Jonathan Moore, Maggie A.Z. Hupcey,
and Robin Purshouse


12. Computers and Protein Crystallography
David J. Edwards and Roderick E. Hubbard

13. Computers, Cheminformatics, and the Medicinal Chemist
Weifan Zheng and Michael Jones

14. The Challenges of Making Useful Protein-Ligand Free
Energy Predictions for Drug Discovery
Jun Shimada

15. Computer Algorithms for Selecting Molecule Libraries
for Synthesis
Konstantin V. Balakin, Nikolay P. Savchuk, and Alex Kiselyov

16. Success Stories of Computer-Aided Design
Hugo Kubinyi


17. Pharmaceutical Research and Development Productivity:
Can Software Help?
Christophe G. Lambert and S. Stanley Young


18. Computer Methods for Predicting Drug Metabolism
Sean Ekins

19. Computers in Toxicology and Risk Assessment
John C. Dearden

20. Computational Modeling of Drug Disposition
Cheng Chang and Peter W. Swaan

21. Computer Simulations in Pharmacokinetics and
Pharmacodynamics: Rediscovering Systems Physiology
in the 21st Century
Paolo Vicini

22. Predictive Models for Better Decisions: From Understanding
Physiology to Optimizing Trial Design
James R. Bosley, Jr.


23. Making Pharmaceutical Development More Efficient
Michael Rosenberg and Richard Farris

24. Use of Interactive Software in Medical Decision Making
Renée J. Goldberg Arnold


25. Clinical Data Collection and Management
Mazen Abdellatif

26. Regulation of Computer Systems
Sandy Weinberg

27. A New Paradigm for Analyzing Adverse Drug Events
Ana Szarfman, Jonathan G. Levine, and Joseph M. Tonning


28. Computers in Pharmaceutical Formulation
Raymond C. Rowe and Elizabeth A. Colbourn

29. Legal Protection of Innovative Uses of Computers in R&D
Robert Harrison

30. The Ethics of Computing in Pharmaceutical Research
Matthew K. McGowan and Richard J. McGowan

31. The UltraLink: An Expert System for Contextual Hyperlinking
in Knowledge Management
Martin Romacker, Nicolas Grandjean, Pierre Parisot,
Olivier Kreim, Daniel Cronenberger, Thérèse Vachon,
and Manuel C. Peitsch

32. Powerful, Predictive, and Pervasive: The Future of
Computers in the Pharmaceutical Industry
Nick Davies, Heather Ahlborn, and Stuart Henderson




In less than a generation we have seen the impressive impact of computer
science on many fields, which has changed not only the ways in which we
communicate in business but also the processes in industry from product
manufacturing to sales and marketing. Computing has had a wide influence
by implementation of predictions based on statistics, mathematics, and risk
assessment algorithms. These predictions or simulations represent a way to
rapidly make decisions, prototype, innovate, and, importantly, learn quickly
from failure. The computer is really just a facilitator using software and a user
interface to lower the threshold of entry for individuals to benefit from
complex fields such as mathematics, statistics, physics, biology, chemistry, and
engineering. Without necessarily having to be an expert in these fields the
user can take advantage of the software for the desired goal whether in the
simulation of a process or for visualization and interpretation of results from
analytical hardware.

Within the pharmaceutical industry we have progressed from the point
where computers in the laboratory were rarely present or used beyond spread-
sheet calculations. Now computers are ubiquitous in pharmaceutical research
and development laboratories, and nearly everyone has at least one used in
some way to aid in his or her role. It should come as no surprise that the
development of hardware and software over the last 30 years has expanded
the scope of computer use to virtually all stages of pharmaceutical research
and development (data analysis, data capture, monitoring and decision
making). Although there are many excellent books published that are focused
on in-depth discussions of computer-aided drug design, bioinformatics, or
other related individual topics, none has addressed this broader utilization of
computers in pharmaceutical research and development in as comprehensive
or integrated manner as attempted here. This presents the editor of such a
volume with some decisions of what to include in a book of this nature when
trying to show the broadest applications of computers to pharmaceutical
research and development. It is not possible to exhaustively discuss all com-
puter applications in this area; hence there was an attempt to select topics
that may have a more immediate impact and relevance to improving the
research and development process and that may influence the present and
future generations of scientists. There are attendant historical, regulatory,
and ethical considerations of using computers and software in this industry,
and these should be considered equally alongside their applications. I have
not solicited contributions that address the role of computers in manufactur-
ing, packaging, finance, communication, and administration, areas that are
common to other industries and perhaps represent the content of a future
volume. The book is therefore divided into broad sections, although there are
certainly overlaps as some chapters could be considered to belong in more
than one section.

The intended audience for this book comprises students, managers,
scientists, and those responsible for applying computers in any of the areas
related to pharmaceutical research and development. It is my desire that
pharmaceutical executives will also see the wide-ranging benefits of
computers, as their influence and impact are often not given their due place,
probably because there is always a human interface that presents the
computer-generated output. I hope this book shows the benefits of a more holistic
approach to using computers rather than the frequently observed narrowly
defined vertical areas of applications fragmented on a departmental or func-
tional basis. This book therefore describes the history, present, future applica-
tions, and consequences of computers in pharmaceutical research and
development with many examples of where computers have impacted on
processes or enabled the capture, calculation, or visualization of data that has
ultimately contributed to drugs reaching the market. Readers are encouraged
to see this broader picture of using computers in pharmaceutical research and
development and to consider how they can be further integrated into the
paradigms of the future. The whole is certainly greater than the sum of the parts.

I hope that readers who have not used computers in their pharmaceutical
research and development roles will also feel inspired by the ideas and results
presented in the chapters and want to learn more, which may result in them
using some if not many of the approaches. It is also my hope that the vision
of this book will be realized by computers being directly associated with the
continued success of the pharmaceutical, biotechnology, and associated
industries, to ultimately speed the delivery of therapeutics to the waiting
patients. I sincerely believe you will enjoy reading and learning about the
broad applications of computers to this industry, as I have done during the
editing process. This is just a beginning of imagining them as a continuum.



I am sincerely grateful to Dr. Binghe Wang for inviting me to write a book
on a topic of my choosing; without his initiation this book might have just
remained an idea. I am appreciative of the editorial assistance, overall advice,
and patience provided by Jonathan Rose at John Wiley & Sons during the
last year. My anonymous proposal reviewers are thanked for their consider-
able encouragement and suggestions, which helped expand the scope of the
book beyond my initial outline. Dr. Jean-Pierre Wery kindly provided valu-
able suggestions for contributing authors early on, along with the many other
scientists contacted who responded by providing ideas for additional authors.
As an editor I am entirely dependent on the many authors who have con-
tributed their valuable time and effort to provide chapters for this book, given
only a brief title and an overview at the start. I thank them sincerely for
making this book possible and for sharing their enthusiasm and expertise with
a wider audience as well as bearing up with my communications through
the year. I would also like to acknowledge Rebecca J. Williams for kindly
providing artwork for the cover image.

My interest in computational approaches applied to the pharmaceutical
industry was encouraged nearly a decade ago while at Lilly Research
Laboratories by Dr. Steven A. Wrighton, Mr. James H. Wikel, and Dr. Patrick
J. Murphy. The generous collaborations with colleagues in both industry and
academia since are also acknowledged, and several of these collaborators are
contributors to this book. Two books, The Logic of Failure by Dietrich Dorner
and Serious Play by Michael Schrage, were introductions to the global uses
of computer simulation 5 years ago, which sowed the seed for considering all
the areas where computers and their simulation possibilities could be applied
in pharmaceutical research and development. I hope this book builds on the
work of the pioneers in the many fields described in the following chapters,
and I take this opportunity to acknowledge their contributions where they
are not explicitly recognized.

My family has always provided considerable support, from my first interest
in science through university to the present, even though we are now
separated by the Atlantic. I dedicate this book to them and to Maggie,
whose sustained valuable advice, tolerance of late nights and weekends at the
computer, and general encouragement have helped me to see this project
through from conception.

Sean Ekins

Jenkintown, Pennsylvania
October 2005



Mazen Abdellatif, Hines VA Cooperative Studies Program Coordinating
Center (151K), P.O. Box 5000, 5th Ave. and Roosevelt Rd. Bldg. 1, Rm.
B240, Hines, IL 60141-5000, USA.

Heather Ahlborn, IBM Business Consulting Services, Pharma and Life
Sciences, Armonk, NY 10504, USA.

Konstantin V. Balakin, ChemDiv, Inc. 11558 Sorrento Valley Rd., Ste. 5,
San Diego, CA 92121, USA.

Kenneth Banks, Global Analytical Development, Johnson & Johnson
Pharmaceutical Research & Development (J&JPRD), 1000 Route 202,
Raritan, NJ 08869, USA.

Benoit Beck, Lilly services SA, European Early Phase Statistics, Mont Saint
Guibert, Belgium.

James R. Bosley, Jr., Rosa Pharmaceuticals, Inc., P.O. Box 2700, Cupertino,
CA 95015, USA.

Bruno Boulanger, Lilly services SA, European Early Phase Statistics, Mont
Saint Guibert, Belgium.

Donald B. Boyd, Department of Chemistry, Indiana University-Purdue
University at Indianapolis (IUPUI), Indianapolis, IN 46202-3274, USA.

Eric J. Buenz, Complementary and Integrative Medicine Program, Mayo
Clinic College of Medicine, 200 First Street SW, Rochester, MN 55905,
USA.




Andrew Chadwick, PA Consulting Group, Chamber of Commerce House
2nd floor, 75 Harborne Road, Birmingham, B15 3DH, UK.

Cheng Chang, Division of Pharmaceutics, College of Pharmacy, The Ohio
State University, Columbus, OH 43210.

Elizabeth A. Colbourn, Intelligensys Ltd., Belasis Business Centre, Belasis
Hall Technology Park, Billingham, Teesside, UK.

Daniel Cronenberger, Novartis Institutes for BioMedical Research,
Lichstrasse 35, 4002 Basel, Switzerland.

Nick Davies, Pfizer Limited, Sandwich, Kent, CT13 9NJ, UK.

John C. Dearden, School of Pharmacy and Chemistry, Liverpool John
Moores University, Byrom Street, Liverpool L3 3AF, UK.

David J. Edwards, Accelrys Inc, 10188 Telesis Court, Suite 100, San Diego,
CA 92121, USA.

Sean Ekins, 601 Runnymede Ave., Jenkintown, PA 19046.

Richard Farris, Health Decisions, Inc. 6350 Quadrangle Drive, Suite 300,
Chapel Hill, NC 27517, USA.

Darren R. Flower, Edward Jenner Institute for Vaccine Research,
High Street, Compton, Berkshire, RG20 7NN, UK.

Andrea de Gaetano, CNR IASI Laboratorio di Biomatematica UCSC –
Largo A. Gemelli, 8-00168 Roma, Italy.

Valerie J. Gillet, Department of Information Studies, University of Sheffield,
Western Bank, Sheffield S10 2TN, UK.

Craig N. Giroux, Institute of Environmental Health Sciences, Wayne State
University, 2727 Second Avenue, Room 4000, Detroit, MI 48201, USA.

Renée J. Goldberg Arnold, President and CEO, Arnold Consultancy &
Technology, LLC, 1 Penn Plaza, 36th Floor, New York, NY 10119.

Nicolas Grandjean, Novartis Institutes for BioMedical Research, Lichstrasse
35, 4002 Basel, Switzerland.

Robert Harrison, 24IP Law Group, Herzogspitalstrasse 10a, 80331 Munich,
Germany.



Stuart T. Henderson, IBM Business Consulting Services, Pharma and Life
Sciences, Armonk, NY 10504, USA.

Roderick E. Hubbard, Structural Biology Laboratory, University of York,
Heslington, York, YO10 5DD, UK and Vernalis (R&D) Ltd, Granta Park,
Abington, Cambridge, CB1 6GB, UK.

Maggie A. Z. Hupcey, PA Consulting Group, 600 College Road East, Suite
1120, Princeton, NJ 08540, USA.

Michael Jones, Molecular Informatics, Triangle Molecular, 1818 Airport
Road, Chapel Hill, NC 27514, USA.

Alex Kiselyov, ChemDiv, Inc. 11558 Sorrento Valley Rd., Ste. 5, San Diego,
CA 92121, USA.

Olivier Kreim, Novartis Institutes for BioMedical Research, Lichstrasse 35,
4002 Basel, Switzerland.

Hugo Kubinyi, Donnersbergstrasse 9, D-67256 Weisenheim am Sand,
Germany.

Christophe G. Lambert, Golden Helix Inc., P.O. Box 10633, Bozeman, MT
59719, USA.

Jonathan G. Levine, Office of Post-marketing and Statistical Science
Immediate Office, Center for Drug Evaluation and Research, Food and Drug
Administration; Rockville, MD 20857, USA.

Weiyong Li, Global Analytical Development, Johnson & Johnson
Pharmaceutical Research & Development (J&JPRD), 1000 Route 202, Raritan,
NJ 08869, USA.

Max M. Marsh, Department of Chemistry, Indiana University-Purdue
University at Indianapolis (IUPUI), Indianapolis, IN 46202-3274, USA.

Matthew K. McGowan, Business Management & Administration, Bradley
University, Peoria, IL 61625, USA.

Richard J. McGowan, Philosophy and Religion Department, Butler
University, Indianapolis, IN 46208, USA.

Jonathan Moore, PA Consulting Group, One Memorial Drive, Cambridge,
MA 02142, USA.

Alfred Nehme, CambridgeSoft Corporation, 100 Cambridgepark Drive,
Cambridge, MA 02140, USA.

Simona Panunzi, CNR IASI Laboratorio di Biomatematica UCSC – Largo
A. Gemelli, 8-00168 Roma, Italy.



Pierre Parisot, Novartis Institutes for BioMedical Research, Lichstrasse 35,
4002 Basel, Switzerland.

Anthony Parsons, 3 Harkness Drive, Canterbury CT2 7RW, UK.

Manuel C. Peitsch, Novartis Institutes for BioMedical Research, Lichstrasse
35, 4002 Basel, Switzerland.

Robin Purshouse, PA Consulting Group, 19 York Street, Manchester,
M23BA, UK.

Martin Romacker, Novartis Institutes for BioMedical Research, Lichstrasse
35, 4002 Basel, Switzerland.

Michael Rosenberg, Health Decisions, Inc. 6350 Quadrangle Drive, Suite
300, Chapel Hill, NC 27517, USA.

Raymond C. Rowe, The PROFITS Group, Institute of Pharmaceutical
Innovation, University of Bradford, Bradford, West Yorkshire BD7 1DP, UK.

Nikolay P. Savchuk, ChemDiv, Inc., 11558 Sorrento Valley Rd., Ste. 5,
San Diego, CA 92121, USA.

Robert A. Scoffin, CambridgeSoft Corporation, 8 Signet Court, Swann’s
Road, Cambridge, CB5 8LA, UK.

Richard K. Scott, Walk House, Church Lane, Chatteris, Cambridgeshire,
PE16 6JA, UK.

Jun Shimada, Columbia University, Department of Chemistry, 3000 Broadway,
New York, NY 10032, USA.

Peter W. Swaan, Department of Pharmaceutical Sciences, University of
Maryland, 20 Penn Street, Baltimore, MD 21201, USA.

Ana Szarfman, Office of Post-marketing and Statistical Science Immediate
Office, Center for Drug Evaluation and Research, Food and Drug
Administration, Rockville, MD 20857, USA.

Joseph M. Tonning, Office of Post-marketing and Statistical Science
Immediate Office, Center for Drug Evaluation and Research, Food and Drug
Administration; Rockville, MD 20857, USA.

Thérèse Vachon, Novartis Institutes for BioMedical Research, Lichstrasse
35, 4002 Basel, Switzerland.

Paolo Vicini, Resource Facility for population kinetics, Room 241 AERL
Building, Department of Bioengineering, Box 352255, University of
Washington, Seattle, WA 98195-2255, USA.



Sandy Weinberg, Fast Trak Vaccines, GE Healthcare 5348 Greenland Road,
Atlanta, GA 30342, USA.

David J. Wild, Indiana University School of Informatics, 1900 E. Tenth
Street, Bloomington, IN 47406, USA.

Peter Willett, Department of Information Studies, University of Sheffield,
Western Bank, Sheffield S10 2TN, UK.

S. Stanley Young, CGStat, L.L.C., 3401 Caldwell Drive, Raleigh, NC 27607,
USA.

Weifan Zheng, Cheminformatics Research Resources, Division of Medicinal
Chemistry, School of Pharmacy, University of North Carolina at Chapel
Hill, NC 27599-7360, USA.








Donald B. Boyd and Max M. Marsh


1.1 Introduction
1.2 Computational Chemistry: the Beginnings at Lilly
1.3 Germination: the 1960s
1.4 Gaining a Foothold: the 1970s
1.5 Growth: the 1980s
1.6 Fruition: the 1990s
1.7 Epilogue



Today, computers are so ubiquitous in pharmaceutical research and develop-
ment that it may be hard to imagine a time when there were no computers to
assist the medicinal chemist or biologist. A quarter-century ago, the notion
of a computer on the desk of every scientist and company manager was not
even contemplated. Now, computers are absolutely essential for generating,
managing, and transmitting information. The aim of this chapter is to give a
brief account of the historical development. It is a story of ascendancy and
one that continues to unfold.

Computer Applications in Pharmaceutical Research and Development, Edited by Sean Ekins.
ISBN 0-471-73779-8 Copyright © 2006 John Wiley & Sons, Inc.

Owing to the personal interest and experience of the authors, the emphasis
in this chapter is on using computers for drug discovery. But the use of com-
puters in laboratory instruments and for analysis of experimental and clinical
data is no less important. This chapter is written with young scientists in mind.
We feel it is important that the new investigator have an appreciation of how
the field evolved to its present circumstance, if for no other reason than to
help steer toward a better future for those scientists using or planning to use
computers in the pharmaceutical industry.

Computers began to be deployed at pharmaceutical companies as early as
the 1940s. These early computers were usually for the payroll and for account-
ing, not for science. Pharmaceutical scientists did eventually gain access to
computers, if not in the company itself, then through contractual agreements
with nearby educational institutions or other contractors.

There were several scientific and engineering advances that made possible
a computational approach to what had long been exclusively an experimental
art and science, namely, discovering a molecule with useful therapeutic
potential. One fundamental concept understood by chemists was that chemi-
cal structure is related to molecular properties including biological activity.
Hence if one could predict properties by calculations, one might be able to
predict which structures should be investigated in the laboratory. Another
fundamental, well-established concept was that a drug would exert its bio-
logical activity by binding to and/or inhibiting some biomolecule in the
body. This concept stems from Fischer’s famous lock-and-key hypothesis
(Schlüssel-Schloss-Prinzip) [1, 2]. Another advance was the development of
the theory of quantum mechanics in the 1920s [3]. This theory connected
the distribution of electrons in molecules with observable molecular proper-
ties. Pioneering research in the 1950s attacked the problem of linking elec-
tronic structure and biological activity. A good part of this work was collected
in the 1963 book by Bernard and Alberte Pullman of Paris, France, which
fired the imagination of what might be possible with calculations on biomol-
ecules [4]. The earliest papers that attempted to mathematically relate chem-
ical structure and biological activity were published in Scotland all the way
back in the middle of the nineteenth century [5, 6]. This work and a couple
of other papers [7, 8] were forerunners to modern quantitative structure-
activity relationships (QSAR) but were not widely known. In 1964, the role
of molecular descriptors in describing biological activity was reduced to a
simplified mathematical form, and the field of QSAR was propelled toward
its modern visage [9, 10]. (A descriptor is any calculated or experimental
numerical property related to a compound’s chemical structure.) And, of
course, there was the engineering development of computers and all that
entailed. The early computers were designed for military and accounting
applications, but gradually it became apparent that computers would have a
vast number of uses.



One of us (MMM) was one of the first people in the pharmaceutical
industry to perceive that computer-aided drug design was something that
might be practical and worthy of investigation. He pioneered a sustained,
industrial research program to use computers in drug design. After retiring
from Eli Lilly and Company in 1986, he became a Visiting Research Scientist
and later an Adjunct Professor in the Department of Chemistry, Indiana
University, Bloomington. Section 1.2 is his personal account of the early
steps at Lilly.


This narrative was first presented at Don Boyd’s third annual Central Indiana
Computational Chemistry Christmas Luncheon (CICCCL-3) on December
18, 1997. Although it is specific for Eli Lilly and Company, the progress and
problems that transpired there were probably not too different from
developments at other large, forward-looking, research-based pharmaceutical
companies.

This little story contains mainly my personal recollection about how the com-
putational chemistry project in the Lilly Research Laboratories began. An
advantage of living longer than one’s contemporaries is that there is no one
around among the early participants to contradict my reminiscences. A more
comprehensive history of this discipline may be found in the Bolcer and
Hermann chapter in Reviews of Computational Chemistry [11]. I shall confine
this commentary to what I remember about my own involvement.
I began work at Eli Lilly and Company in March 1942 as a laboratory aide in
the analytical department. At that time, there was very little sophisticated
instrumentation in the laboratory. The most complex calculations were carried
out using a slide rule. After military service in World War II and an educational
leave of absence to complete my undergraduate studies in chemistry at Indiana
University, I returned to the Lilly analytical group in 1947. Slide rules were still
much in evidence but were soon augmented with mechanical calculators—
usually Monroe or Friden models.
It was not until 1949 that the company actually acquired a stored-program
computer; at that time an IBM 704 system was purchased—for about $1 million.
In spite of the fact that it was a vacuum tube machine—with considerable con-
comitant downtime—several business operations were carried out using it. A
number of inventories and the payroll were successfully handled. However, no
scientific calculations were performed with it. The system was replaced in a few
years with an IBM 709—again only for business and financial operations.
In the late 1950s or early 1960s, the first computers to have stored programs of
scientific interest were acquired. One of these was an IBM 650; it had a rotating
magnetic drum memory consisting of 2000 accessible registers. The programs,
the data input, and the output were all in the form of IBM punched cards. A
major concern was keeping those card decks intact and in order as they were
moved about from user to machine and back. My recollection is that some
statistical calculations by Lilly’s research statistics group under Dr. Edgar King
were carried out on this machine.

At about the same time, one of the business groups obtained an IBM 610 com-
puter. This device was simpler to use than the 650, and it utilized punched paper
tape input, program, and output. The tape was generated on a typewriter. Pro-
grams were developed using an essentially algebraic language peculiar to the
machine. After the program tape was read in, a tape containing sequentially
the data to be processed was fed in. The output tape was carried back to a tape
reader linked to a typewriter where the results were ultimately typed out. I used
this machine.

My interest at that time revolved around evaluating optical rotatory dispersion
data [12]. The paired values of optical rotation vs. wavelength were used to fit
a function called the Drude equation (later modified to the Moffitt equation
for William Moffitt [Harvard University] who developed the theory) [13]. The
coefficients of the evaluated equation were shown to be related to a significant
ultraviolet absorption band of a protein and to the amount of alpha-helix con-
formation existing in the solution of it.
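
For readers who have not met this kind of fit, the following is a minimal present-day sketch in Python; it is not the program that ran on the IBM 610. It assumes the commonly quoted one-term form of the Drude equation, [alpha] = k / (lambda^2 - lambda_c^2), and estimates k and lambda_c by a straight-line least-squares fit to the rearranged form [alpha] * lambda^2 = k + lambda_c^2 * [alpha]. The function names, wavelengths, and constants are hypothetical, and the data are synthetic.

```python
# Illustrative sketch only: synthetic data, hypothetical names and constants.

def drude_rotation(wavelength_nm, k, lambda_c_nm):
    """One-term Drude equation: [alpha] = k / (lambda^2 - lambda_c^2)."""
    return k / (wavelength_nm ** 2 - lambda_c_nm ** 2)


def fit_drude(wavelengths_nm, rotations):
    """Estimate k and lambda_c by ordinary least squares on the linear form
    [alpha] * lambda^2 = k + lambda_c^2 * [alpha]."""
    xs = list(rotations)                                       # x = [alpha]
    ys = [a * w ** 2 for a, w in zip(rotations, wavelengths_nm)]  # y = [alpha] * lambda^2
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # estimates lambda_c^2
    intercept = mean_y - slope * mean_x    # estimates k
    return intercept, slope ** 0.5


# Synthetic "measurements" generated from assumed constants (not real data).
wavelengths = [350.0, 400.0, 450.0, 500.0, 550.0, 600.0]
true_k, true_lambda_c = -2.0e7, 212.0
rotations = [drude_rotation(w, true_k, true_lambda_c) for w in wavelengths]

k_fit, lambda_c_fit = fit_drude(wavelengths, rotations)
print(f"k = {k_fit:.3e}, lambda_c = {lambda_c_fit:.1f} nm")
```

Because the synthetic points lie exactly on the curve, the fit should recover the assumed constants to within rounding; with real measurements the same straight-line plot would show scatter, and the Moffitt form mentioned above adds a second term that this sketch does not cover.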

Interest in possible applications of computers at the Lilly Research Laboratories
began to broaden in the early 1960s. Dr. King (then director of the statistical
research group) and I appeared before the Lilly board of directors, submitting a
proposal to acquire a computer and ancillary equipment to be devoted primarily
to research needs. The estimated cost was a little more than $250,000. In those
days, an expenditure of that large an amount required board approval. Today, I
suppose a division director or even a department head could sign off on a personal
computer with vastly more power than any computer of the 1960s!

The board of directors approved our proposal. The system that was purchased
was an IBM 1620 with the necessary card punches and reader plus tape drives.
In addition to statistical and some analytical chemistry applications, Dr. Charles
Rice (then head of the radiochemistry group) and I initiated Lilly’s first com-
puter-based information retrieval system. Through an agreement with the Insti-
tute for Scientific Information (ISI, Philadelphia), Lilly was able to receive
magnetic tapes containing computer-searchable title information on current
scientific journals from ISI every 10 days. Interest profiles of individual Lilly
scientists were then used to generate the famous (or infamous!) “hit” cards that
were distributed to members of the research staff. The cards contained journal
citations to articles matching the recipient scientist’s profile. This service con-
tinued until the advent of electronic literature alerts in the 1980s.

Stemming from my growing interest in and enthusiasm for the potential use of
computed values of atomic and molecular properties in pharmaceutical research,
I was able to gain approval for a requisition for a scientist who knew how to use
computers to determine molecular properties. The person I hired was Dr. Robert
B. Hermann, our first theoretical chemist. It was 1964. He obtained his Ph.D.
with Prof. Norman L. Allinger at Wayne State and then did postdoctoral research
with Prof. Joseph O. Hirschfelder at Wisconsin and with Prof. Peter Lykos at
the Illinois Institute of Technology. When Bob joined us, he brought along a
semiempirical molecular orbital program that he had personally written. He
planned to use this to estimate molecular properties of drug-type molecules, but
Lilly computers were incapable of handling the necessary matrix multiplication
steps. This obstacle was overcome by going outside the company. We were able
to develop a working agreement with the engineering component of Allison
Transmission Division of General Motors to use their IBM 7094 after regular
working hours. Since the system was used only by Allison and Lilly, data security
was not an issue. However, considerable time was spent transporting punched
card decks and printouts between the Lilly Research Laboratories near down-
town Indianapolis and the Allison facility in nearby Speedway, Indiana.

Looking back, it is difficult for me to pinpoint the factors leading to my initia-
tion of the molecular modeling and drug design effort at Lilly. Certainly, the
developments of Prof. Lou Allinger and his associates (at Wayne State and the
University of Georgia) in the 1960s to use calculations to study conformation
played an important part [14]. Similarly, the publishing of an EHT program by
Prof. Roald Hoffmann (Harvard University) in 1963 was a significant impetus.
The introduction of the pi-sigma correlation equation by Prof. Corwin Hansch
(Pomona College) in 1964 added another facet of interest. Also that year, Dr.
Margaret Dayhoff (a theoretical chemist who became the first prominent
woman in what would become the field of bioinformatics and who was at the
National Biomedical Research Foundation in Maryland) published a method
for arriving at the geometry of a polypeptide or protein via internal coordinates
[15]. This methodology also encouraged me to begin thinking about enzyme-
inhibitor interactions and the three-dimensional requirements for molecular

It was not until 1968, when Don Boyd joined us as the second theoretical
chemist in our group, that the computers at Lilly started to reach a level of size,
speed, and sophistication to be able to handle some of the computational
requirements of our various evaluation and design efforts. Don brought with
him Hoffmann’s EHT program from Harvard and Cornell. Due to the length
of our calculations and due to the other demands on the computer, the best we
could obtain was a one-day turnaround.

The preceding years involved not only the Allison agreement (for which we paid
a modest fee) but also later ones with Purdue University (West Lafayette, Indiana)
and Indiana University, Bloomington computing centers. These latter arrange-
ments involved Control Data Corporation (CDC) systems that were much faster
than the IBM 7094. Use of the Purdue computer, which continued after Don
joined our group, involved driving to the near north side of Indianapolis where
the Purdue extension campus was located. In the basement of their science build-
ing was a computer center connected to the CDC 7600 in West Lafayette. Com-
puter card decks of data and the associated program for approximate molecular
orbital calculations could be left with the machine operators. With luck, the card
decks and computer printouts could be retrieved the next day. Security was more
of a problem with the academic facilities because they had a large number of
users. The concern was enhanced when—on one occasion—I received, in addi-
tion to my own output, the weather forecast data and analysis for the city of
Kokomo, Indiana! Even though it was unlikely that anyone could make use of our
information except Bob, Don, or myself, it was a relief to research management
when we were able to carry out all our computations in-house.



These reminiscences cover about the first 15 years of the Lilly computational
chemistry effort. Considering the strong tradition of lead generation emanating
from the organic chemistry group, the idea that molecular modeling could make
a significant contribution to drug design was slow to be accepted. Nevertheless,
enough research management support was found to spark the small pioneering
project and to keep it going in the face of strong skepticism. Regrettably, a
considerable amount of my time in this critical period was spent attempting to
convince management and the scientific research staff of the logic and
significance of these studies. Because we entered the field at a very early stage, a great
deal of effort went into the testing, evaluation, and establishment of the limits
of application of the various computational methods. This kind of groundwork
was not always well understood by the critics of our approach.

In what follows, we review events, trends, hurdles, successes, people, hard-
ware, and software. We attempt to paint a picture of happenings as histori-
cally correct as possible but, inevitably, colored by our own experiences and
memories. The time line is broken down by decade from the 1960s through
the turn of the century. We conclude with some commentary on where the
field is headed and lessons learned. For some of the topics mentioned, we
could cite hundreds of books [16] and thousands of articles. We hope that
the reader will tolerate us citing only a few examples. We apologize to our
European and Japanese colleagues for being less familiar with events at their
companies than with events in the United States. Before we start, we also
apologize sincerely to all the many brilliant scientists who made landmark
contributions that we cannot cover in a single chapter.


We can state confidently that in 1960 essentially 100% of the computational
chemists were in academia, not industry. Of course, back then they were not
called computational chemists, a term not yet invented. They were called
theoretical chemists or quantum chemists. The students coming from those
academic laboratories constituted the main pool of candidates that industry
could hire for their initial ventures into using computers for drug discovery.
Another pool of chemists educated in the use of computers was the X-ray
crystallographers. Some of these young theoreticians and crystallographers
were interested in helping solve human health challenges and steered their
careers toward pharmaceutical work.

Although a marvel at the time, the workplace of the 1960s looks archaic
in hindsight. Computers generally resided in computer centers, where a small
army of administrators, engineers, programming consultants, and support
people would tend the mainframe computers then in use. The computers were
kept in locked, air-conditioned rooms inaccessible to ordinary users. One of
the largest computers then in use by theoretical chemists and crystallographers
was the IBM 7094. Support staff operated the tape readers, card readers,
and printers. The users’ room at the computer centers echoed with the
clunk-clunk-clunk of card punches that encoded data as little rectangular holes in
the so-called IBM cards [see reference 11]. The cards were manufactured in
different colors so that users could conveniently differentiate their many card
decks. As a by-product, the card punches produced piles of colorful rectan-
gular confetti. There were no Delete or Backspace keys; if any mistake was
made in keying in data, the user would need to begin again with a fresh blank
card. The abundance of cards and card boxes in the users’ room scented the
air with a characteristic paper smell. Programs were written in FORTRAN
II. Programs used by the chemists usually ranged from half a box to several
boxes long. Carrying several boxes of cards to the computer center was good
for physical fitness. If a box was dropped or if a card reader mangled some
of the cards, the tedious task of restoring the deck and replacing the torn
cards ensued. Input decks were generally smaller—consisting of tens of
cards—and were sandwiched between JCL (job control language for IBM
machines) cards and bound by rubber bands. Computer output usually came
in the form of ubiquitous pale green and white striped paper (measuring 11
by 14-7/8 inches per page). Special cardboard covers and long nylon needles
were used to hold and organize stacks of printouts.

Mathematical algorithms for common operations such as matrix diagonal-
ization had been written and could be inserted as a subroutine in a larger
molecular orbital program, for instance. Programs for chemistry were gener-
ally developed by academic groups, with the graduate students doing most or
all of the programming. Partly, this was standard practice because the profes-
sors at different universities were in competition with each other and wanted
a better program than their competitors had access to. (Better means running
faster, handling larger matrices, and doing more.) Partly, this situation was
standard practice so that the graduate students would learn by doing. Obvi-
ously, this situation led to much duplication of effort: the proverbial reinvent-
ing the wheel. To improve this situation, Prof. Harrison Shull and colleagues
at Indiana University, Bloomington, conceived and sold the concept of having
an international repository of software that could be shared. Thus was born
in 1962 the Quantum Chemistry Program Exchange (QCPE). Competitive
scientists were initially slow to give away programs they worked so hard to
write, but gradually the depositions to QCPE increased. We do not have room
here to give a full recounting of the history of QCPE [17], but suffice it to say
that QCPE proved instrumental in advancing the field of computational
chemistry including that at pharmaceutical companies. Back in the 1960s and
1970s, there were no software companies catering to the computational chem-
istry market, so QCPE was the main resource for the entire community. As
the name implies, QCPE was initially used for exchanging subroutines and
programs for ab initio and approximate electronic structure calculations. But
QCPE evolved to encompass programs for molecular mechanics and a wide
range of calculations on molecules. The quarterly QCPE Newsletter (later
renamed the QCPE Bulletin), which was edited by Mr. Richard W. Counts,
was for a long time the main vehicle for computational chemists to announce
programs and other news of interest. QCPE membership included industrial
computational chemists.

Finally in regard to software, we note one program that came from the
realm of crystallography. That program was ORTEP (Oak Ridge Thermal
Ellipsoid Program), which was the first widely used program for (noninterac-
tive) molecular graphics [18]. Output from the program was inked onto long
scrolls of paper run through expensive, flat-bed printers. The ball-and-stick
ORTEP drawings were fine for publication, but routine laboratory work was
easier with graph paper, ruler, protractor, and pencil to plot the Cartesian
coordinates of a molecule the chemist wanted to study. Such handmade draw-
ings quantitated molecular geometry. Experimental bond lengths and bond
angles were found in a British compilation [19].
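
As a present-day illustration of the arithmetic behind such hand-drawn plots, the following minimal Python sketch computes a bond length and a bond angle from Cartesian coordinates. The function names and the three-atom coordinates are hypothetical and chosen only for illustration; they are not taken from any program or compilation mentioned in this chapter.

```python
import math


def bond_length(a, b):
    """Distance between two atoms given as (x, y, z) tuples, in angstroms."""
    return math.dist(a, b)


def bond_angle(a, b, c):
    """Angle a-b-c in degrees, with b as the central atom."""
    v1 = [p - q for p, q in zip(a, b)]  # vector from b to a
    v2 = [p - q for p, q in zip(c, b)]  # vector from b to c
    dot = sum(p * q for p, q in zip(v1, v2))
    cos_theta = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


# Hypothetical three-atom fragment (assumed coordinates, in angstroms).
center = (0.0, 0.0, 0.0)
atom_1 = (0.96, 0.0, 0.0)
atom_2 = (0.0, 0.96, 0.0)

print(bond_length(center, atom_1))
print(bond_angle(atom_1, center, atom_2))
```

The same two quantities are what a chemist would have read off a scaled drawing with ruler and protractor.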

Also to help the chemist think about molecular shape, hand-held molecu-
lar models were widely used by experimentalists and theoreticians alike.
There were two main types. One was analogous to stick representations in
which metal or plastic rods represented bonds between atoms, which were
balls or joints that held the rods at specific angles. Metal wire Dreiding
models were among the most accurate and expensive. The other type was
space filling. The expensive CPK (Corey–Pauling–Koltun) models [20, 21]
consisted of three-dimensional spherical segments made of plastic that were
color-coded by element (white for hydrogen, blue for nitrogen, red for oxygen,
etc.). From this convention came the color molecular graphics we are familiar
with today.