Volume 104
Number 4
Winter 2018
Journal of the
WASHINGTON
ACADEMY OF SCIENCES
ISSN 0043-0439 Issued Quarterly at Washington DC
Washington Academy of Sciences
Founded in 1898
BOARD OF MANAGERS
Elected Officers
President
Mina Izadjoo
President Elect
Judy Staveley
Treasurer
Ronald Hietala
Secretary
Lynnette Madsen
Vice President, Administration
Terry Longstreth
Vice President, Membership
Ram Sriram
Vice President, Junior Academy
Paul Arveson
Vice President, Affiliated Societies
Gene Williams
Members at Large
Michael Cohen
Frank Haig, S.J.
Mahesh Mani
Kathe C. Brady
Elizabeth Doyle
Past President
Sue Cross
AFFILIATED SOCIETY DELEGATES
Shown on back cover
Editor of the Journal
Sethanne Howard
Journal of the Washington Academy of
Sciences (ISSN 0043-0439)
Published by the Washington Academy of
Sciences
email: wasjournal@washacadsci.org
website: www.washacadsci.org
The Journal of the Washington Academy
of Sciences
The Journal is the official organ of the
Academy. It publishes articles on science
policy, the history of science, critical reviews,
original science research, proceedings of
scholarly meetings of its Affiliated Societies,
and other items of interest to its members. It
is published quarterly. The last issue of the
year contains a directory of the current
membership of the Academy.
Subscription Rates
Members, fellows, and life members in good
standing receive the Journal free of charge.
Subscriptions are available on a calendar year
basis, payable in advance. Payment must be
made in US currency at the following rates.
US and Canada $30.00
Other Countries $35.00
Single Copies (when available) $15.00
Claims for Missing Issues
Claims must be received within 65 days of
mailing. Claims will not be allowed if non-
delivery was the result of failure to notify the
Academy of a change of address.
Notification of Change of Address
Address changes should be sent promptly to
the Academy Office. Notification should
contain both old and new addresses and zip
codes.
Postmaster:
Send address changes to WAS, Rm GL117,
1200 New York Ave. NW
Washington, DC 20005
Academy Office
Washington Academy of Sciences
Room GL117
1200 New York Ave. NW
Washington, DC 20005
Phone: (202) 326-8975
Editor's Comments  S. Howard
Board of Discipline Editors
Administrative Vice President Report  T. Longstreth
Book Review  G. Byrd
Determine Bullet Trajectory Reconstruction  L. Chance
Reusable Models of Manufacturing Processes  M. Mani et al.
Generating Domain Terminologies  J. Collard et al.
Affiliated Societies and Delegates
EDITOR’S COMMENTS
Presenting the 2018 Winter issue of the Journal of the Washington Academy
of Sciences.
For this issue we have our first (since I can remember) letter to the
editor. I encourage people to write letters to the editor. Please send email
(wasjournal@washacadsci.org) comments on papers, suggestions for
articles, and ideas for what you would like to see in the Journal.
We start with a column by one of our Board members: the
Administrative Vice President. This is a good addition to the Journal and
informative as well. Perhaps such columns can become a regular part of the
Journal.
To follow is a book review of Accessory to War: The Unspoken
Alliance between Astrophysics and the Military by Neil deGrasse
Tyson and Avis Lang.
Next up is a student paper by Lydia Chance from Frederick
Community College. We encourage student papers and help the student to
learn about writing a scientific paper.
Then come two multi-author papers: one on computer search engines for
natural language documents, the other on reusable models of manufacturing
processes.
Every winter we print a list of members and addresses. Please check
to see that you are listed correctly. The Academy covers the greater
Washington DC area including parts of Virginia and Maryland. Most of our
members live in Maryland.
The Journal is the official organ of the Academy. Please consider
sending in technical papers, review studies, announcements, and book
reviews.
We are a peer-reviewed journal and need volunteer reviewers. If you
would like to be on our reviewer list, please send email to the above address
and include your specialty.
Sethanne Howard
Journal of the Washington Academy of Sciences
Editor Sethanne Howard showard@washacadsci.org
Board of Discipline Editors
The Journal of the Washington Academy of Sciences has an 11-member
Board of Discipline Editors representing many scientific and technical
fields. The members of the Board of Discipline Editors are affiliated with a
variety of scientific institutions in the Washington area and beyond —
government agencies such as the National Institute of Standards and
Technology (NIST); universities such as Georgetown; and professional
associations such as the Institute of Electrical and Electronics Engineers
(IEEE).
Anthropology                    Emanuela Appetiti   eappetiti@hotmail.com
Astronomy                       Sethanne Howard     sethanneh@msn.com
Biology/Biophysics              Eugenie Mielczarek  mielczar@physics.gmu.edu
Botany                          Mark Holland        maholland@salisbury.edu
Chemistry                       Deana Jaber         djaber@marymount.edu
Environmental Natural Sciences  Terrell Erickson    terrell.erickson1@wdc.nsda.gov
Health                          Robin Stombler      rstombler@auburnstrat.com
History of Medicine             Alain Touwaide      atouwaide@hotmail.com
Operations Research             Michael Katehakis   mnk@rci.rutgers.edu
Science Education               Jim Egenrieder      jim@deepwater.org
Systems Science                 Elizabeth Corona    elizabethcorona@gmail.com
Letters to the Editor
From: Jeff Bullard, Fellow, WAS
I was disappointed in reading the article contributed by C. Sluzki entitled “The
Impact of Authoritarian Regimes”, in Volume 104, Issue 3, pp. 11-18. That
article contains, among other disturbing passages, the following paragraph
at the top of p. 14:
These are worrisome times. Far right, ethnic-nationalist, populist, racist,
sexist, anti-immigrants (sic), anti-abortion rights, anti-ecological, anti-free
speech, post-facts (post-truth!), authoritarian candidates and governments
are gaining strength world-wide. We are facing a world being
progressively seized by charismatic leaders who may not yet be tyrants with
a simplified polarizing discourse capable of perpetrating enormous
evil. And, even while many of these ideologies didn't triumph
electorally---as happened in some European countries---the effect of their rise has been
that majority of the center parties have moved several inches toward social
intolerance, as a way of capturing a portion of the electorate attracted by
those polarizing discourses.
I am troubled that the author introduced this kind of explicit, inflammatory,
and highly subjective political bias which compromises the veracity of the
rest of the article. Even a cursory survey of modern world history
demonstrates that authoritarian regimes, the ostensible subject of the paper,
do not arise from one particular political ideology as the author asserts. Are
there no far-left regimes that trouble the author? Are far-right governments
the only ones that are anti-free speech or anti-ecological? Are there no far-
left regimes to be found with a tinge of authoritarianism, or is the author
simply untroubled by far-left tactics? Both here and in his earlier “even
more personal vignette” on p. 13, the author reveals a significant bias that
would seriously undermine any attempt at analysis (if there had been any
actual scientific analysis) in the article. I hope I am not the only reader who
thinks that Dr. Sluzki’s article is unfitting content for a journal committed
to scientific discourse instead of sensationalistic political opinions.
Response: Carlos E. Sluzki, MD, Fellow, WAS
I truly appreciate Dr. Bullard’s comments: Criticism is more generous and
constructive than silence!
Dr. Bullard is right in his first point: I could, and perhaps should, have
omitted the words “right wing” from my article (or at least added a footnote
making “also left-wing dictatorships” explicit). It may have then avoided
the assumption that, by focusing on right-wing hegemonies, I was
condoning left-wing ones. I do not. In fact, I agree that, to a greater or lesser
degree, the over-inclusive epithets “ethnic-nationalist, racist, sexist,
anti-immigrant, anti-ecological, anti-free speech, post-fact, authoritarian” may
describe traits of both ends of the political spectrum. While not justifying
my omission, I explain it by the fact that during these past few years there
hasn’t been, to my knowledge, any upsurge of left-wing political extremism
that fits those attributes (with the possible exception of the political scenario
of a couple of former USSR republics, and a few governments in the process
of collapsing, such as Venezuela). In contrast, there has been a notable
expansion of right-wing¹ populism² both in the Americas (the U.S.A.,
Brazil) and in Europe (noticeably Austria, Belgium, Denmark, France, Italy,
Norway, Switzerland,³ and in a more extreme fashion, Hungary and
Poland). Not all of these movements are in control of their country’s
government (the exception being the last two mentioned), but they have
grown remarkably, and dangerously (bringing once again into this discourse
an opinion, albeit fed by the lessons of history, and shared by many; e.g.,
Wodak, 2015).
¹ Right-wing: Defined as an ideology that accepts and supports a system of social
hierarchy or social inequality, with a strong anti-immigrant rhetoric and, broadly
speaking, supporting curtailment of the role of the state, and supporting a neoliberal
economy. Carlisle, R.P. (2005) Encyclopedia of politics: the left and the right, Volume 2.
University of Michigan; Sage Reference. pp. 693 & 721.
² Populism is described by the Cambridge Dictionary as ‘political ideas and activities that
are intended to get the support of ordinary people by giving them what they want’. It
includes the usage label ‘mainly disapproving’.
https://www.cam.ac.uk/news/populism-revealed-as-2017-word-of-the-year-by-cambridge-university-press
³ Datasets: Austrian Legislative Election; Swiss Federal Election, 2011; Norwegian
Parliamentary Election, 2013; Belgian Federal Election, 2011; Danish General
Election, 2011. In European Election Database. Web, 6 Nov. 2013; and
https://en.wikipedia.org/wiki/2018_Italian_general_election
The other issue brought forth by Dr. Bullard, namely, whether a scientific
journal should tolerate opinions, is another matter. It echoes a spurious
territorial dispute about the legitimacy of the use of the label “science”
between mathematics-based “hard” and socio-behavioral (and philosophy,
and history, and...) “softer” disciplines, and between science and the
common language (see, e.g., Bertrand Russell, 1958). Should we erect tall,
beautiful walls between scientific fields, arguing the impurity or
dangerousness of our neighbors, or assume that there are gray zones
between provinces of the field of sciences where rigor and imagination
combine in fuzzy ways, to everybody’s benefit? “If scientific values
recognize a plurality of perspective, freedom of expression and political
negotiation beyond the alliances of the powerful, they would fit with the
values of liberal democracy. But the banner of ‘scientific values’ could
equally be raised by an authoritarian technocracy, in which tacit and
indigenous knowledge is marginalized.” (Hulme, 2009, p.702)
Science is not “out there,” untouched by the values of the scientist and
his/her times. Scientific journals can and should have values visible in their
pages.
REFERENCES
Russell, B. (1958): “The Divorce between Science and ‘Culture’”. An address
delivered on receiving the Kalinga Prize for the Popularization of Science at
UNESCO Headquarters on 28 January 1958. Transcribed in The UNESCO
Courier, Dec. 2001, p. 33.
Hulme, M. (2009): What does applying 'scientific values' mean in reality?
Nature 458: 702.
Wodak, R. (2015): The politics of fear: What right-wing populist discourses
mean. London & Los Angeles: Sage.
Comments from the Vice President for Administrative Affairs

Washington Academy of Sciences
AAAS Building
1200 New York Ave NW
Suite GL117
Washington, DC 20005
202-326-8975
Terry Longstreth
VP Administration
admin@washacadsci.org
In January of 2016, I became Vice President for Administrative Affairs
for the Washington Academy of Sciences. I spent the first year of my
incumbency trying to learn the job and to understand how I would fit
into the operational framework of the academy. Although I started with a
review of the WAS Bylaws, the Academy has a longstanding annual
business cycle and I was inserted into it at about its midpoint. So while I
studied the Bylaws, I had to, perforce, keep the wheels turning (while I
pumped up the tires, so to speak).
To paraphrase Article I of the Bylaws, the two primary purposes of the
Academy are to
• promote the interests of science (small ‘s’, i.e., not the magazine,
although we are grateful to the AAAS and that magazine for
their support) in Washington D.C. and its environs, and
• provide for information sharing and cooperative activities
among the members and affiliated societies of the Academy.
Both purposes are only indirectly influenced by our current
operational environment, as reflected in the tools and procedures (and
the people, volunteers all) we have at our disposal for orienting the WAS
to achieve the goals implied by those purposes.
Furthermore, it has become clear over time that the operational
environment is anachronistic and not particularly responsive to changes
in the Washington science community. I’m in no position to direct or
steer the WAS (and it’s not my job to do so), but I do hope to make the
WAS Administration and its actions more visible to DC science in
general as well as to our membership and affiliates. In the process,
perhaps the WAS will become more engaged and engaging to the
science-related organizations and individuals in our local
Metropolitan Statistical Area.¹
To get started toward this goal, and as titular operations director both
of the Academy and of the Journal of the WAS, it seemed appropriate
that I try my hand writing a column for the Journal. My current intention
is to do this every quarter, but there’s no telling how that intention may
swerve over time. Other options are manifold: this space could be used
for guest essays, or perhaps other members of the Board of Managers
(“the BOM”) will offer contributions. Certainly, the Journal itself would
welcome offerings from the WAS membership. Ultimately this may lead
nowhere, but I hope not.
My job is described in the Bylaws under Article III. To summarize
that Article:
The Admin VP
• is third in rank in the Board of Managers, after the President and
President-elect, and presides over Board of Managers meetings
when the President and President-elect are unavailable;
• manages the business office and is responsible for business
operations of the Academy and the Journal;
• oversees the Office Manager and Editorial Advisory Committee;
• absent someone specifically appointed to the role, acts as
Archivist to maintain the historical records of the Academy.
That’s pretty much what the Bylaws say, and the BOM tries to
follow those rules. The world has changed since these bylaws
were written, however, and there are some rather obvious problems with
my list above.
1. We don’t have an(other) office manager. For the nonce, I’m it.
2. Similarly, we don’t have an Editorial Advisory Committee. Our
Journal Editor, Sethanne Howard, advises herself (and does a
fantastic job). Moreover, if we had an EA committee, I’m not
sure what purpose it would serve, except to give me something
else to do. However, the committee would be useful as a backup
pool of editorial assistants in the event of a resurgence of Journal
submissions.
3. Academy and Journal Operations aren’t actually tied that closely
together except through their respective finances, which are
coordinated between me and the Treasurer.
¹ https://www.bls.gov/regions/mid-atlantic/data/xg-tables/ro3fx9512.htm for the
Bureau of Labor Statistics.
The Office Manager’s role, then, is primarily that of an inward-facing
office, with data management responsibilities for keeping track of the
business cycle (subscriptions and membership data) and, to a lesser
degree, the publishing cycle (e.g., receiving and retaining Journal
overprints).
Outward-facing data dissemination (and related data management)
responsibilities are carried out by several individuals. Currently, our
Journal Editor, Sethanne Howard, prepares the quarterly editions of the
Journal of the WAS and produces our email-based newsletters and
announcements. The Webmaster (a role currently filled by Paul
Arveson, who also serves as the VP of the Junior Academy) controls
our web content and administers the washacadsci.org email domain.
Finally, our social media presence is, at the moment, the responsibility
of our President Elect, Judy Staveley, with guidance and assistance from
Paul Arveson.
The Journal Editor is acknowledged in the Bylaws, but there is no
mention of the Social media, Email administration or Webmaster roles.
Since all of these duties and responsibilities are generally
undocumented, the person in each role must depend upon word of mouth
(from anonymous sources, mostly old timers and former officers of the
BOM) and ‘what feels right’. Ultimately, they must decide for
themselves how to discharge those duties and meet their responsibilities.
So, where is all of this heading? This year, the Washington Academy
of Sciences is 120 years old. As it has aged, it has also evolved and must
continue to do so. It’s old news that the worldwide adoption of
electronic technologies means that disruptive changes to enterprise
business models have challenged organizations of all sizes and
intentions. Our business cycle (the annual cycle of the Academy) begins
each May with the turnover of the new BOM, led by a new President
and President Elect. Each of the other officers may remain in office
indefinitely, subject to their continuing to appear on the annual ballot.
It’s that property of incumbency that allows the presiding officers to be
replaced without sacrificing continuity of knowledge and understanding
of the Academy’s processes.
Establishing and documenting how the Academy can deal with our
changing world is a responsibility we all share. As Admin VP I am
responsible for coordinating the data and office management processes
of the Academy and for projecting how those processes are documented
and shared with the Academy membership. I must also be a collector of
insights into the changes the Academy must undergo to remain relevant
and useful for the DC science community. I find that the one day a week
that I can support this office doesn’t allow much time for an Enterprise
Architecture effort. Such an effort would, I believe, be the expected,
contemporary strategy for an organization to address issues of
transformation in the face of disruptive change. So, I invite all readers of
this Journal in the DC area with free time to travel downtown to
correspond with me about supporting either the Office Management or
Enterprise Architecture efforts. I welcome any suggestions from anyone
as to how best to deal with the situation I’ve described (or provide a
better understanding of the current status of the WAS). My email is:
admin@washacadsci.org
The preceding summary has focused on the current work of the
Administration function as it relates to Office Management functions.
I’ve not said anything about the Archive responsibilities. As a member
of an ISO committee responsible for Digital Archive standards (ISO
14721, and related ISO 16363 and 16919) it’s embarrassing that I’ve not
spent more time on this aspect of the Admin job. My only excuse is that
my ISO focus area (Digital Archives) doesn’t really include the WAS
archives, which are mostly hardcopy. However, if anyone out there has
access to a system to convert paper documents to PDF files, I’d like to
talk to you, too.
Over the course of the next year, I plan to write more about how
the Admin functions are executed and how they support and complement
the activities of the WAS.
Terry Longstreth (AKA Wallace Isaac Longstreth, III)
Book Review
Accessory to War: The Unspoken Alliance between
Astrophysics and the Military
Authors:
Neil deGrasse Tyson and Avis Lang
Norton, 2018 ISBN 978-0-393-06444-5
The popular conception of astronomy and astrophysics is as an “ivory
tower” pursuit. Further understanding of the universe and the processes in
it is considered exploration for its own sake. The authors show that this
is not the case at all. Here is a summary of some areas which are described
in much more detail in the book.
By going back to the pretelescopic era of naked-eye astronomy, the
book describes how, in the royal courts of Middle Ages Europe, the
astronomer was also an astrologer. Horoscopes were cast to determine the
proper dates and times for multiple activities, including war. Accurate
predictions of the positions of the planets, Sun, and Moon were crucial in
casting horoscopes. The development of improved planetary predictions by
famous astronomers such as Copernicus, Tycho Brahe, and Kepler had at its
root the practical motivation to cast better horoscopes. This role
independently originated in separate ancient civilizations such as China,
India, and even the Mayan civilization of Central America. Astronomers
separated themselves from astrology as it became clear that the stars and
planets were so far away as to have little effect on Earth-bound life. One
quibble this reviewer has is that the physical reason for this abandonment
was not clearly explained in the book. The Sun and Moon are exceptions to
this lack of effect through the non-astrological tides and the seasons. One
characteristic of science is “reproducibility”. Venus in European astrology
was the goddess of love, while the Mayans thought of it as a terrible god of
war, which is completely different.
Beyond astrology, astronomy played a crucial role in the age of
colonial empires, from the Renaissance to the 20th century. Columbus’
application of the discovery of the spherical shape of the Earth to the
conquest of the New World is well described in the book. But Columbus
was not the first. The spherical shape was well known to the ancient Greeks
and used in determination of latitude angle from the equator even by the
Vikings via angles of stars and the sun above the horizon. The hard problem
was determination of longitude, solved in the 18th century by the
development of accurate ship-board clocks. Local time found by
astronomical observations was compared to the time at, say, Greenwich,
England, as preserved by the clock. It was now possible to discover the
locations of new lands to settle or conquer and return home afterward. Soon
there were observatories in every major port to study the stars and, more
importantly, accurately determine time. Perhaps this gave astronomers
alternative employment to casting horoscopes! The book has an excellent
description of the other indispensable tool of sailors, the compass. World-
wide observations of deviations of the compass from true north were made
by astronomers such as Edmund Halley. The book makes clear that for
better or worse (trade of new products or the slave trade), astronomy played
a crucial role in the age of sea-based empires.
Then turning to the telescope, one thinks of great astronomical
discoveries such as Galileo’s first great observations of craters on the Moon
or the satellites of Jupiter. However, this book makes a detailed case that
the telescope was seen as an instrument of war from the very start.
Galileo himself promoted its use to identify, for example, distant enemy
ships. Beyond simple optical telescopes, a crescendo is described of
refinements resulting in instruments today that would be unrecognizable as
telescopes to an old-fashioned optical astronomer. Today telescopes use not
only the invisible electromagnetic wavelengths of UV, infrared, and radio,
but even gravitational waves from merging black holes and neutrinos from
the cores of active galaxies.
The latter part of the book is a very detailed description of what I
shall call the modern-day weaponization of space. Expenditures for these
hidden activities are very large compared to the much better known
scientific explorations. Thankfully, the 1963 Limited Nuclear Test Ban
Treaty has led to the exclusion of nuclear weapons in space thus far. Related
to the Test Ban Treaty, in an interesting transfer from the military to
astronomy, a military satellite detected gamma rays thought to be from
treaty breaking nuclear tests. There was great concern until astronomers
were able to verify that the rays were not from the Earth but from other
galaxies. Thus was born a new area of astronomy.
Despite the Test Ban Treaty, other types of non-nuclear hostile
weapons such as hunter-killer satellites have been tested by the United
States and China and are now being developed by other countries. The
recent talk of a United States space force is merely a combination under one
command of many already existing efforts. Today, a worrisome impact
threat to orbiting satellites is “space junk”: debris from exploded satellites
or other space activities. Today, we are so dependent on satellites from
communication to GPS navigation that a flood of debris and attacks from
even a non-nuclear space war would, in the words of the book’s authors,
“be terrifying.” Today, worry about these terrible effects has resulted in a
stalemate of sorts. In addition to diplomacy, the authors hope that education
and better scientific understanding may avert a terrible future.
As a counterpoint to this book’s theme, recent astronomical
research has revealed beauty and a story which the general public seems to
value for itself with no military benefit. An example of the beauty revealed
is the famous “Pillars of Creation” photograph by the Hubble Space
Telescope of glowing gas clouds and forming stars, beautiful even to those
who do not know what is happening in the photo. As a result of such images
a successful campaign was launched to keep the Space Telescope in
operation. Another scientific trend of no foreseeable military benefit is
the revelation that there is a story connecting us personally to a sequence of
events reaching to the origin of the universe. For example, the iron in our
blood was created in exploding supernovas, and the hydrogen in the water
in our blood was created in the Big Bang itself.
In a final note, this book is very detailed with footnotes making up
a significant portion of the book. Probably it should be read in smaller
chapter-by-chapter doses rather than straight through. There is a trap (into
which this reviewer has fallen personally) of having extensive knowledge
of a subject which is all presented in an overwhelming manner. Although
one of the co-authors is an editor, this book needed a good editor to create
a version emphasizing the most important facets in a more digestible form
for the lay reader. I would recommend this book to a layperson who is
already well read in astronomy or space science.
Gene G. Byrd, Professor Emeritus, University of Alabama
Analyzing the Accuracy and Effectiveness of the EVI-PAQ
Trajectory Laser to Determine Bullet Trajectory
Reconstruction
Lydia Chance
Frederick Community College
Abstract
By following the written protocol on the use of a trajectory laser pointer,
we weighed its benefits when applied to a crime scene investigation. We
followed the standard protocol listed in the EVI-PAQ Trajectory kit and
demonstrated finding the angles of trajectory of blood droplets and/or
bullet holes. There are several methods to find an angle of trajectory, such
as using protrusion rods or a string alongside a protractor. Evidence
gathered is only as useful as the photographs taken to document it.
Therefore, ensuring that all photos taken are clear is a necessity. This
experiment in recreating a crime scene emphasizes the usefulness of the
trajectory laser and provides an in-depth review of its use in criminal
justice settings.
Introduction
ALTHOUGH THERE ARE ACCURATE WAYS to reconstruct the pathway a
bullet took through the air upon firing, I demonstrate the accuracy and easy
maneuverability of modern equipment such as a trajectory laser compared
with the use of string and protrusion rods alone. Implementing this modern
method of visualization in a crime scene recreation provides an invaluable
experience by showing the precision of the current techniques being used by
crime scene investigators today. Using the EVI-PAQ Trajectory Kit, one can
determine the point of origin of a shooter based on the angle of a bullet hole.
This laser kit contains several methods to determine the angle at which a
bullet struck a surface.
The reconstruction of bullet trajectory is often the last step in
recreating a crime scene, but that does not make it any less important than
collecting other forms of evidence. There are crime scene investigators who
work specifically in ballistics and specialize in interpreting the data
gathered by the trajectories and then speculating where the shot originated.
Often, the bullet trajectory will tell where a shooter was standing, and it is
reliable within the first 50 yards of travel for the bullet without having to
account for other variables such as gravity, air resistance, and yaw.
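The straight-line assumption described above can be illustrated with a short calculation. What follows is a minimal sketch and not part of the EVI-PAQ protocol: the function names, the impact and muzzle heights, and the 5-degree angle are hypothetical values chosen for illustration, and the 50-yard limit is simply the figure quoted in the text.

```python
import math

# Assumption taken from the text: a straight-line reconstruction is treated as
# reliable only within the first 50 yards of travel.
MAX_STRAIGHT_LINE_M = 50 * 0.9144  # 50 yards expressed in metres


def horizontal_distance_to_origin(impact_height_m, muzzle_height_m, downward_angle_deg):
    """Back-project a straight line from a bullet hole to an assumed muzzle height.

    downward_angle_deg is the angle below horizontal at which the bullet was
    travelling when it struck the surface (a hypothetical measured value).
    Gravity, air resistance, and yaw are deliberately ignored.
    """
    if not 0 < downward_angle_deg < 90:
        raise ValueError("angle must be strictly between 0 and 90 degrees")
    rise = muzzle_height_m - impact_height_m
    if rise <= 0:
        raise ValueError("a descending bullet implies the muzzle was above the impact point")
    # Right triangle: tan(angle) = rise / horizontal distance
    return rise / math.tan(math.radians(downward_angle_deg))


def within_straight_line_range(distance_m):
    """True if the distance falls inside the 50-yard straight-line limit."""
    return distance_m <= MAX_STRAIGHT_LINE_M


# Hypothetical scene: hole 1.0 m above the floor, muzzle assumed at 1.5 m,
# bullet descending at 5 degrees when it struck the board.
distance = horizontal_distance_to_origin(1.0, 1.5, 5.0)
print(f"Estimated horizontal distance to shooter: {distance:.2f} m")
print("Within straight-line range:", within_straight_line_range(distance))
```

The muzzle height must be assumed or estimated separately; the measured angle alone fixes only the line along which the shooter stood.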
This evaluation consists of three phases: setting up the trajectory
laser using the components of the EVI-PAQ kit; demonstrating the ease of
use; and applying the equipment to determine the trajectory of a bullet hole.
Methods
1. Acquire the Materials
1.1. The EVI-PAQ Kit mandates the use of the following materials and
equipment. The kit contains several methods for testing the
trajectory of a bullet hole; however, the laser will be used for this
experiment.
1.1.1. Trajectory rod kit with trajectory laser pointer
1.1.2. Protrusion rods
1.1.3. Protractor or angle finder
1.1.4. Reflective card
1.1.5. Camera with adjustable aperture and exposure
1.1.5.1. One may require the use of photographic fog if the
area used for the laser is not dark enough or if there are
not enough small particles off which the laser could
reflect in midair.
1.1.6. Tripod
1.1.7. Wood board prepped with bullet holes
1.1.7.1. Bullet holes must be wide enough for the protrusion
rod; .22 caliber bullets may be too small
2. Photographing the Scene and Bullet Holes Before the Trajectory System
is Placed
2.1. Photographs of the scene must be taken before any obstruction
contaminates the crime scene.
2.2. Photographs of the bullet holes taken from each side with a scale
must be acquired.
2.2.1. Consider photographing the entire affected area if the bullet
holes are spread throughout on the same surface, then take the
close-up images with the scale.
2.2.2. Photographs should be taken from all angles.
2.2.2.1. Photograph the bullet holes from an angle
perpendicular to the hole, parallel to the surface on the
horizontal, and parallel to the surface from above if space
permits.
3. Preparing the EVI-PAQ Trajectory Laser and Protrusion Rods
3.1. The laser fastens to the end of the protrusion rod by screwing the
threaded end piece into the end of the rod. If necessary, one may
tighten a fastener to the laser and the rod to ensure stability.
3.2. Place the board with the bullet holes upright so that it is supported
on a flat surface.
3.2.1. The angle of the bullet hole in the board should lead the laser
to a point of origin that is non-reflective to avoid error in the
calculation of the angle.
3.3. Set the protrusion rod into the first bullet hole and push through
until the rod rests on the flat surface and its balance stabilizes.
3.4. Fasten the protrusion rod to the surface of the board if necessary.
3.5. Photograph the protrusion rod, after it is stabilized, from several
angles.
4. Lighting and Camera Settings
4.1. The lights in the room containing the laser must be shut off so that
the beam can be seen clearly once it is turned on. Unless the scene is
outdoors and can be photographed at night, the lights must be off.
4.2. The camera should be prepared to take the photographs of the laser.
4.2.1. Use a long exposure to ensure the light of the laser is shown
clearly
4.2.1.1. An exposure may last up to three minutes to gather
the largest amount of light possible for clear photographs.
4.2.2. The tripod should be placed to capture the entirety of the
laser’s path.
4.2.2.1. The laser can project up to 5,000 feet but other
aspects of trajectory must be considered for any distance
greater than 50 yards and must be addressed during
calculations of an origin point.
4.2.3. The timer on the camera may be set to take the photograph,
avoiding any shake that pressing the capture button might
cause.
4.2.3.1. Two seconds should give ample time for the camera
to steady itself on the tripod after being pressed.
5. Turning on the Laser and Documenting Angles
5.1. Turn on the laser
5.2. Hit the capture button on the camera
5.3. Wait until the exposure stops before moving any aspect of the
scene.
5.3.1. Moving the camera while the shutter is still capturing the
light will result in a blurred photograph containing a light trail
of the laser beam and will have to be redone.
5.4. Use a protractor to measure the angles
5.4.1. To produce an angle of trajectory, there must be two points
from which to measure. The entrance and exit can be used to
measure the angle of impact in thick materials.
5.4.1.1. The bullet hole may also aid in producing the angle.
5.4.2. The laser will point to the third point necessary for
determining the point of origin or may pass through the origin
and continue on if the scene is large enough.
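The two-point measurement in step 5.4.1 can also be sketched numerically. The example below is an illustration under stated assumptions rather than part of the kit's protocol: it assumes the entrance and exit holes have been located in one coordinate frame (x across the board, y up, z through the board in the direction of travel, all in the same unit), and both function names are hypothetical.

```python
import math

def trajectory_angles(entry, exit_):
    """Return (horizontal, vertical) angles in degrees from the board's normal.

    entry and exit_ are (x, y, z) positions of the entrance and exit holes.
    """
    dx, dy, dz = (b - a for a, b in zip(entry, exit_))
    if dz <= 0:
        raise ValueError("exit hole must lie beyond the entrance hole along z")
    horizontal = math.degrees(math.atan2(dx, dz))  # left/right deviation
    vertical = math.degrees(math.atan2(dy, dz))    # up/down deviation
    return horizontal, vertical

def point_along_path(entry, exit_, distance_back):
    """Point on the straight line, distance_back units before the entrance."""
    direction = [b - a for a, b in zip(entry, exit_)]
    length = math.sqrt(sum(c * c for c in direction))
    if length == 0:
        raise ValueError("entrance and exit holes coincide")
    return tuple(a - distance_back * c / length
                 for a, c in zip(entry, direction))
```

Extending the line backward with `point_along_path` plays the same role as the laser beam: it shows where the straight path leads, not where along that path the shooter stood.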
Figures 1 through 8 illustrate the various steps taken for documentation
and the methods applied to reconstruct the bullet trajectory.
Figure 1 The wood board is photographed from multiple angles with a scale
(white strip) to document the bullet holes.
Figure 2 The protrusion rod is positioned in the bullet hole with the trajectory laser
attached to the end.
Figure 3 The end of the protrusion rod has a bullet tip mounted and it is placed
into the wood board to begin the angle recreation process.
Figure 4 The camera is placed to capture the entire length of the laser's path.
Figure 5 The second cluster of holes had three entry and exit holes.
Figure 6 Bullet hole C was measured from multiple angles.
Figure 7 The third grouping had two bullet holes.
Figure 8 The bullet holes within the third cluster were too small for the
protrusion rod to pass through and must have the angle of trajectory measured
with string and thinner protrusion rods than the ones in the kit.
Conclusion
By using a trajectory laser kit the point of origin is easier to
visualize, and the angle of trajectory can be measured. There was an
instance when the .22 ammunition hole did not allow the protrusion rod to
pass through; therefore, using the trajectory laser was not possible for this
example. Using a protrusion rod connected to the trajectory laser makes it
easier to measure the angle of trajectory from the wood and allows for clear
documentation of the angle with a protractor. Although the trajectory laser
is useful over long distances, it is often difficult to see outdoors or in well-
lit areas. The stability of the trajectory laser depends on the user holding the
camera button down and often results in wobbling as the exposure of the
camera starts. If the room is dark enough, the exposure of the camera can
be modified to let the correct amount of light in and still reflect the green of
the laser’s light through a white reflective card showing the position of the
laser in the scene. There is no way to determine exactly where the shooter
was standing because the laser will shoot from the endpoint of the bullet
hole to whichever hard surface it next comes in contact with. Further speculation
allows crime scene analysts to determine the ultimate position of the shooter
by accounting for all the information gathered in the crime scene
reconstruction.
References
Saferstein, R. (2016). Forensic Science from the Crime Scene to the Crime
Lab. Hoboken, NJ: Pearson Education.
Staveley, J. (2015). An Introduction to Forensic Science BI 130
Laboratory Manual. Sagamore Beach, MA: Academx Publishing
Services, Inc.
Tomboc, Ricardo. Personal communication. San Bernardino Police
Department Identification Bureau, Identification Technician II.
Bio
Lydia Chance is a full-time student at Frederick Community College and
is currently majoring in general studies. She graduated from Middletown
High School and began attending classes at FCC in the fall of 2018. While
taking forensic biology as an honors course, Lydia was mentored by Dr.
Judy Staveley for her individual project featuring the use of a trajectory
laser kit. After earning an AA through the Honors College at FCC, she
plans to major in forensic psychology.
Reusable Models of Manufacturing Processes for
Discrete, Batch, and Continuous Production
Mahesh Mani¹, K.C. Morris², Kevin W. Lyons², William Z. Bernstein²
¹Allegheny Science and Technology, ²NIST
Abstract
This article explores the new ASTM E3012-16 International Standard Guide
for Characterizing Environmental Aspects of Manufacturing Processes, its
application and potential impact in the manufacturing industry. The standard
provides guidance for industries to examine unit manufacturing processes,
capture characteristics in terms of how they impact the environment, and
explore opportunities to be efficient and sustainable in their operations. The
standard further encourages formal representations for consistent and effective
deployment of manufacturing tools and reuse of data and information models
for automated analysis.
Introduction
TO REMAIN COMPETITIVE manufacturers today seek to improve
productivity while maintaining quality and meeting sustainability
objectives. With the manufacturing sector consuming a large percentage of
our national resources, smart manufacturing and sustainable manufacturing
implementations through process optimization hold tremendous potential
for improvement.¹,²,³,⁴ Being cognizant of the production improvement
opportunities is key to success. But where do we start? Starting at the
process level poses an opportunity — an opportunity to improve process
performance through the meticulous understanding of selected processes.
1 Mani, M., Madan, J., Lee, J. H., Lyons, K. W., & Gupta, S. K. (2013). Review on
Sustainability Characterization for Manufacturing Processes. National Institute of
Standards and Technology, Gaithersburg, MD, Report No. NISTIR 7913.
2 Haapala, K.R., Zhao, F., Camelio, J., Sutherland, J.W., Skerlos, S.J., Dornfeld, D.A.,
Jawahir, I.S., Clarens, A.F. and Rickli, J.L., 2013. A review of engineering research in
sustainable manufacturing. Journal of Manufacturing Science and Engineering, 135(4),
p.041013.
3 Stephan Mohr, Ken Somers, Steven Swartz, and Helga Vanthournout, Manufacturing
resource productivity, McKinsey Quarterly, June 2012.
4 https://itif.org/publications/2018/11/28/innovation-agenda-deep-decarbonization-bridging-gaps-federal-energy-rdd
Eventually, these individual opportunities can be harnessed at a systems
level where multiple manufacturing processes work in concert.
Characterization of process-level activities can empower better engineering
at higher levels of manufacturing automation and control. These control
levels are described in the widely acknowledged enterprise-to-control
system hierarchy (ISA 95⁵). Besides this, the ISO 14000 family⁶ of
environmental management standards is useful for developing a
management approach to sustainability and retroactively comparing the
impacts of different comparable products. However, specific guidance for
manufacturers to characterize individual processes and identify
opportunities for improvement can be an added advantage. To provide such
guidance for industries to examine basic manufacturing processes (a.k.a.
unit manufacturing processes), ASTM International⁷ issued a set of
standards, including E2979-18⁸, E2986-18⁹, E2987-18¹⁰, E3012-16¹¹, and
E3096-18¹². These standard guidelines help manufacturers scrutinize and
capture the characteristics of individual processes in terms of how they
impact the environment, and look for opportunities to be more sustainable
in their operations.
This article specifically explores the new ASTM E3012-16
International Standard Guide for Characterizing Environmental Aspects of
Manufacturing Processes¹³ and its consideration for use with discrete,
batch, and continuous production. The standard provides guidance for
industries to examine unit manufacturing processes, capture the
5 https://www.isa.org/isa95/
6 https://www.iso.org/iso-14001-environmental-management.html
7 https://www.astm.org/
8 ASTM International (2018). E2979-18: Standard Classification for Discarded Materials
from Manufacturing Facility and Associated Support Facilities.
9 ASTM International (2018). E2986-18: Standard Guide for Evaluation of
Environmental Aspects of Sustainability of Manufacturing Processes.
10 ASTM International (2018). E2987/E2987M-18: Standard Terminology for
Sustainable Manufacturing.
11 ASTM International (2016). E3012-16: Standard Guide for Characterizing
Environmental Aspects of Manufacturing Processes.
12 ASTM International (2018). E3096-18: Standard Guide for Definition, Selection, and
Organization of Key Performance Indicators for Environmental Aspects of
Manufacturing Processes.
13 https://www.astm.org/E3012-16.htm
characteristics of those processes in terms of how they impact the
environment, and look for opportunities to be more sustainable in their
operations and improve their efficiency. The standard also encourages
standard representations for consistent and effective deployment of
manufacturing tools and reuse of data and information models.
Current Gaps and Potential for Standards
Several workshops¹⁴,¹⁵ facilitated by the National Institute of
Standards and Technology (NIST) across the U.S. have reiterated the
viewpoint that gaps exist in terms of measurement capabilities to connect
sustainable manufacturing practices with the promotion of resource
efficiency. Today’s practices for sustainability-related analysis for products
do not explicitly account for individual manufacturing processes. Current
practices fall short in promoting a science-based understanding of
individual processes critical for their performance improvement and
decision making.¹⁶,¹⁷ Formal methods for collection and consolidation of
sustainability-related information on manufacturing processes are lacking.
The measurement science—including methods for process
description, performance metrics, and a corresponding information base for
unit manufacturing processes—will allow for a more consistent evaluation
of sustainability performance across manufacturing systems. Providing the
science in the form of best practices is a goal for the ASTM International
standard.
14 M. M. Smullin; K. R. Haapala; M. Mani; K.C. Morris. “Using industry focus groups
review to identify Challenges in sustainable assessment theory and practice.” ASME
International Design and Engineering Technical Conferences & Computers and
Information in Engineering Conference, Charlotte 2016.
15 W.Z. Bernstein et al., 2018. “Research directions for an open unit manufacturing
process repository: A collaborative vision,” Manufacturing Letters, 15 (B), pp.71-75.
16 Mani, M., Madan, J., Lee, J. H., Lyons, K. W., & Gupta, S. K. (2014). Sustainability
characterization for manufacturing processes. International Journal of Production Research,
52(20), 5895-5912.
17 Duflou, J.R., Sutherland, J.W., Dornfeld, D., Herrmann, C., Jeswiet, J., Kara, S., Hauschild, M.
and Kellens, K., 2012. Towards energy and resource efficient manufacturing: A processes and
systems approach. CIRP Annals-Manufacturing Technology, 61(2), pp.587-609.
ASTM International Standards on Sustainability
ASTM International is a global leader in the development of
voluntary consensus standards. ASTM International formed the E60.13
Subcommittee on Sustainable Manufacturing to guide industry in best
practices to inform sustainability-related decisions. More information on
the standards published through this committee can be accessed from the
committee website.¹⁸ The E60.13 E3012-16 standard defines a
methodology to develop unit manufacturing process (UMP) information
models. The standard contributes to the measurement science needed to
quantify sustainable manufacturing practices to the benefit of industrial
competitiveness. Standard methods for describing the environmental
choices that a manufacturer makes allow them to improve their practices
and to differentiate themselves from the competition.
Application of the standard benefits manufacturing practices in two
ways. First, it raises consciousness about manufacturing processes, their
environmental impacts, and opportunities for their improvement. The goal
of applying the standard is to improve the environmental aspects of the
process through the definition of key performance indicators specific to an
individual process addressing potential enterprise level goals. Establishing
that rigor sets the stage for better informed decision-making and production
planning. Secondly, the use of standard practices and formal representation
methods poises manufacturers for transition into scientific modeling and
decision making and, as a result, leads to more sustainable systems.
The new ASTM standard provides guidance to help manufacturers
effectively understand processes, capture process characteristics in terms of
environmental impact, and identify opportunities for improvement.
Characteristics of a process imply descriptions of what goes into and out of
the process, how the process transforms its inputs to outputs, and what types
of information are used in the transformation. The standard format defined in
ASTM E3012-16 provides a basis for ensuring that a specific set of details is
defined and that they are covered in a consistent manner. See Figure 1. In this way, the
standard offers a method to generate reusable constructs (UMP information
models) that provide a structured way of both understanding and specifying
18 https://www.astm.org/COMMIT/SUBCOMMIT/E6013.htm
unit manufacturing processes. Such constructs presented in an abstract and
precise manner can be parameterized and reused in different application
contexts like information processing, simulation, and analysis. The standard
makes for better comparisons, increased reuse, and, in the end, more reliable
results.
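The idea of a parameterized, reusable construct can be made concrete with a small sketch. The class below is a toy container and not the schema defined by ASTM E3012-16: the class name, field names, the sawing example, and its yield value are all illustrative assumptions. It records what goes in, what comes out, the resources used, the process information, and the transformation that links inputs to outputs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Values = Dict[str, float]

@dataclass
class UnitManufacturingProcess:
    """Toy UMP description: inputs, outputs, resources, info, transformation."""
    name: str
    inputs: List[str]
    outputs: List[str]
    transformation: Callable[[Values, Values], Values]
    resources: List[str] = field(default_factory=list)
    process_info: Values = field(default_factory=dict)  # tunable parameters

    def run(self, input_values: Values) -> Values:
        """Apply the transformation after checking the declared interface."""
        missing = set(self.inputs) - set(input_values)
        if missing:
            raise ValueError(f"missing inputs: {sorted(missing)}")
        result = self.transformation(input_values, self.process_info)
        undeclared = set(result) - set(self.outputs)
        if undeclared:
            raise ValueError(f"undeclared outputs: {sorted(undeclared)}")
        return result

def cut(inputs: Values, info: Values) -> Values:
    """Hypothetical transformation: split material into product and waste."""
    product = inputs["material_kg"] * info["yield_fraction"]
    return {"product_kg": product, "waste_kg": inputs["material_kg"] - product}

sawing = UnitManufacturingProcess(
    name="sawing",
    inputs=["material_kg", "energy_kwh"],
    outputs=["product_kg", "waste_kg"],
    transformation=cut,
    resources=["band saw"],
    process_info={"yield_fraction": 0.75},  # invented value for illustration
)
```

Reuse in a different context then amounts to building another instance with different `process_info` while keeping the same declared inputs and outputs.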
[Figure 1 image not reproduced. Labels in the diagram: Physical World; Digital World; Product and Process Information; Resources; Transformation; Outputs; Communication; Optimization; Simulation / Design of Experiments; Life Cycle Assessment; Graphical and formal representations.]
Figure 1. Overview of the significance and use of this standard. UMPs store digital
representations of physical manufacturing assets and systems to enable engineering
analysis, e.g., optimization, simulation, and life cycle assessments.
Potential Impact
ASTM E3012-16 is a good starting point for creating reusable
descriptions of manufacturing processes that will ultimately realize process
analytics and tool integration. In addition to systematic characterizations of
processes, the formal representations for those characterizations support the
direct use of the information within a variety of applications. The most basic
application is to support effective communication by ensuring consistency
and completeness. More advanced applications include computational
analytics and comparison of performance information. The formal
information model described in the guide facilitates new software tool
development to link manufacturing information and analytics for
calculating environmental performance measures. Further, the standard
format paves the way for more specific software tools supporting the
development and extension of standardized data and information bases such
as Life Cycle Inventory (LCI). LCI data is extensively used in life cycle
assessments (LCA), part of the ISO 14000 family of standards. The top-down
approach of the ISO 14000 family and the bottom-up measurement
approach from ASTM standards are complementary.
Formally defined UMP models can cater to the information needs of
different users from a variety of perspectives. For example, using the standard,
• a variety of stakeholders, e.g., plant managers, process engineers,
technicians and operators, can better understand and communicate
manufacturing processes through consistent and tailored views of
the model;
• manufacturing engineers can develop system models from the unit
manufacturing processes by linking them together to characterize
specific production plans for discrete, batch, or continuous
production;
• systems integrators can use models of manufacturing processes to
understand material and information flows; and
• manufacturers can capture their own data for LCA-based
environmental assessments by developing data sets representing the
environmental impacts of their unit processes, complementing and
sharpening LCI data sets.
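The linking of unit processes into a production plan, mentioned in the list above, can be illustrated with a short sketch. It assumes each process is a plain function that takes and returns a dictionary of flows; the helper names, process names, yields, and energy figures are invented for illustration and do not come from the standard.

```python
from typing import Callable, Dict, List

Flows = Dict[str, float]
Process = Callable[[Flows], Flows]

def make_process(name: str, yield_fraction: float, kwh_per_kg: float) -> Process:
    """Build a toy unit process that turns material into product and waste."""
    def process(flows: Flows) -> Flows:
        material = flows["material_kg"]
        product = material * yield_fraction
        return {
            # The product of this step is the material input of the next one.
            "material_kg": product,
            "waste_kg": flows.get("waste_kg", 0.0) + (material - product),
            "energy_kwh": flows.get("energy_kwh", 0.0) + material * kwh_per_kg,
        }
    process.__name__ = name
    return process

def run_line(processes: List[Process], material_kg: float) -> Flows:
    """Chain the processes in order and return the accumulated flows."""
    flows: Flows = {"material_kg": material_kg}
    for process in processes:
        flows = process(flows)
    return flows

line = [make_process("casting", 0.5, 2.0), make_process("machining", 0.5, 4.0)]
totals = run_line(line, 8.0)
```

The accumulated `waste_kg` and `energy_kwh` values stand in for the kind of system-level indicators that can be rolled up once individual process models share a common interface.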
In a related work, the authors explored the use of the standard with
three use cases in the pulp and paper industry. The case studies showed the
utility of the draft standard as a guideline for composing data to characterize
manufacturing processes. The data, besides being useful for descriptive
purposes, was used in a simulation model to assess sustainability of the
manufacturing system.¹⁹,²⁰
Scope of the Current Standard and Beyond
Leveraging unit process models is by no means a new idea to
continuous process industries, such as the Chemical Industry. For nearly a
century, mathematical representations of “unit operations,” such as
19 Mani, M., Larborn, J., Johansson, B., Lyons, K., & Morris, KC. (2016). Standard
representations for sustainability characterization of industrial processes. Journal of
Manufacturing Science and Engineering.
20 Rebouillat, L., Barletta, I., Johansson, B., Mani, M., Bernstein, W.Z., Morris, K.C. and
Lyons, K.W., 2016. Understanding sustainability data through unit manufacturing
process representations: a case study on stone production. Procedia CIRP, 57, pp.686-691.
filtration, evaporation, humidification and distillation, have been derived
for controlling both small-scale plants and industrial installations.²¹
Considering the longevity of the unit process-based approaches in Chemical
Engineering²², the authors envision its direct relevancy to the process
industry and beyond. The hope is that the formal characterization of
UMPs across diverse industries will enhance existing analysis
frameworks, such as improving the precision of life cycle assessment, a
method that still is burdened with significant uncertainty.²³ ASTM E3012-16
is designed to be relevant across different production types, including
discrete, batch, and continuous. The standard provides a fundamental
representation to support unit manufacturing processes in all of these
production settings. Characterizing the bounds of each unit manufacturing
process drives insight into each process’s functional characteristics.
The current standard is a first step to facilitate studies of existing
processes and to make those studies more accessible in the future. It can
serve as the basis for the development of production system models to better
understand process flows and interactions between and across different
processes. A repository of UMP models can be used for planning both
retrofits of existing facilities and new facilities. Designs for new facilities are
almost always based on prior experience with operating processes and
realistic models should prove useful especially for verification and
validation activities.
The perceived scientific benefits to manufacturers from application
of the standard include reduced operational costs, improved prediction of
product costs, improved schedule, maximization of manufacturing
resources, improved control of product quality, and incorporation of best
practices. Modeling individual manufacturing processes facilitates the
generation of quantifiable evidence that improvements are being made. The
standard provides a uniform and repeatable way for more practitioners to
reap these benefits.
21 Walker, W.H., Lewis, W.K. and McAdams, W.H., 1923. Principles of chemical
engineering. London: McGraw-Hill Publishing Co
22 Turton, R., Bailie, R.C., Whiting, W.B. and Shaeiwitz, J.A., 2008. Analysis, synthesis
and design of chemical processes. Pearson Education.
23 Jacquemin, L., Pontalier, P.Y. and Sablayrolles, C., 2012. Life cycle assessment (LCA)
applied to the process industry: a review. The International Journal of Life Cycle
Assessment, 17(8), pp. 1028-1041
The standard will be of interest to software providers across
industries interested in providing analysis and modeling/simulation
solutions to manufacturers. The standard format promotes information
exchange and communication through digitalization of manufacturing
assets for decision making purposes. Moving forward, with contribution
from industries, future standards can encompass a broader set of processes
and functionalities using ASTM E3012-16 as a platform on which to build.
Further, the creation of a repository of models should reduce modeling time
and improve model verification and validation activities. The creation of a
repository of models also provides a forum for industries to come up with
best practices and target sets of UMP models for common processes as
reference data.²⁴
Future Work and Conclusions
Because the standard is relatively new and was co-developed by supportive
manufacturers, the ASTM task group is now seeking more participation
from across industries, especially SMEs, to demonstrate and further
improve the standards. The standard has already received some attention
and efforts are underway to spread the word. Much of the vision for the
work will require further research and future standards based on real world
experience.²⁵ UMP-focused industrial case studies are of interest to the task
group. NIST has already hosted two competitions, and will host a third, to
apply the standard to existing process models.²⁶ This resulted in a diverse
set of models and focused attention within the educational world. To realize
the promise of reusing such models and automating analytics and system
integration for manufacturing, significant research challenges remain,
including advancements in the following areas:
• Knowledge and understanding of UMP modeling. This includes
novel formal representations and methodologies, more accurate or
specialized metrics, metric representations that support cascading to
24 W. Z. Bernstein; M. Mani; K. W. Lyons; K.C. Morris; B. Johansson. “An Open Web-Based
Repository for Capturing Manufacturing Process Information.” ASME
International Design and Engineering Technical Conferences & Computers and
Information in Engineering Conference, Charlotte 2016.
25 W.Z. Bernstein et al., 2018. “Research directions for an open unit manufacturing
process repository: A collaborative vision,” Manufacturing Letters, 15 (B), pp.71-75.
26 https://www.nist.gov/news-events/events/2018/01/ramp-reusable-abstractions-manufacturing-processes
higher production levels, or exploration of variations for families of
UMP models.
• Standards supporting model reuse. This includes automated
methods that allow linking of UMP models into systems, facilitating
system composition through naming conventions or other methods,
generalization that unifies a collection of processes, or standards-based
methods for integration with applications.
• Techniques for development and validation of UMP models. This
includes demonstration of validation techniques for the effectiveness
and accuracy of the UMP models, techniques for producing useful
derivatives of UMP models, or creative methods for mining
documentary model descriptions into formal representations.
As more groups apply the standard in their domains, the shared
experience will provide a basis on which to further understand
standardization needs and opportunities. Formal methods for acquiring and
exchanging information about manufacturing processes will lead to
consistent characterizations and help establish a collection of reusable
models. Standardized methods will ensure effective communication of
computational analytics and sharing of sustainability performance data.
NIST is also looking for manufacturers to collaborate on pre-pilot projects
to contribute to the collection of use cases for the standard. In conclusion,
the use of a reusable standard format should result in models suitable for
automated inclusion in a system analysis, such as a system simulation model
or an optimization program.
Bios
Mahesh Mani is a Senior Technology Adviser with Allegheny Science and
Technology supporting the Advanced Manufacturing Office of the
Department of Energy. His research interests include smart, sustainable and
additive manufacturing.
KC Morris leads a group at the National Institute of Standards and
Technology focused on standards to infuse smart technologies into the
manufacturing sector while ensuring that new practices lead to more
competitive and sustainable manufacturing. Currently, KC is on detail to
the US House of Representatives serving as an ASME Manufacturing
Fellow.
Kevin W. Lyons recently retired from the National Institute of Standards
and Technology. His research interests include sustainable manufacturing,
nano manufacturing, design, process modeling, assembly, virtual assembly,
and additive manufacturing technologies.
William Z. Bernstein is a research engineer at the National Institute of
Standards and Technology. Dr. Bernstein currently leads the Product
Lifecycle Data Exploration and Visualization project. His research interests
include advanced visualization, information modeling, and sustainable
manufacturing.
Generating Domain Terminologies using Root- and
Rule-Based Terms¹
Jacob Collard¹, T. N. Bhat², Eswaran Subrahmanian³,⁴, Ram D. Sriram³,
John T. Elliot², Ursula R. Kattner², Carelyn E. Campbell², Ira Monarch⁵
¹Independent Consultant, Ithaca, New York
²Materials Measurement Laboratory, National Institute of Standards and Technology,
Gaithersburg, MD
³Information Technology Laboratory, National Institute of Standards and Technology,
Gaithersburg, MD
⁴Carnegie Mellon University, Pittsburgh, PA
⁵Independent Consultant, Pittsburgh, PA
Abstract
Motivated by the need for flexible, intuitive, reusable, and normalized
terminology for guiding search and building ontologies, we present a general
approach for generating sets of such terminologies from natural language
documents. The terms that this approach generates are root- and rule-based terms,
generated by a series of rules designed to be flexible, to evolve, and, perhaps most
important, to protect against ambiguity and standardize semantically similar but
syntactically distinct phrases to a normal form. This approach combines several
linguistic and computational methods that can be automated with the help of
training sets to quickly and consistently extract normalized terms. We discuss how
this can be extended as natural language technologies improve, and how the
strategy applies to common use-cases such as search, document entry and
archiving, and identifying, tracking, and predicting scientific and technological
trends.
1. Introduction
1.1 Terminologies and Semantic Technologies
SERVICES AND APPLICATIONS ON THE WORLD-WIDE WEB, as well as
standards defined by the World Wide Web Consortium (W3C), the primary
standards organization for the web, have been integrating semantic
technologies into the Internet since 2001 (Koivunen and Miller 2001).
These technologies have the goal of improving the interoperability of
1 Commercial products are identified in this article to adequately specify the material.
This does not imply recommendation or endorsement by the National Institute of
Standards and Technology, nor does it imply the materials identified are necessarily the
best available for the purpose.
Special thanks to Sarala Padi for assistance in compiling and presenting this document.
applications on the rapidly growing Internet and creating a comprehensive
network of data that goes beyond the unstructured documents that made up
previous generations of the web (Berners-Lee, Hendler, and Lassila 2001;
Feigenbaum et al. 2007; Swartz 2013). These semantic technologies can be
used to protect against ambiguity and reduce semantically similar but
syntactically distinct phrases to normalized forms. Semantically normalized
forms allow users to more easily interact with data and developers to reuse
data across applications. However, most of these technologies rely on data
that have been annotated with semantic information. Data that were not
designed for use on the semantic web most likely do not include this
information and are therefore more difficult to integrate. For example,
scientific research papers are typically text- and graphics-based documents
designed to be read and processed by humans. These documents do not
usually contain semantic markup, meaning that search engines may not be
able to take advantage of such advances in data technologies.
Another major issue in semantic computing is the representation of
domain-specific semantics. Different scientific and academic disciplines, as
well as other spheres of communication such as conversation, business
interactions, and literature, all have overlapping vocabulary. However, the
same words often have different meanings depending on the domain. In the
sciences, each field (and often each subfield) has its own terminology that
is not used in other disciplines, or that conflicts with the language of more
general-purpose communication. For example, in general use, the word
fluid typically includes liquids, but not gases. However, in physics, gas and
liquid are both hyponyms of fluid. Semantic technologies may assume that
two annotations with the same name have the same semantics, when this is
not necessarily the case.
Generally speaking, this issue stems from the problem of
coordination described by Clark and Wilkes-Gibbs (1986). Any participant
in a system of communication is typically missing some of the knowledge
held by other participants. Because of this knowledge gap, participants may
not understand one another unless they are using a shared knowledge system
and a shared vocabulary. Clark and Wilkes-Gibbs (1986) describe how
speakers establish a common ground, collaborating to ensure that
participants know one another’s strengths and limitations. Interactions
involving computers are also systems of communication, and must also
coordinate in order to ensure that all applications are communicating
properly. Establishing common ground across fields is supported by
standardizing terminology and having hyponymic and other semantic
relations structured for use by humans and machines.
Issues of coordination are relevant in many fields, particularly when
it comes to data re-use and interoperability. For example, The Minerals,
Metals, & Materials Society (TMS) describes many gaps and limitations in
current materials science standards; one of these gaps is an “insufficient
number of open data repositories,” referring to repositories containing data
that can be used by many applications, with the stipulation that data not only
be available, but also be re-usable (“Modeling Across Scales” 2015). This
is impossible without some means of coordination and standardization of
terminology. TMS recommends developing initiatives to aid in
coordination, for example by engaging “a multidisciplinary group of
researchers to define terminology and build bridges across disciplines.”
Multidisciplinary coordination is a necessary part of improving data
reusability, but support of such coordination is lacking.
Our goal in this paper is to describe a general system that is capable
of automatically creating standardized terminologies that will be useful for
developing domain ontologies and other data structures to fill this gap. A
domain ontology is a collection of concepts (represented as terms) and
relations between them that correspond to knowledge about a particular
family of topics. Our system will take into account potential issues in
terminology generation, including the disambiguation and normalization of
terms in a domain. In many cases, a single term can be expressed in many
different ways in natural language. For example, in mathematics the phrase
“without loss of generality” has a technical meaning; however, a researcher
might also write “without any loss of generality” or “without losing
generality.” All three variants refer to the same technical phrase and should
be treated together in a standard terminology. Terminologies also face
issues relating to polysemy, syntactic ambiguity, context sensitivity, and
noisy data.
The key component of our system is a representation of terminology
that takes advantage of the compositional nature of natural language
semantics by converting natural language phrases into consistently
structured terms. This representation overcomes issues of syntactic
variation by normalizing different syntactic structures based on their
compositional semantics. That is, we represent synonymous phrases in the
same way despite differences in surface realization. To help ensure
consistent terminology generation, our system uses a set of rules to restrict
and guide the formation of these normalized terms. Because this system is
based on a modular set of rules that combine smaller components of phrases
into standardized terms, we refer to it as a root- and rule-based method
for terminology generation.
Our root- and rule-based approach is motivated more by linguistic
than statistical models. This is necessary for the construction of rule-based
terms, which are dependent on consistent structures and a representation of
the underlying linguistic form. Our rule-based model ultimately relies on
the way that words come together syntactically in order to form phrases.
The meanings of these phrases are, in most cases, related to the meanings
of their component parts (i.e. the individual words). The way that words
compose to form more complex meanings is detailed in research such as
Montague (1988), though the underlying principles ultimately extend back
to Frege (1884). Through an understanding of syntax as modeled in
linguistics, it is possible to formalize the compositionality and therefore
normalize synonymous phrases, despite significant differences in form.
[Figure 1 diagram: Key Phrases; Key Phrase Extractor; Tethered Root Generator; Super Root Generator; Term Generator]
Figure 1. Terminology Generation
Within the context of our root- and rule-based system, a term is a
representation of a concept within a domain (and may cover a number of
words and/or phrases); a collection of terms describing the same domain
make up a terminology. Our system also defines roots, which are smaller
components which come together to make up terms — that is, a single term
is made up of one or more roots. A terminology is distinct from an ontology
in that an ontology additionally defines the relationships between concepts,
though the concepts in an ontology are typically represented by terms of
some sort. A rule is a codified process used to generate, restrict, or
normalize terms in our root- and rule-based approach. This paper will
describe the linguistic and theoretical motivations behind these rules, which
are introduced in Bhat et al. (2015) as specifically applied to materials
science. In Section 2, we describe the theory behind the syntactic
normalization that allows for the automatic construction of root- and rule-
based terms. This is followed by Section 3, where we describe how to use
features of natural language syntax in combination with additional rules to
create root- and rule-based terms. We then describe how terms can be
extracted from natural language texts in Section 4. One of the main features
of the root- and rule-based approach is that it is easily extensible and
adaptable to new and different use-cases, as we discuss in Section 5. Lastly,
we describe how our system ties in with the challenge described above of
creating terminologies that are robust to the complexities of natural
language and to the needs of users in Section 6.
1.2 Use-Cases and Architecture
In developing this strategy, we have considered four very general
use-cases, each of which corresponds with an interface that allows users and
administrators to interact with the terminology generation system. These
will be discussed in greater detail in Section 6. We also discuss ontology
generation as an extension of terminologies generated from the root- and
rule-based method.
• Document Entry: Users should be able to upload documents, have
terms extracted from these documents, make changes to the suggested
terms, and have the terms added to the terminology in the database.
• Document Retrieval: Users should be able to construct a query and
receive a list of documents matching the terms in the query.
• Curation: Curators (who may be dedicated administrators or user
volunteers in crowd-sourced systems) should be able to make changes
to the terminology.
• Rule Changes: Curators should be able to make changes to the way
the system generates terms, as technologies change. Changes also
reflect the way that people are using the terminology and systems that
have become de facto standards.
• Ontology Generation: Users should be able to use a set of terms as
the basis for a domain ontology, which extends a terminology by
providing additional semantic relationships between terms.
With these in mind, we have outlined a root- and rule-based terminology
system in Figure 1. This figure shows the general process which will be
explained in this paper. Through this process, a set of rules is used to
extract a series of key phrases from a corpus and convert them into root-
based terms (three types of roots are described in this paper: roots, tethered
roots, and super-roots; see Section 3).
2. Theory
Generating a terminology requires identifying salient concepts in a
corpus and constructing representations of those concepts. There are many
different approaches to identifying salient concepts, that is, to finding words
and/or phrases in the text that stand out. In many cases, the identification of
salient concepts produces a list of words and phrases taken directly from the
text. A terminology generation system then needs to convert words and
phrases into a format which enhances the potential for humans and
machines to use the terminology for various practical applications. As with
key phrase extraction, researchers and developers have used a variety of
methods to convert words and phrases into terms representative of key
concepts (Witschel 2005).
2.1 Previous Research
Key phrase selection is a major area of study in the field of
information extraction (Witschel 2005). Many methods of key phrase
extraction rely on two components: a unithood metric and a termhood
metric. A unithood metric determines the particular types of words and
phrases that qualify as key phrase candidates. Units found by a unithood
metric may or may not be relevant enough to qualify as key phrases, but can
be used to restrict the set of words that must be compared for relevance. For
example, a unithood metric may identify all noun phrases in a corpus, so
that only noun phrases are considered as potential terms. Not all of the noun
phrases selected will become terms, but only noun phrases will be extracted.
More complex unithood metrics are also possible. Frantzi, Ananiadou, and
Tsujii (1998), for example, consider the following unithood metrics in the
evaluation of their C-Value and NC-Value algorithms, which are algorithms
for key phrase extraction*:
* In this representation, Noun, Adj, and Prep are patterns matching parts of speech (noun,
adjective, and preposition, respectively). Parentheses group patterns together, and two
patterns separated by a pipe (|) produce a new pattern which matches either of its
components. The asterisk (*) produces a pattern that matches the previous expression
zero or more times. The plus sign (+) is similar, but matches the previous expression
• Noun+ Noun
• (Adj|Noun)+ Noun
• ((Adj|Noun)+ | ((Adj|Noun)* (Noun Prep)?) (Adj|Noun)*) Noun
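As a rough illustration, a filter of this kind can be applied to part-of-speech-tagged text by encoding each tag as a single character and running an ordinary regular expression over the resulting string. The one-letter tag codes, the helper name candidate_spans, and the toy tagged phrase below are assumptions made for this sketch; they are not taken from Frantzi, Ananiadou, and Tsujii (1998).

```python
import re

# One-letter codes for the coarse tags used in the filters (an assumption of this sketch).
TAG_CODES = {"Noun": "N", "Adj": "A", "Prep": "P"}

# The second filter, (Adj|Noun)+ Noun, written over the one-letter codes.
ADJ_NOUN_FILTER = re.compile(r"[AN]+N")

def candidate_spans(tagged_tokens, pattern=ADJ_NOUN_FILTER):
    """Return the word sequences whose tag sequence matches the pattern."""
    # Tokens with any other tag become "x" so they can never be part of a match.
    codes = "".join(TAG_CODES.get(tag, "x") for _, tag in tagged_tokens)
    words = [word for word, _ in tagged_tokens]
    return [" ".join(words[m.start():m.end()]) for m in pattern.finditer(codes)]

# Hand-tagged toy input; a real system would obtain tags from a POS tagger.
tagged = [("the", "Det"), ("specific", "Adj"), ("heat", "Noun"),
          ("capacity", "Noun"), ("of", "Prep"), ("water", "Noun")]
print(candidate_spans(tagged))
```

Because this filter requires at least two words, the single noun "water" is not returned as a candidate; only "specific heat capacity" matches.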
A typical automatic term recognition algorithm may then identify
which of the selected units are relevant using a termhood metric. A
termhood metric may be based on statistical or linguistic features; Frantzi,
Ananiadou, and Tsujii (1998) use word frequency to identify nested terms
(candidates which occur within other candidates) and the surrounding
context of a term. These are used together with a mathematical formula to
assign a score to each candidate. In this way, they are able to select for
particular types of terms. The features used by Frantzi, Ananiadou, and
Tsujii (1998) are by no means exhaustive; Proux et al. (1998) and
Rindflesch, Hunter, and Aronson (1999), for example, make use of
linguistic information such as part-of-speech tags to improve their termhood
metric.
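To make the idea of a frequency-based termhood score concrete, the following is a minimal sketch of the C-Value part of the published algorithm as we understand it: a candidate's frequency is discounted by the average frequency of the longer candidates that contain it, and the result is weighted by the base-2 logarithm of the candidate's length in words. The NC-Value context weighting is omitted, and the function names and frequency counts are invented for illustration.

```python
import math

def contains(longer, shorter):
    """True if the word list `shorter` occurs contiguously inside `longer`."""
    n = len(shorter)
    return any(longer[i:i + n] == shorter for i in range(len(longer) - n + 1))

def c_value(candidate, frequencies):
    """C-Value of one candidate, given a {candidate phrase: frequency} table."""
    words = candidate.split()
    # Longer candidates in which this candidate appears as a nested phrase.
    containers = [other for other in frequencies
                  if other != candidate and contains(other.split(), words)]
    adjusted = frequencies[candidate]
    if containers:
        # Discount occurrences that are explained by the longer candidates.
        adjusted -= sum(frequencies[c] for c in containers) / len(containers)
    return math.log2(len(words)) * adjusted

# Invented frequencies, for illustration only.
frequencies = {"heat capacity": 10, "specific heat capacity": 4, "molar heat capacity": 2}
for phrase in frequencies:
    print(phrase, round(c_value(phrase, frequencies), 2))
```

In this toy table the nested candidate "heat capacity" scores 1 × (10 − 3) = 7, since it occurs inside two longer candidates whose average frequency is 3.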
We do not present any new methods for automatic term recognition,
nor do we make any judgment as to the “best” contemporary method.
However, because the model of term generation and normalization that we
describe is dependent on automatic term recognition, we do discuss it
briefly. In theory, our system can be used with any automatic term
recognition algorithm, though multi-word terms, such as those recognized
by the C/NC-Value Algorithm (Frantzi, Ananiadou, and Tsujii 1998) are
optimal for the root- and rule-based method as hierarchical relationships can
be inferred from these complex terms.
Once terms have been selected, we generate normalized concepts
rather than using natural language phrases. Natural language phrases have
many disadvantages, including ambiguity and synonymy, as discussed
previously. One method of normalizing concepts is to automatically group
words into clusters, as in Liu et al. (2012). For example, the phrases
“monthly expense,” “personal insurance product,” “core product,”
“voluntary benefit,” and “personal insurance” may all be clustered to form
a single concept representation related to insurance. In this way, word
choice among different authors and contexts is normalized — words whose
appearance is positively correlated are grouped together. The disadvantage
one or more times, while the question mark (?) matches the previous expression zero or
one times.
of clusters is that they are difficult to label — other than appearing as the set
of words in a given cluster, they are not human-readable.
Other approaches to normalization may involve various other
statistical and natural-language processing techniques, such as Park, Byrd,
and Boguraev (2002), who use a combination of stop word removal,
lemmatization (normalization of different forms of a word, e.g., people and
person or colors and color), and abbreviation detection. There are various
components of a key phrase that may not be desirable in terminology
generation. Inflectional morphology (grammatical affixes such as -s and
-ing), which provides grammatical information within the context of a natural
language sentence, does not usually differentiate between technical terms.
A terminology should not usually extract both heat capacity and heat
capacities — these are probably both instances of the same term. Certain
functional words, including articles such as a, an, and the may also be
unhelpful. These issues can be dealt with through lemmatization and stop
word removal, both of which are well-known problems with many proposed
solutions in the field of natural language processing (Park, Byrd, and
Boguraev 2002). However, even assuming that we can lemmatize phrases
and remove stop words, there may still be undesirable redundancies in an
automatically generated terminology, as there are many ways to express the
same thing by using different syntactic structures.
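A small sketch may help show both what lemmatization and stop word removal achieve and what they leave unresolved. The stop word list and the lemma table here are toy stand-ins (assumptions of this example); a real system would use an existing lemmatizer and a curated stop list.

```python
# Toy resources, assumed for this example only.
STOP_WORDS = {"a", "an", "the", "of"}
LEMMAS = {"capacities": "capacity", "leaves": "leaf",
          "colors": "color", "people": "person"}

def normalize_tokens(phrase):
    """Lowercase, drop stop words, and map each remaining word to its lemma."""
    tokens = phrase.lower().split()
    return [LEMMAS.get(token, token) for token in tokens if token not in STOP_WORDS]

# Inflectional variants collapse to the same token list.
print(normalize_tokens("heat capacity"))
print(normalize_tokens("the heat capacities"))

# Syntactic variants do not: the remaining words come out in different orders.
print(normalize_tokens("the red tree leaves"))
print(normalize_tokens("the red leaves of the tree"))
```

The last two calls return the same words in different orders, which is the residual redundancy that syntactic normalization is meant to address.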
Consider, for example, the syntactically distinct phrases the red tree
leaves and the red leaves of the tree. In many contexts, these phrases have
approximately the same meaning: they both refer to leaves which belong
to or grow on a tree and are red in color. When dealing with phrases such as
these, a terminology extraction algorithm should be able to reduce
redundancy by converting both of these terms into a normalized syntactic
pattern. A great deal of theoretical and applied linguistic and computational
research has been done regarding the determination of a sentence or
phrase’s syntactic structure. By applying dependency parsing to
terminology extraction, it becomes possible to normalize the syntactic
structure of phrases. While many other terminology generation algorithms
work with phrases, one of the main benefits of our approach is the ability to
normalize surface-level differences in the structure of these phrases.
2.2 Dependency Grammar
As mentioned above, we use dependency parsing to normalize the
syntactic structure of phrases. A dependency parser is a tool for syntactic
analysis that produces a dependency tree, which is defined in terms of the
relationship between a phrasal head and the rest of the phrase (Tesnière
1959). The phrasal head carries the syntactic category of the phrase — e.g.,
a noun phrase is headed by a noun, a verb phrase by a verb, etc.
Furthermore, a dependency represents some semantic information, as the
head of a phrase is typically specified by the rest of the phrase. For example,
the phrase the tree’s red leaves (a noun phrase) is headed by the word
leaves. The word leaves on its own is quite general: it could refer to the
leaves of any plant whatsoever. However, because of the rest of the phrase,
we know that the leaves in question are red and that they belong to or come
from a tree.
Dependency syntax can be represented using a tree structure such as
that in Figure 2. Modern parsing technologies, such as the Stanford Parser
(Manning et al. 2014) and MaltParser (Hall 2006) are capable of
automatically generating dependency trees from text. With the help of these
tools we can take advantage of the semantic information provided by
syntactic structures and normalize the syntax of phrases to generate terms.
          leaves
         /      \
     tree’s     red
       /
    the
Figure 2: Dependency representation of the tree’s red leaves
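For readers who prefer code to diagrams, the tree in Figure 2 can be written down as a simple recursive data structure. The class name Node and the render helper are hypothetical names chosen for this sketch, and the tree is entered by hand rather than produced by a parser.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A word together with the words that depend on it."""
    word: str
    dependents: list = field(default_factory=list)

# The dependency tree of "the tree's red leaves", entered by hand.
tree = Node("leaves", [Node("tree's", [Node("the")]), Node("red")])

def render(node, depth=0):
    """Return the tree as indented lines, one word per line."""
    lines = ["  " * depth + node.word]
    for dependent in node.dependents:
        lines.extend(render(dependent, depth + 1))
    return lines

print("\n".join(render(tree)))
```

The head of the phrase sits at the top level, and each level of indentation marks a word that further specifies the word above it.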
3. Representing Root- and Rule-Based Terms
3.1 Guidelines for Terminology Representation
Because the goal of providing a domain terminology is to produce a
list of formal concepts in the domain, it is necessary that generated terms be
both unambiguous and relevant to the domain. If terms are ambiguous, then
the terminology will be inaccurate. If terms are irrelevant, even if their
semantics are correctly represented, the terminology will not be of any
practical use. We have defined the following criteria to describe useful
terms for domain terminologies, based on rules 1 to 10 in Bhat et al. (2015)
(reproduced in Section 7). These criteria should generally apply to any
terminology generation schema.
• Terms should be human-readable and machine-friendly. All terms
should be based on natural language.
• The same term representation should always identify the same
concept within a terminology; similarly, two terms with different
representations should represent different concepts (see Bhat et al.
(2015); rules 5, 7, and 8).
• The meaning and form of a term should be predictable from smaller
parts. This predictability must be applicable to both humans and
machines (rules 8, 9, and 10).
• The form of a term should be predictable; that is, given a particular
meaning, it should be possible to derive a compositional name for a
term with that meaning (rules 2, 3, and 6).
• Given a term’s compositionality, both humans and machines should
be able to identify semantic relationships.
• Terms should be intuitive enough that both humans and machines can
identify existing semantic relationships between them.
• Terms should be mutable enough that new terms with related
semantics can be generated (rules 3, 4, and 8).
• Only terms representing discriminating concepts should be generated
(rules 2 and 8).
• Terms representing highly specific instances and individuals should
generally not be generated; terms should be reusable in many
use-cases (rule 9).
Generally speaking, terms will represent a hierarchy, with some
concepts being more specific than others. Thus, there is no concrete
definition for “too general” or “too specific” as applied to a terminology;
we are simply looking for terms that are useful at the level of specification
provided by the domain, keeping in mind that terms should be usable for
data representation, sharing, and analysis. The domain terminology should
be representative of the input corpus.
3.3 Normalized Dependency Trees
The term representation that we have developed takes advantage of
common features of natural language to create a human-readable schema
that maintains the stipulations given above (though building more complex
structures, such as ontologies, is also dependent on other components, such
as the term selection strategies discussed in Section 4 and Section 8).
The theoretical framework for our term representation is the
syntactic structure of phrases and dependency grammar, as discussed in
subsection 2.2. Though dependency trees are not easily human-readable,
they can be converted into human-readable forms, due to one important
feature: given a dependency parse, it is not the order of the nodes which
determines the meaning of the phrase — all isomorphic trees have the same
meaning. The hierarchical structure of a dependency tree represents all of
the semantic information that our strategy relies on; the linear order of the
daughter nodes represents only the surface ordering of morphemes and does
not on its own contain any semantic information. That is, the trees shown in
Figure 3 and Figure 4 are semantically very similar, despite differences in
node order. Automatic dependency parsers, such as MaltParser (Hall 2006),
will produce similar trees, though we have filtered function words from
these in order to improve normalization.
      leaf
     /    \
  tree    red
Figure 3: Collapsed representation of the leaves of the tree that are red
      leaf
     /    \
   red    tree
Figure 4: Collapsed representation of the red leaves of the tree
Because of this fact we can re-order the nodes in a set of trees such
that all trees follow the same pattern. In this way trees that are different, but
represent the same concept, will generally be normalized to the same
structure. For example, if both Figure 3 and Figure 4 were changed such
that all nodes were exclusively left-branching (i.e., in a form such that the
daughters of a node all appear to the left of the node), they would be
identical. This creates a normalized dependency tree that we use to create
normalized representations of terminology.
Creating these normalized representations from key phrases is a
three-step process: first, a dependency parser creates a dependency tree for
each input phrase. Second, a filter removes all function words and other stop
words such as prepositions from the dependency trees. Third, each
dependency tree is made entirely left-branching.
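The second and third steps can be sketched as follows, starting from dependency trees that are entered by hand in place of parser output. Several choices here are assumptions of the sketch rather than part of the method as described: trees are (word, dependents) tuples, function words are assumed to be leaves of the parse, lemmatization is folded in through a toy table, and a fixed sibling order is obtained by sorting alphabetically as a stand-in for rearranging the tree into a single branching pattern.

```python
# Toy resources, assumed for this example only.
FUNCTION_WORDS = {"a", "an", "the", "of", "that", "are"}
LEMMAS = {"leaves": "leaf"}

def normalize(tree):
    """Filter function words, lemmatize, and put siblings in a fixed order."""
    word, dependents = tree
    kept = [normalize(dep) for dep in dependents if dep[0] not in FUNCTION_WORDS]
    # Sorting siblings is this sketch's stand-in for a fixed branching order.
    return (LEMMAS.get(word, word), sorted(kept))

# "the leaves of the tree that are red", parsed by hand.
relative_clause = ("leaves", [("the", []),
                              ("tree", [("of", []), ("the", [])]),
                              ("red", [("that", []), ("are", [])])])

# "the red leaves of the tree", parsed by hand.
prenominal = ("leaves", [("the", []),
                         ("red", []),
                         ("tree", [("of", []), ("the", [])])])

print(normalize(relative_clause))
print(normalize(prenominal))
```

Both phrases reduce to the same structure, a head leaf with the dependents red and tree, even though their surface word order differs.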
Though these normalized trees are good representations of phrasal
semantics, they are not easily understood by human users of a terminology.
Understanding these trees requires an understanding of dependency
grammar, a potentially non-intuitive concept. Instead, our strategy converts
these normalized trees into human-readable forms using a number of
concrete rules. These new representations are linear and can be stored as
simple strings of characters.
For example, the trees in Figure 3 and Figure 4 can be linearized by
first converting the trees into normalized form (Figure 5). Once we have
normalized the syntax, we can convert the structure into a linear format,
such as RED-TREE_LEAF. This format contains the same information as the
tree structure and is easily interpreted by English speakers and by
computers. For English speakers the linear form corresponds to a standard
English phrase with the same meaning (the red tree leaf). For computers the
underscore (‘_’) indicates that the two final roots (tree and leaf) compose
first, followed by red, which is equivalent to the structural information
shown in the tree. We have not yet discussed exactly how this linear format
is reached from the dependency tree; this, including the semantics of the
hyphen and underscore delimiters, will be explained in the following
section.
      leaf
     /    \
   red    tree
Figure 5: Normalized tree for RED-TREE_LEAF
3.2.2 Roots and Terms
Just as the above structural representations depend on the principle
of compositionality (Frege 1884) and the formulations of compositional
semantics (Montague 1988), the linear term representation that we have
developed facilitates breaking terms apart into smaller components. Unlike
in natural language, we primarily use compounding, combining individual
“words” into larger terms. The individual meaningful components of a term
are called roots, and cannot usually be broken down into further meaningful
parts. A root should correspond to a single meaningful word such as tree,
electron, or computational. Roots can be combined in various ways to create
structured terms, which are semantically complex structures whose
meaning can be easily determined from their component parts. This is based
on rules 1, 2, and 3, as well as the specialized terminology in Bhat et al.
(2015).
Roots can combine in different ways; each method of combination
is represented textually by a unique delimiter, including the underscore
(‘_’), the hyphen (‘-’), and the colon (‘:’). This delimiter notation is
extensible and replaceable (completely different sets of delimiters may be
used); new delimiters can be added to express new relationships between
roots. The hyphen is a general delimiter used for combining roots into terms.
Depending on the use-case of the terminology, the hyphen may have a
slightly different meaning, but it should usually be used to add specificity
to a root through the addition of a second root. For example, the term
TREE-LEAF is composed of the roots TREE and LEAF. The hyphen indicates that the
root leaf (a very general term) is made more specific through the addition
of the root TREE, which indicates that the term as a whole represents a
specific type of LEAF — namely, the leaf of a tree, rather than the leaf of a
bush or other plant. Forms such as this can be easily derived from syntactic
structure: both the construction of terms from roots and the structure of
syntactic dependency indicate the modifier-head relationship between two
roots. In dependency structures, the dependents of a head are its modifiers,
just as, in a term, a root (the head) is modified by the preceding root.
The interpretation of terms with three or more roots could be
ambiguous. However, we impose a left-branching dependency syntax on all
terms, meaning that the roots in a term compose from left to right. For
example, the term OAK-TREE-LEAF refers to the leaf of an OAK-TREE, and
OAK-TREE refers to a tree of type oak. The first semantic operation is the
composition of OAK and TREE. This is followed by the composition of OAK-
TREE (as a single concept) and LEAF.
Combining roots with more complex structures requires additional
delimiters. For example, as described above, the term RED-TREE-LEAF refers
to the leaf of a red tree, not to the red leaf of a tree — that is, in this term, the
tree is red but the leaf is not. The English phrase “red tree leaf” is ambiguous
in a way that the term RED-TREE_LEAF is not. This is why, if we are trying
to create a term with the interpretation the red leaf of a tree (the dependency
structure in Figure 5), it cannot be represented using hyphens alone. The
second delimiter, the underscore (‘_’), has higher precedence in the
compositional order of operations: composition of roots delimited by
underscores occurs before the composition of roots delimited by hyphens.
The latter has lower precedence in decomposition. Two roots delimited by
an underscore form a “super-root,” which allows for terms that cannot be
expressed solely by hyphens. For example, “the red leaf of a tree” (Figure
5) can be represented as the term RED-TREE_LEAF. TREE_LEAF is a
super-root and thus combines first: it represents the leaf of a tree. The
super-root then combines with RED, specifying that the TREE_LEAF is of the color
red.
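The precedence of the two delimiters can be illustrated with a short parser that recovers the order of composition from a term string. The function names are hypothetical, and the sketch assumes that roots inside a super-root also compose from left to right when there are more than two of them, which the description above does not address.

```python
def fold_left(parts):
    """Compose parts from left to right as nested (modifier, head) pairs."""
    result = parts[0]
    for part in parts[1:]:
        result = (result, part)
    return result

def parse_term(term):
    """Return the composition order of a hyphen/underscore term."""
    # Underscores bind more tightly, so each hyphen-separated piece is
    # resolved into a super-root (or left as a plain root) first.
    pieces = [fold_left(piece.split("_")) for piece in term.split("-")]
    # The resolved pieces then compose from left to right.
    return fold_left(pieces)

print(parse_term("OAK-TREE-LEAF"))
print(parse_term("RED-TREE-LEAF"))
print(parse_term("RED-TREE_LEAF"))
```

In the second term RED attaches to TREE before LEAF is added, whereas in the third TREE and LEAF form a unit that RED then modifies, matching the two readings distinguished in the text.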
We define two additional methods for combining roots, though more
methods can be added based on use-case (see Figure 5). The first is to
combine the roots without any delimiter at all, referred to as creating a
“tethered root.” For example, if TREE and LEAF were to be combined into a
tethered root, they would become TREELEAF. The purpose of tethered roots
is to create roots that are composed of multiple words in English, but which
have meaning only as a whole phrase. If the components of a tethered root
are represented as roots in a term, the meaning of the term will not follow
from its component parts. Set phrases such as “gray area” (referring to a
situation that does not easily fit into preexisting categories) are strong
candidates for tethered roots. More generally, tethered roots are useful when
the component parts are not usable the same way in other terms. For
example, if GRAY-AREA is treated as a term, there should be other terms of
the form GRAY-X where “gray” has the same meaning as in GRAY-AREA.
However, because this is not possible, it is preferable to create a tethered
root: GRAYAREA.
The last delimiter we describe in this paper is the colon (‘:’). Two or
more terms can be combined into compounded terms with this delimiter.
A compounded term represents a high-level semantic cluster, though the
usage of these terms is dependent on the use-case. While most terms
represent concepts, compounded terms can also represent relationships
between those concepts. For example, because APPLE-TREE and
RASPBERRY-PLANT both refer to plants that bear fruit, the compounded term
APPLE-TREE:RASPBERRY-PLANT could represent this fact about the two
component terms. The exact type of relationship is not specified by the
compounded term, which describes only a very general semantic connection
between its components.
All of these combinations can be generated automatically from a list
of key phrases using dependency parsing and a training set. Roots are
usually equivalent to nodes in a dependency structure. Because of this, they
can be combined into super-roots and into terms by examining an
automatically generated dependency tree, removing unimportant words,
and performing automatic lemmatization. Creating tethered roots and
compounded terms cannot be done with dependency trees alone, and
requires the use of a training set. Tethered roots are formed when splitting
the term does not provide any reusable information. This can be measured
using statistics such as term frequency-inverse document frequency (a
measure of a word’s importance in a document relative to its overall
frequency) (Wu et al. 2008). Compounded terms can be identified using
measurements of co-occurrence frequency, which identify semantic
relationships between terms (Kostoff 1993). Because terms are
unambiguous, and different relationships between roots are represented by
different delimiters, a machine can also easily break down a term into its
component parts, just as it can build up a term based on the relationships
between the components (Table 1).
Table 1. Summary of term syntax

Root Type (delimiter)    Description                  Example
Term (-)                 Composite Concept            OAK-TREE
Root                     Single Concept               TREE
Super-Root (_)           High-precedence Composite    TREE_LEAF
Tethered Root            Multi-word Single Concept    GRAYAREA
Compounded Term (:)      Two related terms            APPLE-TREE:RASPBERRY-PLANT
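The decomposition described above can be sketched in a few lines. This is a minimal illustration rather than the reference implementation: the function names parse_term and build_term are hypothetical, and the sketch assumes the delimiters summarized in Table 1 (':' for compounded terms, '-' for composition, '_' for super-roots), with tethered roots left intact because they contain no delimiter.

```python
from typing import List


def parse_term(term: str) -> List[List[List[str]]]:
    """Split a term into compounded parts (':'), then '-' segments, then '_' roots."""
    return [
        [segment.split("_") for segment in part.split("-")]
        for part in term.split(":")
    ]


def build_term(parsed: List[List[List[str]]]) -> str:
    """Inverse of parse_term: rejoin the roots with the same delimiters."""
    return ":".join(
        "-".join("_".join(roots) for roots in part) for part in parsed
    )


compound = parse_term("APPLE-TREE:RASPBERRY-PLANT")
super_root = parse_term("RED-TREE_LEAF")
```

Because each delimiter has exactly one role, parsing and rebuilding are inverses of one another, which is the property the text relies on when it says a machine can both break down and build up a term.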
4. Key Phrase Extraction for Root- and Rule-Based Terms
In Sections 2 and 3 we have discussed how to generate structured
terms using a root- and rule-based approach taking advantage of syntactic,
semantic, and statistical cues. Though structured terms are useful
representations of concepts in a terminology, and though natural language
processing and other statistical tools can convert key natural language
phrases into structured terms, we have not yet discussed how these key
phrases are selected. It is possible, of course, to manually select key phrases
to be used in a terminology. In some cases, this is unavoidable, as there will
always be disagreement as to what constitutes an important term within a
domain, but it is helpful if at least some of the work can be done with an
automated system. The automated system may be helpful in providing an
empirical basis for coming to agreement.
The study of automatic terminology extraction is a major area in the
field of information extraction (Witschel 2005). Most methods of
terminology extraction rely on two components: a unithood metric and a
termhood metric. A unithood metric determines the particular types of
words and phrases that make potential candidates for terms. A unit is not
necessarily the final representation of a term, nor are all units relevant
enough to be treated as terms. For example, a unithood metric might
consider all noun phrases (such as “the red leaf of a tree”) in a corpus to be
valid term candidates. The task of a termhood metric is to determine which
of the candidate terms are important enough within a document to be a part
of the terminology. Together, a unithood metric and a termhood metric can
extract all of the salient words and/or phrases from a document.
Most terminology extraction methods combine statistical and formal
methods. Unithood metrics are often based partially on linguistic features
such as part of speech. For example, it is uncommon to include isolated
prepositional phrases in a unithood metric (though prepositional phrases
that are included in other phrasal categories may be included). Termhood
metrics usually analyze the frequency of a term in a document with respect
to its frequency in a collection of documents in order to determine the extent
to which the term represents the content of the document. However,
termhood metrics can also take into account linguistic features; for example,
nouns with Greek or Latin endings such as -itis or -scopy may be more likely
to be technical terms in certain domains (Witschel 2005).
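As a rough illustration of a frequency-based termhood metric, the sketch below scores candidate phrases with one common variant of term frequency-inverse document frequency. It assumes that a unithood step has already produced per-document counts of candidate phrases; the function name tf_idf and the sample counts are hypothetical and are not taken from any system described here.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(candidate_counts: List[Counter]) -> List[Dict[str, float]]:
    """Score each candidate phrase in each document by tf * log(N / df)."""
    n_docs = len(candidate_counts)
    doc_freq: Counter = Counter()
    for counts in candidate_counts:
        doc_freq.update(counts.keys())  # one count per document containing the phrase
    scores = []
    for counts in candidate_counts:
        total = sum(counts.values())
        scores.append({
            phrase: (count / total) * math.log(n_docs / doc_freq[phrase])
            for phrase, count in counts.items()
        })
    return scores


# Hypothetical candidate counts for three documents.
docs = [
    Counter({"elastic constant": 3, "result": 1}),
    Counter({"result": 2, "solar cell": 2}),
    Counter({"result": 1}),
]
scores = tf_idf(docs)
```

A phrase that appears in every document, such as "result" here, receives a score of zero, while a phrase concentrated in one document scores highly for that document.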
The proposed methods of syntactic analysis described in Section 2
can take as input a series of phrases extracted from a document or corpus
and convert them into structured terms. However, this algorithm is sensitive
to the terminology extraction techniques used, as it is dependent on the
interface between syntax and semantics. If the input phrases are too short to
provide semantic clarity, the output terminology will be too general. If the
input phrases are too long (with respect to the number of words), the output
terminology will be too specific. Many terminology extraction algorithms
only extract single morphemes or words — that is, they output terms such as
“solar” or “photovoltaic” (Witschel 2005). However, our proposed system
prefers units with two to five content words, as longer or shorter terms will
tend to be either too general or too specific for most use-cases. Longer terms
are more specific, and will often introduce nuances that are not necessary in
domain terminologies.
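The preference for units of two to five content words can be expressed as a simple filter over candidate phrases. The sketch below is illustrative only: the stopword list is a small hypothetical stand-in for a proper function-word list, and has_preferred_length is not a name used by the proposed system.

```python
from typing import List

# Hypothetical, deliberately small list of function words.
STOPWORDS = frozenset(
    {"a", "an", "the", "of", "in", "on", "at", "to", "for", "with", "and", "or"}
)


def content_words(phrase: str) -> List[str]:
    """Return the words of a phrase that are not function words."""
    return [word for word in phrase.lower().split() if word not in STOPWORDS]


def has_preferred_length(phrase: str, low: int = 2, high: int = 5) -> bool:
    """True if the phrase contains between `low` and `high` content words."""
    return low <= len(content_words(phrase)) <= high


candidates = [
    "solar",
    "the red leaf of a tree",
    "temperature variations of the elastic constants",
]
kept = [phrase for phrase in candidates if has_preferred_length(phrase)]
```

The bounds are parameters so that a use case needing more general or more specific terms can adjust them.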
Bearing these restrictions in mind, there are still terminology
extraction algorithms that cater to the needs of structured terminologies.
Methods such as the NC-Value Algorithm (Frantzi, Ananiadou, and Tsujii
1998) are designed for extracting multi-word terms and can
easily be extended to favor two- to five-word phrases in order to generate
the most effective structured terms.
Though our proposed system is sensitive to terminology extraction,
the exact algorithm used is an implementation detail that can be changed
easily, as discussed in Section 5. Different use cases may choose different
terminology extraction algorithms, depending on their needs. The root- and
rule-based approach that we describe is not specific to any terminology
extraction algorithm, and the exact method can be customized according to
the use case.
The root- and rule-based approach we propose also provides a
significant advantage for terminology extraction. Because root-based terms
can easily be broken down into their component pieces, it is possible to
compare two terms and find similarities between them. Because of this, it
is possible to use previously generated normalized terms as hints for term
selection. For example, given that the term RED-TREE_LEAF is salient in a
corpus, the terms RED-TREE_BRANCH, GREEN-TREE_LEAF, and
RED-BUSH_LEAF are probably salient as well, as they share much of the same
information.
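One simple way to use existing terms as hints is to measure how many roots a candidate shares with a term already known to be salient. The sketch below uses Jaccard overlap of root sets; this particular measure, the threshold of 0.5, and the function names are assumptions made for illustration, and the sketch ignores term structure entirely.

```python
import re
from typing import List, Set


def roots(term: str) -> Set[str]:
    """Split a structured term into its set of roots, ignoring structure."""
    return set(re.split(r"[-_:]", term))


def root_overlap(a: str, b: str) -> float:
    """Jaccard overlap between the root sets of two terms."""
    ra, rb = roots(a), roots(b)
    return len(ra & rb) / len(ra | rb)


def similar_candidates(known: str, candidates: List[str],
                       threshold: float = 0.5) -> List[str]:
    """Candidates sharing enough roots with a known salient term."""
    return [c for c in candidates if root_overlap(known, c) >= threshold]


hints = similar_candidates(
    "RED-TREE_LEAF",
    ["RED-TREE_BRANCH", "GREEN-TREE_LEAF", "RED-BUSH_LEAF",
     "ULTRASONIC-PULSE-TECHNIQUE"],
)
```

A fuller treatment would likely weight the head root more heavily than modifiers, since two terms with the same head refer to the same kind of thing.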
5. Extensibility
The previous sections describe the various methods that go into
terminology generation. The major components are salient phrase
extraction (Section 4) and converting key phrases to structured terms
(Section 2 and Section 3). However, using this model in a complete system
is more complex.
One of the primary benefits of this root- and rule-based approach is
the compositional form of terms. Based on this approach, it is possible to
build an extensible and modular system that can be adjusted to suit different
needs. In this section we describe how the system as a whole can be
configured through different modules and extensions, and how this is
enabled by the rule-based model.
Figure 1 shows the various processes involved in generating terms
from a corpus of documents. These processes interact with three different
types of data: the corpus, the set of key phrases, and the terminology (stored
in the database). The corpus may be either a large set of documents used to
initialize the terminology, or a smaller set of documents introduced through
a user interface, as discussed in Section 6. The set of key phrases consists of the
salient phrases extracted from the corpus; key phrases are natural language
phrases that have not yet been processed by the structuring methods
described in Sections 2 and 3. The terminology consists of any pre-
generated terms, which can be used as a training set. During the term
generation process, new terms are added to the existing terminology.
Working with these data requires many different tools and
sub-processes: key phrase extraction, tethered root generation, super root
generation, lemmatization, and term generation. A key feature of the design
is that these tools are not necessarily co-dependent, and can thus easily be
substituted depending on the needs of users or on advancements in the
technologies that constitute each component.
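The substitutability of components can be illustrated with a small pipeline whose stages are plain callables. This is a sketch under simplifying assumptions, not the architecture of Figure 1: the class TermPipeline and the three toy components are hypothetical, and tethered root and super root generation are omitted for brevity.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class TermPipeline:
    """Each stage is an independent callable that can be swapped out."""
    extract_key_phrases: Callable[[str], List[str]]
    lemmatize: Callable[[str], str]
    generate_term: Callable[[List[str]], str]

    def run(self, document: str) -> List[str]:
        terms = []
        for phrase in self.extract_key_phrases(document):
            roots = [self.lemmatize(word) for word in phrase.split()]
            terms.append(self.generate_term(roots))
        return terms


# Toy stand-ins for real components.
def split_on_semicolons(document: str) -> List[str]:
    return [p.strip() for p in document.split(";") if p.strip()]


def strip_plural(word: str) -> str:
    return word[:-1] if word.endswith("s") else word


def join_roots(roots: List[str]) -> str:
    return "-".join(root.upper() for root in roots)


pipeline = TermPipeline(split_on_semicolons, strip_plural, join_roots)
# Swap only the lemmatizer; the other stages are untouched.
no_lemmatizer = replace(pipeline, lemmatize=lambda word: word)
```

Replacing one stage leaves the others unchanged, which is the sense in which the tools are not co-dependent.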
5.1 Key Phrase Extractor
The key phrase extractor has the task of extracting salient phrases
from the corpus. As discussed in Section 4, there are many different
algorithms that can handle this task. As such, it is possible to entirely replace
the key phrase extractor that is used in a root- and rule-based terminology
generation system.
The key phrase extractor may also take advantage of any terms
already in the term database by treating them as a training set. Different
terminology extraction algorithms may use this training data in different
ways. For example, some applications may only wish to include very close
matches with preexisting terms, while others may choose to be more liberal
with key phrase extraction. This allows different use-cases to use the
preexisting terms as appropriate.
5.2 Tethered Root Generator
Tethered roots (see Section 3) may be generated based on two major
components of the terminology generation system: the set of key phrases,
and the set of preexisting terms. A tethered root generator may use statistical
models to determine the information content of a given root relative to the
set of all key phrases, or it may use other tethered roots in preexisting terms
as cues to generate tethered roots from the current corpus. Again, this
component can be customized according to the use-case. One possibility is
using Shannon entropy (Shannon 1948) to identify sequences that add less
information to a dataset when split up into multiple roots than they would if
a tethered root were used instead. For example, if OAK-TREE-LEAF provides
less information than OAKTREE-LEAF, then OAKTREE-LEAF could be used
instead. Ideally, a tethered root generator will consider not only how much
information is contained in each variant, but also whether the information
is misleading or inconsistent.
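One possible way to operationalize the entropy criterion is to ask how predictable the next root is given the current one. The sketch below computes the Shannon entropy of the roots that immediately follow a given root in a list of '-'-delimited terms; an entropy of zero means the root is always followed by the same neighbor, so splitting the pair adds no reusable information. This specific formulation, the function name successor_entropy, and the sample terms are assumptions for illustration and are not the method prescribed in the text.

```python
import math
from collections import Counter
from typing import List


def successor_entropy(terms: List[str], root: str) -> float:
    """Shannon entropy (in bits) of the root that immediately follows `root`."""
    followers: Counter = Counter()
    for term in terms:
        parts = term.split("-")
        for left, right in zip(parts, parts[1:]):
            if left == root:
                followers[right] += 1
    total = sum(followers.values())
    if total == 0:
        return 0.0
    return sum(
        (count / total) * math.log2(total / count)
        for count in followers.values()
    )


# Hypothetical terminology with roots still split.
terms = [
    "GRAY-AREA", "LEGAL-GRAY-AREA", "GRAY-AREA-POLICY",
    "OAK-TREE", "OAK-LEAF", "OAK-TREE-LEAF",
]
gray = successor_entropy(terms, "GRAY")  # always followed by AREA
oak = successor_entropy(terms, "OAK")    # followed by TREE or LEAF
```

Under this reading, GRAY-AREA is a candidate for the tethered root GRAYAREA, while OAK remains a separate root because it combines with more than one neighbor. A practical generator would also need a minimum frequency and a check in the other direction (what precedes AREA).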
5.3 Super Root Generator
Super root generation requires much of the same information as
tethered root generation, except that roots are more likely to make sense
when considered separately. Super root generation may also take advantage
of a dependency parser in order to determine super roots based on syntactic
structure.
5.4 Lemmatizer
In order to avoid creating unnecessary terms, roots are lemmatized
to avoid codifying differences in grammatical form. There are many
different ways that words can be lemmatized, and lemmatization is a non-
trivial task in computational linguistics (Sharma 2010). One common
method is to use lexical databases such as WordNet (Fellbaum 1998), as in
our forthcoming reference implementation of root- and rule-based
terminology generation.
5.5 Term Generator
A term generator combines the roots that make up each key phrase
into a structured term based on the results of a dependency parser and the
methods described in Section 2 and Section 3. The rules in the rule-based
system we describe are not static and can be changed by users,
administrators, or developers when needed to improve the system’s
performance at its given task.
Altogether, these tools and sub-processes come together to form a
model of terminology generation that is customizable, takes advantage of
both linguistic and statistical facts, and is at the same time both machine-
and user-accessible.
5.6 Example
An example of the entire terminology generation process is shown
below, to illustrate how these components come together. This example
begins with the following document, taken from Overton Jr and Gaffney
(1955) (https://materialsdata.nist.gov/dspace/xmlui/handle/11256/79),
from which terms will be extracted.
The ultrasonic pulse technique has been used in conjunction with a
specially devised cryogenic technique to measure the velocities of
10-Mc/sec acoustic waves in copper single crystals in the range from
4.2K to 300K. The values and the temperature variations of the
elastic constants have been determined. The room temperature
elastic constants were found to agree well with those of other
experimental works. Fuchs’ theoretical c44 at 0 K is 10 percent
larger than our observed value but his theoretical c11, c12, K and
(c11-c12) agree well with the observations. The isotropy,
(c11-c12)/2c44, was observed to remain practically constant from 4.2K to
180K, then to diminish gradually at higher temperatures. Some
general features of the temperature variations of elastic constants are
discussed.
A key phrase extractor then determines the most salient phrases in the
corpus and extracts them. Key phrases are enclosed in square brackets.
The [ultrasonic pulse technique] has been used in conjunction with
a specially devised cryogenic technique to measure the velocities of
10-Mc/sec acoustic waves in [copper single crystals] in the range
from 4.2K to 300K. The values and the [temperature variations of
the elastic constants] have been determined. The [room temperature
elastic constants] were found to agree well with those of other
experimental works. Fuchs’ theoretical c44 at 0 K is 10 percent
larger than our observed value but his theoretical c11, c12, K and
(c11-c12) agree well with the observations. The isotropy,
(c11-c12)/2c44, was observed to remain practically constant from 4.2K to
180K, then to diminish gradually at higher temperatures. Some
general features of the temperature variations of elastic constants are
discussed.
Once these key phrases have been extracted, they need to be
converted into normalized root- and rule-based terms. This involves
parsing, lemmatizing, and structuring the phrases according to the methods
discussed in Sections 2 and 3.
For example, the phrase temperature variations of the elastic
constants should be parsed and converted into the following dependency
tree (Figure 6):
Figure 6: Dependency representation of temperature variations of the
elastic constants
The words are then lemmatized and the syntactic structure
normalized such that the tree is entirely left-branching. At this stage,
unimportant function words such as “of” are also removed. This results in
Figure 7.
Figure 7: Normalized representation of temperature variations of the
elastic constants
Based on this structure, the system should generate the term
ELASTIC-CONSTANT-TEMPERATURE_VARIATION. Because variation has two
branches in Figure 6, one of them must be used to generate a super-root in
order to preserve unambiguity. This yields TEMPERATURE_VARIATION,
which can then compose normally to create the complete term
ELASTIC-CONSTANT-TEMPERATURE_VARIATION. The other key phrases can be put
through this same process, yielding ULTRASONIC-PULSE-TECHNIQUE,
COPPER-SINGLE_CRYSTAL, ELASTIC-CONSTANT-TEMPERATURE_VARIATION,
and ROOM-TEMPERATURE-ELASTIC_CONSTANT, which are inserted into the
database as valid terms. Some discrepancies may result from this strategy;
for example, the above terms include both the structures ELASTIC-CONSTANT
and ELASTIC_CONSTANT. Such ambiguities are resolved through the use of
a training set or manual curation. For example, if ELASTIC_CONSTANT is
found in the training set, the system can resolve this conflict in favor of that form.
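The conversion from a dependency tree to a structured term can be sketched as a short recursive function. Everything in this sketch is a simplifying assumption: the tree is written by hand rather than produced by a parser, lemmatize is a toy plural-stripper rather than a real lemmatizer, and the rule that the first-listed modifier of a two-branch head forms the super-root is one plausible reading of the example above, not a rule stated by the method.

```python
from dataclasses import dataclass, field
from typing import List

FUNCTION_WORDS = frozenset({"of", "the", "a", "an"})


@dataclass
class Node:
    word: str
    children: List["Node"] = field(default_factory=list)


def lemmatize(word: str) -> str:
    """Toy stand-in for a real lemmatizer: strip a trailing plural 's'."""
    return word[:-1] if word.endswith("s") else word


def content_children(node: Node) -> List[Node]:
    """Children with function words removed; a function word is
    replaced by its own content-word dependents."""
    result: List[Node] = []
    for child in node.children:
        if child.word in FUNCTION_WORDS:
            result.extend(content_children(child))
        else:
            result.append(child)
    return result


def to_term(node: Node) -> str:
    head = lemmatize(node.word).upper()
    modifiers = content_children(node)
    if not modifiers:
        return head
    if len(modifiers) == 1:
        return f"{to_term(modifiers[0])}-{head}"
    # Assumed rule: with several branches, the first modifier joins
    # the head as a super-root and the rest become '-' prefixes.
    inner, *outer = modifiers
    super_root = f"{to_term(inner)}_{head}"
    prefix = "-".join(to_term(m) for m in reversed(outer))
    return f"{prefix}-{super_root}"


tree = Node("variations", [
    Node("temperature"),
    Node("of", [Node("constants", [Node("the"), Node("elastic")])]),
])
term = to_term(tree)
```

On the hand-built tree for "temperature variations of the elastic constants", this yields ELASTIC-CONSTANT-TEMPERATURE_VARIATION, matching the term derived in the text.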
5.7 Performance and Usage of Root- and Rule-Based Terms
The previous sections demonstrate how root- and rule-based terms
can be constructed using phrases extracted from natural language texts.
However, we have yet to analyze the performance of root- and rule-based
terms as data structures. One of the major advantages of these structures is
that they are capable of being constructed automatically using linguistic and
statistical methods, but the structure of the terms themselves provides
additional performance and usability gains for many tasks.
The example given in Section 5.6 shows how a small set of root- and
rule-based terms can be generated from a single document. If this process
is repeated over a larger sample of documents, the result is a large
terminology representing concepts in a particular domain. In order to
analyze this terminology, we examine the data that are represented by each
term.
To begin with, root- and rule-based terms are typically human-readable
and understandable. The terms COPPER-SINGLE_CRYSTAL and
ULTRASONIC-PULSE-TECHNIQUE are fairly simple to understand, assuming
that the component parts are understood. Even without knowing what
ultrasonic means, it is still possible to gather that
ULTRASONIC-PULSE-TECHNIQUE refers to a technique involving ultrasonic pulses. To a human,
root- and rule-based terms may not always seem unambiguous:
ELASTIC_CONSTANT-TEMPERATURE_VARIATION may initially be perceived
as referring to something of constant temperature rather than to the
temperature variations of the elastic constant. However, the correct
interpretation is typically apparent in English-language terms, as English
tends to follow a head-final structure³ (which is imposed on all root- and
rule-based terms).
From a computational perspective, root- and rule-based terms are
syntactically unambiguous. The term
ELASTIC_CONSTANT-TEMPERATURE_VARIATION refers specifically to the temperature variation
of the elastic constant; it cannot refer to elastic variations of constant
temperature or to any other alternative referents. This has several
implications for root- and rule-based terms. Firstly, two root- and rule-based
terms that contain the same roots but have different structures can coexist.
The term ROOM-TEMPERATURE-ELASTIC_CONSTANT is distinct from the
term ROOM-TEMPERATURE-ELASTIC-CONSTANT: the former refers to the
elastic constant at room temperature, while the latter refers to a constant
relating to room temperature elastic (e.g. elastic material held at room
temperature). Because these two terms have distinct meanings and distinct
structures, both can be represented if necessary, and both can be generated
from natural language. Secondly, the structure of root- and rule-based terms
implies a larger semantic structure.
The two terms ROOM-TEMPERATURE-ELASTIC_CONSTANT and
ABSOLUTEZERO-ELASTIC_CONSTANT both contain the super-root
ELASTIC_CONSTANT and both refer to types of elastic constant, i.e. the
elastic constants at room temperature and the elastic constants at absolute
zero. However, other terms, such as
ELASTIC_CONSTANT-TEMPERATURE_VARIATION, also contain ELASTIC_CONSTANT but do not refer
to types of elastic constant. Instead, they refer to a type of temperature
variation. This can be determined from the structure of root- and rule-based
terms, allowing a computer to generate term hierarchies and semantic maps.
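The distinction drawn above follows directly from the position of the head. As a minimal sketch, assuming that the head of a term is its final '-'-delimited segment (with any super-root kept whole), terms can be grouped under the concept they are a type of. The function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def head_of(term: str) -> str:
    """The head is the last '-'-delimited segment; super-roots stay intact."""
    return term.rsplit("-", 1)[-1]


def group_by_head(terms: List[str]) -> Dict[str, List[str]]:
    """Map each head to the terms that are types of it."""
    hierarchy: Dict[str, List[str]] = defaultdict(list)
    for term in terms:
        head = head_of(term)
        if head != term:
            hierarchy[head].append(term)
    return dict(hierarchy)


hierarchy = group_by_head([
    "ROOM-TEMPERATURE-ELASTIC_CONSTANT",
    "ABSOLUTEZERO-ELASTIC_CONSTANT",
    "ELASTIC_CONSTANT-TEMPERATURE_VARIATION",
])
```

Although all three terms contain ELASTIC_CONSTANT, only the first two are grouped under it; the third is grouped under TEMPERATURE_VARIATION.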
The structural and hierarchical nature of root- and rule-based terms
also allows for more powerful searches and analyses of data. Documents can
be indexed not just by the words that occur in them, but by the salient concepts
they discuss. This allows a document describing
ROOM-TEMPERATURE-ELASTIC_CONSTANT to be distinguished from one describing
ROOM-TEMPERATURE-ELASTIC-CONSTANT: even though the roots that make up
these terms are the same, the two documents are describing different
concepts. Thus, a user searching in a collection of documents can specify
whether they are hoping to find documents on
ROOM-TEMPERATURE-ELASTIC_CONSTANT or on ROOM-TEMPERATURE-ELASTIC-CONSTANT.
3 In a head-final structure, the head phrase follows its dependents. For example, the noun
in a noun phrase follows the adjectives which modify it.
Furthermore, because ROOM-TEMPERATURE-ELASTIC_CONSTANT is more
specific than just ELASTIC_CONSTANT, a user may also be able to search for
more general concepts. Similarly, given a general term such as
ELASTIC_CONSTANT, a system can determine more specific related terms
such as ROOM-TEMPERATURE-ELASTIC_CONSTANT and ABSOLUTEZERO-
ELASTIC_CONSTANT.
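A concept-level index of this kind can be sketched as a mapping from terms to document identifiers. The sketch assumes that a term is narrower than a query when it ends with '-' followed by the query; the index contents, document identifiers, and the function name search are hypothetical.

```python
from typing import Dict, Set


def search(index: Dict[str, Set[str]], query: str,
           include_narrower: bool = True) -> Set[str]:
    """Return documents indexed under the query term and, optionally,
    under more specific terms that have the query as their head."""
    results: Set[str] = set()
    for term, documents in index.items():
        exact = term == query
        narrower = include_narrower and term.endswith("-" + query)
        if exact or narrower:
            results |= documents
    return results


index = {
    "ROOM-TEMPERATURE-ELASTIC_CONSTANT": {"doc1"},
    "ROOM-TEMPERATURE-ELASTIC-CONSTANT": {"doc2"},
    "ABSOLUTEZERO-ELASTIC_CONSTANT": {"doc3"},
}
specific = search(index, "ROOM-TEMPERATURE-ELASTIC_CONSTANT")
general = search(index, "ELASTIC_CONSTANT")
```

The two structurally different room-temperature terms retrieve different documents, and the general query ELASTIC_CONSTANT retrieves only documents whose terms have that super-root as their head.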
6. Applying Root- and Rule-Based Terminologies
In previous sections we have discussed how we propose to build
domain terminologies using a root- and rule-based approach. We have
described how an algorithm can convert phrases of natural language into
structured terms, how key phrases can be extracted, and how this system
can be extended and modularized. In this section, we discuss why the root-
and rule-based approach we have proposed facilitates the creation of useful
terminologies.
It is non-trivial to measure the correctness of a domain terminology.
Standard metrics such as precision (the percentage of the output answers
that are desirable) and recall (the percentage of all desired answers that are
actually contained in the output) may not accurately assess problems
without definite answers. Terminologies are only desirable insofar as they
represent sets of useful concepts relating to a domain; different uses may
lead to different notions of desirability. A terminology does not represent
every possible concept used in a domain — instead, it represents only those
concepts that are of appropriate specificity for practical purposes,
depending on the use case.
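For reference, the two standard metrics can be computed as follows. The sketch presupposes a set of desired terms, which is exactly what is hard to obtain for a terminology; the sets used here are hypothetical.

```python
from typing import Set, Tuple


def precision_recall(extracted: Set[str], desired: Set[str]) -> Tuple[float, float]:
    """Precision: share of extracted terms that are desired.
    Recall: share of desired terms that were extracted."""
    hits = len(extracted & desired)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(desired) if desired else 0.0
    return precision, recall


extracted = {"OAK-TREE", "RED-TREE_LEAF", "GRAYAREA", "RESULT"}
desired = {"OAK-TREE", "RED-TREE_LEAF", "APPLE-TREE"}
precision, recall = precision_recall(extracted, desired)
```

Changing the desired set to reflect a different use case changes both numbers, even though the extracted terminology is identical.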
This notion of practical validity does not have an objective definition
that can be directly measured. Instead, the use determines the validity of
terminology in a particular context. For example, a terminology that is based
entirely on taxonomy (a scheme of hierarchical categorization) may be
useful for some tasks, but for others, it may be desirable to represent
information about the properties of terms using a different scheme of
classification. In some cases, for example, it may be useful to know that RED
is a type of COLOR, while in others it will instead be useful to know that a
BALLOON has the property COLOR with value RED. The root- and rule-based
approach can be adapted to support new ways of classification.
More generally, terminologies need to support many types of user
interaction on demand. For example, in addition to serving end-users,
a terminology needs to be easily maintained by administrators. The
implementation of a terminology may be very efficient on the user end,
obtaining the results of search queries very quickly, but be inefficient on the
administration end. This is often a detail of implementation, but can be
relevant to the formulation of the terminology model.
The root- and rule-based model that we have described is designed
to meet the needs of a variety of general use-cases and to be configurable
enough to meet the needs of more specific situations. We have considered
four general use-cases in our proposal: document entry, document retrieval,
curation, and rule changes. These use cases assume a centralized database
containing the core terminology. The relationships between these use cases
and the data are shown in Figure 8. The term generator shown in Figure 8
is the same system shown in Figure 1. The desirable components
of a system such as this are described in Section 1.2.
Figure 8. A strategy for use-cases to create and manage root- and rule-
based terminologies
6.1 Document Entry Interface
We have proposed that a document entry interface should provide
functionality that allows a user to upload a document to the system. The
system should then determine the terms in the document that match pre-
existing terms, extract any new terms from the document, and allow the user
to edit these terms.
We propose that a user should be able to edit the results of document
entry by removing those terms that do not apply to the uploaded document,
or by adding new terms that the algorithm did not find.
A document entry interface is dependent on the ability to quickly
identify terms in a single document. That is, an automated system that
allows for single document entry must be robust to corpora containing few
documents. The terminology must be able to be built one document at a
time, with no dependency on large corpora. Our proposed root- and rule-based
system would take advantage of redundant methods of terminology
extraction in order to work with different-sized corpora. In addition to
extracting terms with a more general-purpose terminology extraction
algorithm (see Section 4), our proposal also takes advantage of the
structured nature of terms to find terms in new documents that are
structurally similar to previous terms. This often allows the root- and rule-
based method to identify useful new terms even without statistical evidence.
6.2 Document Retrieval Interface
In our proposed document search and retrieval interface, a user
should be able to enter one or more terms and receive a listing of documents
containing the given terms. In the simplest case the user simply inputs a list
of terms, and the system locates all of the documents in the database
containing the given terms. More complex search systems are also possible,
allowing for additional refinement of search criteria.
Structured terms improve the potential of search systems. The
compositional nature of terms in this model means that users can make
semantic searches, rather than simply searching for the presence of a
collection of unrelated words. That is, instead of searching for a document that
contains the words “red”, “tree”, and “leaf”, a user can easily identify the
exact concept in question and search for RED-TREE_LEAF. This allows the
user to make much more succinct and semantically rich search queries,
producing narrower and more relevant result sets.
The system by which terms are generated from natural language
phrases can also be used to improve the usability of user search interfaces.
Rather than requiring that users search directly for structured terms in the
database, which requires an understanding of the way that root- and rule-
based terms are formed, the interface can instead allow users to input
phrases of natural language. For example, if the user inputs “the red leaves
of a tree”, the interface can quickly generate suggested terms, such as
RED-TREE_LEAF, meaning that users only need a passive knowledge of term
structure in order to make advanced use of the interface, while still allowing
for unambiguous searches.
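A suggestion feature of this kind could be approximated by matching the lemmatized content words of the query against the roots of stored terms. The sketch below is a deliberately crude stand-in: the lemmatizer is a toy with one hard-coded irregular form, the stopword list and term database are hypothetical, and matching on root sets ignores structure, so structurally different terms with the same roots are all suggested.

```python
import re
from typing import List, Set

STOPWORDS = frozenset({"a", "an", "the", "of", "in", "on"})
IRREGULAR = {"leaves": "leaf"}  # toy exception table


def lemma(word: str) -> str:
    """Toy lemmatizer: exception table, then strip a trailing 's'."""
    if word in IRREGULAR:
        return IRREGULAR[word]
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def query_roots(query: str) -> Set[str]:
    """Upper-cased lemmas of the content words in a natural language query."""
    words = re.findall(r"[a-z]+", query.lower())
    return {lemma(w).upper() for w in words if w not in STOPWORDS}


def suggest_terms(query: str, term_db: List[str]) -> List[str]:
    """Stored terms whose roots are exactly the roots of the query."""
    wanted = query_roots(query)
    return [t for t in term_db if set(re.split(r"[-_]", t)) == wanted]


term_db = ["RED-TREE_LEAF", "GREEN-TREE_LEAF", "RED-LEAF_TREE", "RED-TREE_BRANCH"]
suggestions = suggest_terms("the red leaves of a tree", term_db)
```

Both RED-TREE_LEAF and RED-LEAF_TREE are suggested here; choosing between them is left to the user, who needs only to recognize the intended structure rather than produce it.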
6.3 Curation Interface
A curation interface allows a team of curators to make changes to
the terminology. Curators may either be a select team of administrators, or
volunteer end-users. That is, it is possible in some cases to crowd-source
curation. Depending on the resources available for a particular system, it
may be desirable to have dedicated moderators or to allow end-users to
make their own changes to the database. We make no suggestions as to
which is more generally preferable, but instead describe a terminology
system that is able to handle both.
Curation may be partially dependent on search, as discussed
in Section 6.2, as curators need to be able to locate terms to change.
However, curators may benefit from more than a document retrieval system,
as they should be able to examine the complete structure of the terminology.
For example, curators may wish to view taxonomic relationships between
terms in order to ensure that the taxonomy is structured correctly. Curators
should be able to make changes to the structure, as well as to individual
terms in the terminology.
A root- and rule-based terminology interacts well with this sort of
curation interface. The semantic nature of terms can be used to determine
the taxonomic structure of the terminology implicitly: as discussed in
Section 2, modifiers (roots other than the head) add specificity, so
unmodified terms are hypernyms of their modified
variants. Additional relationships between terms are represented through
compounded terms (semantic clusters represented by terms that have been
combined using the delimiter ‘:’), allowing for graph-based visualizations,
such as that shown in Figure 9. A visualization such as this might be derived
from the following two terms:
ELECTRON-CYCLOTRON-CURRENT_DRIVE:NONINDUCTIVE-CURRENT_DRIVE and
ION-CYCLOTRON-CURRENT_DRIVE. By parsing these terms into roots, an algorithm or a user
can easily determine that both terms represent types of CURRENT_DRIVE,
and that ELECTRON-CYCLOTRON-CURRENT_DRIVE and
ION-CYCLOTRON-CURRENT_DRIVE are types of CYCLOTRON-CURRENT_DRIVE. Furthermore,
the user interface can show that there is some relationship between
NONINDUCTIVE-CURRENT_DRIVE and ELECTRON-CYCLOTRON-CURRENT_DRIVE.
This information is all determined from the structure of
these terms, and the composition of the roots.
                         CURRENT_DRIVE
    CYCLOTRON-CURRENT_DRIVE        NONINDUCTIVE-CURRENT_DRIVE
ELECTRON-CYCLOTRON-CURRENT_DRIVE    ION-CYCLOTRON-CURRENT_DRIVE
Figure 9. Visualizing term relationships
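The edges of a graph such as Figure 9 can be derived mechanically from the two terms. The sketch below follows the reading used in this section, in which dropping the leading modifier of a term yields a broader term; the function name term_edges and the edge representation are assumptions for illustration.

```python
from itertools import combinations
from typing import List, Set, Tuple

Edge = Tuple[str, str]


def term_edges(terms: List[str]) -> Tuple[Set[Edge], Set[Edge]]:
    """Return (is_a, related) edges implied by term structure.
    is_a edges point from a narrower term to the broader term obtained
    by dropping its leading modifier; related edges come from ':'."""
    is_a: Set[Edge] = set()
    related: Set[Edge] = set()
    for term in terms:
        parts = term.split(":")
        related.update(combinations(parts, 2))
        for part in parts:
            segments = part.split("-")
            for i in range(len(segments) - 1):
                narrower = "-".join(segments[i:])
                broader = "-".join(segments[i + 1:])
                is_a.add((narrower, broader))
    return is_a, related


is_a, related = term_edges([
    "ELECTRON-CYCLOTRON-CURRENT_DRIVE:NONINDUCTIVE-CURRENT_DRIVE",
    "ION-CYCLOTRON-CURRENT_DRIVE",
])
```

The taxonomic edges place both cyclotron terms beneath CYCLOTRON-CURRENT_DRIVE and everything beneath CURRENT_DRIVE, while the single related edge records the unlabeled connection expressed by the compounded term.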
The implicit and explicit relationships between terms in this model
allow for changes to terms without requiring manual restructuring of the
hierarchical structure. For example, if the system erroneously created the
term RED-LEAF_TREE instead of RED-TREE_LEAF, a curator could make this
correction without needing to manually relocate the term to its proper
taxonomic position beneath LEAF. Instead, the curator only needs to change
the term RED-LEAF_TREE to RED-TREE_LEAF, and the fact that
RED-TREE_LEAF is a type of LEAF can be automatically inferred.
6.4 Rule Change Interface
Rule change interfaces are desirable in many evolving situations.
Due to the constant changes in knowledge and vocabulary that occur in all
domains, it is sometimes necessary to update the way that the rules are used
to generate terms. This interface may be particularly useful during the
infancy of this technology. In our proposal, it may be desirable to create
new delimiters with new meanings, change the behavior of delimiters, or
change the way that terms are selected. In addition, it may become feasible
to make changes due to improvements in technologies. In our proposal, this
process does not require a complete restructuring of the system. Instead,
only an interface which allows for the replacement of individual
components of the system, as described in Section 5, is necessary.
6.5 Generating Ontologies
As discussed in Section 1, one of the motivations for a root- and
rule-based method of terminology generation is practical ontology
extraction for the semantic web as well as identifying, tracking, and
predicting changing trends. The root- and rule-based terminologies we have
discussed are well-adapted for use in ontologies. Though the methods we
describe do not immediately output a complete ontology, the structured
nature of terms in root- and rule-based terminologies means that a
terminology can easily be extended into the skeleton of an ontology. As
shown in Figure 9, some of the relationships between terms can be inferred
from term structure. These relationships are primarily taxonomic, but
compounded terms reveal additional connections that may inform the
structure of a complete ontology.
Some of the relationships that structured terms encode are
underspecified. For example, the term
ELECTRON-CYCLOTRON-CURRENT_DRIVE:NONINDUCTIVE-CURRENT_DRIVE entails that there is a
relationship between ELECTRON-CYCLOTRON-CURRENT_DRIVE and
NONINDUCTIVE-CURRENT_DRIVE. However, it does not specify what that
relationship is. An ontology may need to supply this additional information,
but the presence of the relationship is already known. Because of this, a
root- and rule-based terminology provides many of the relationships
necessary for a domain ontology. A system could generate a simple
ontology by adding labels to these relationships and adding any relevant
edges that may have been left out in terminology generation.
Ontologies are usually dependent on the terminologies that inform
them. By basing ontologies on root- and rule-based terminologies, it is
possible to create ontologies that are human- and machine-readable.
Because language, including the vocabulary used in most domains, is
constantly evolving, ontologies also need to evolve. Because root- and rule-
based terminologies are adaptable to new needs and to evolving vocabulary
as discussed in Section 5, they are superior to ad hoc terminologies for
constructing practical domain ontologies.
It is possible to partially automate the creation of domain ontologies
from terminologies. Statistical analyses such as co-word analysis (Kostoff
1993; Coulter et al. 1996; Coulter, Monarch, and Konda 1998) can suggest
relationships among the terms in a terminology, though they do not label these
relationships. These methods infer relationships between terms that
occur together within a fixed window. For example, if the terms
ELECTRON-CYCLOTRON-CURRENT_DRIVE and NONINDUCTIVE-CURRENT_DRIVE occur
together more often than is expected based on the frequencies of the
individual terms, then it is likely that there is a semantic connection between
them. The above analyses do not automatically label these relationships, but
crowd-sourced methods can help to assign labels to common relationships
and create training sets for future automatic labeling techniques. Though
there is currently no fully-implemented application that extends root- and
rule-based terminologies into domain ontologies, a system that uses a root-
and rule-based approach as well as co-word analysis and crowd-sourced
labeling is currently in development. Moreover, the terminologies generated
by this approach can be used with current applications, both commercial
and freely available, to produce terminological networks much like the ones
produced by co-word analysis using tools such as Leximancer and Gephi
(Smith and Humphreys 2006; Bastian, Jacomy, and Heymann 2009).
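The co-occurrence test described above can be sketched as follows. This is an illustration under stated assumptions, not the specific index used in the cited co-word studies: it substitutes a simple ratio of observed to expected window counts, where the expected count assumes the two terms occur independently. The function name, the window size, and the toy corpus are all invented for the example.

```python
from collections import Counter
from itertools import combinations

def coword_scores(docs, window=5):
    """Ratio of observed to expected co-occurrence for every pair of terms.

    docs: list of token lists in which each token is an already-normalized term.
    Returns {(term_a, term_b): observed / expected}, with term_a < term_b.
    """
    term_windows = Counter()   # number of windows containing each term
    pair_windows = Counter()   # number of windows containing both terms
    n_windows = 0
    for tokens in docs:
        for start in range(max(1, len(tokens) - window + 1)):
            seen = set(tokens[start:start + window])
            n_windows += 1
            term_windows.update(seen)
            pair_windows.update(combinations(sorted(seen), 2))
    scores = {}
    for (a, b), observed in pair_windows.items():
        expected = term_windows[a] * term_windows[b] / n_windows
        scores[(a, b)] = observed / expected
    return scores

ECCD = "ELECTRON-CYCLOTRON-CURRENT_DRIVE"
NICD = "NONINDUCTIVE-CURRENT_DRIVE"
# Toy corpus invented for illustration; each inner list is one tokenized passage.
toy_docs = [
    [ECCD, NICD, "PLASMA"],
    [ECCD, NICD, "TOKAMAK"],
    ["PLASMA", "TOKAMAK", "HEATING"],
    ["HEATING", "PLASMA", "TOKAMAK"],
]
scores = coword_scores(toy_docs, window=3)
# A ratio above 1 means the pair co-occurs more often than independence predicts.
candidate_pairs = sorted(pair for pair, ratio in scores.items() if ratio > 1)
```

Pairs in candidate_pairs are only suggestions that a relationship exists; as noted above, the analysis does not say what the relationship is.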
7. Conclusion
Our root- and rule-based approach presents several advantages for the
development of domain-based terminologies that are not available in semi-
structured models, while still maintaining both human- and machine-
readability. The primary advantage of root- and rule-based terms is that they
allow one to consistently and clearly represent important domain concepts.
Root- and rule-based terms are compositional, allowing for the division of
terms into their component parts for searching, selecting new terms, or
deriving relationships between terms.
Our proposed root- and rule-based model is also highly modular,
meaning that different users can easily adapt and maintain terminological
systems. Components may be updated with technological advances, the
introduction of new uses, or adaptations based on how systems are being
used in real-world situations. The model is also designed with many
important use-cases in mind, including search, document uploading, and
curation. This allows the system to be practical for the needs of different
users, for the administration, and for developers and scientists hoping to
expand upon a root- and rule-based system.
The model presented here is linguistically motivated, and follows
from many aspects of linguistic theory, including syntax, semantics, and
pragmatics, allowing it to connect on a fundamental level with the way that
humans actually use language, rather than with mathematical constructs
transparent only to machines. Like language itself, this model is
evolutionary, use-based, and compositional, designed with practical needs
rather than purely theoretical constructs in mind.
References
Berners-Lee, Tim, James Hendler, and Ora Lassila. 2001. “The Semantic
Web.” Scientific American.
Bhat, Talapady N. 2010. “Building Chemical Ontology for Semantic Web
Using Substructures Created by Chem-BLAST.” International Journal
on Semantic Web and Information Systems 6 (3): 22-37.
Bhat, Talapady N., Laura M. Bartolo, Ursula R. Kattner, Carelyn E.
Campbell, and John T. Elliott. 2015. “Strategy for Extensible,
Evolving Terminology for the Materials Genome Initiative Efforts.”
Journal of Materials, no. 8: 1866-75.
Clark, Herbert H., and Deanna Wilkes-Gibbs. 1986. “Referring as a
Collaborative Process.” Cognition, no. 22: 1-39.
Coulter, Neal, Ira Monarch, and Suresh Konda. 1998. “Software
Engineering as Seen Through Its Research Literature: A Study in Co-
Word Analysis.” Journal of the Association for Information Science
and Technology 49 (13). New York, NY: John Wiley & Sons, Inc.:
1206-23.
Coulter, Neal, Ira Monarch, Suresh Konda, and Marvin Carr. 1996. “An
Evolutionary Perspective of Software Engineering Research Through
Co-Word Analysis.” CMU/SEI-95-TR-019. Pittsburgh, PA: Software
Engineering Institute, Carnegie Mellon University.
Feigenbaum, Lee, Ivan Herman, Tonya Hongsermeier, Eric Neumann, and
Susie Stephens. 2007. “The Semantic Web in Action.” Scientific
American.
Fellbaum, Christiane, ed. 1998. WordNet: An Electronic Lexical
Database. Cambridge, MA: MIT Press.
Frantzi, Katerina T., Sophia Ananiadou, and Jun-ichi Tsuji. 1998. “The
C-Value/NC-Value Method of Automatic Recognition for Multi-Word
Terms.” In Proceedings of the Second European Conference on
Research and Advanced Technology for Digital Libraries, 585-604.
London: Springer-Verlag.
Frege, Gottlob. 1884. The Foundations of Arithmetic. 1980 ed.
Evanston, IL: Northwestern University Press.
Hall, Johan. 2006. “MaltParser: An Architecture for Labeled Inductive
Dependency Parsing.” Master’s thesis, Växjö University.
Hearst, Marti A. 1992. “Automatic Acquisition of Hyponyms from Large
Text Corpora.” In Proceedings of the 14th Conference on
Computational Linguistics, 2:539-45.
Hudson, Richard A. 2004. “Are Determiners Heads?” Functions of
Language 11 (1): 7-42.
Knechtel, Martin, and Rafael Peñaloza. 2010. “A Generic Approach for
Correcting Access Restrictions to a Consequence.” In 7th Extended
Semantic Web Conference. The Semantic Web: Research and
Applications.
Koivunen, Marja-Riitta, and Eric Miller. 2001. “W3C Semantic Web
Activity.” In Semantic Web Kick-Off in Finland: Vision, Technologies,
Research, and Applications, 27-44. Helsinki Institute for Information
Technology.
Kostoff, Ronald N. 1993. “Co-Word Analysis.” In Evaluating R&D
Impacts: Methods and Practice, 63-78. Springer.
Liu, Xueqing, Yangqiu Song, Shixia Liu, and Haixun Wang. 2012.
“Automatic Taxonomy Construction from Keywords.” In ACM
Conference on Knowledge Discovery and Data Mining (KDD 2012).
Beijing, China.
Manning, Christopher D., Mihai Surdeanu, John Bauer, Jenny Finkel,
Steven J. Bethard, and David McClosky. 2014. “The Stanford
CoreNLP Natural Language Processing Toolkit.” In Proceedings of
52nd Annual Meeting of the Association for Computational
Linguistics: System Demonstrations, 55-60.
Bastian, Mathieu, Mathieu Jacomy, and Sebastien Heymann. 2009. “Gephi: An
Open Source Software for Exploring and Manipulating Networks.” In
International AAAI Conference on Weblogs and Social Media, 361-62.
Association for the Advancement of Artificial Intelligence.
“Modeling Across Scales.” 2015. Warrendale, PA: The Minerals, Metals
& Materials Society.
Montague, Richard. 1988. “The Proper Treatment of Quantification in
Ordinary English.” In Philosophy, Language, and Artificial
Intelligence, 2:141-62. Studies in Cognitive Systems. Amsterdam:
Springer Netherlands.
Nivre, Joakim. 2003. “An Efficient Algorithm for Projective Dependency
Parsing.” In Proceedings of the 8th International Workshop on
Parsing Technologies (IWPT 03), 149-60. Nancy, France.
Overton Jr, W.C., and J. Gaffney. 1955. “Temperature Variation of the
Elastic Constants of Cubic Elements.” Phys. Rev. 98: 969-77.
Park, Youngja, Roy J. Byrd, and Branimir K Boguraev. 2002. “Automatic
Glossary Extraction: Beyond Terminology Identification.” In
Proceedings of the 19th International Conference on Computational
Linguistics (COLING ’02), 1:1-7.
Plant, Anne L., John T. Elliott, and Talapady N. Bhat. 2011. “New
Concepts for Building Vocabulary for Cell Image Ontologies.” BMC
Bioinformatics 12 (487).
Polguère, Alain, and Igor Mel’čuk, eds. 2009. Dependency in Linguistic
Description. Vol. 3. Studies in Language Companion Series. John
Benjamins.
Proux, Denys, Francois Rechenmann, Laurent Julliard, Violaine Pillet, and
Bernard Jacq. 1998. “Detecting Gene Symbols and Names in
Biological Texts: A First Step Toward Pertinent Information
Extraction.” Genome Informatics 9: 72-80.
Rindflesch, Thomas C., Lawrence Hunter, and Alan R. Aronson. 1999.
“Mining Molecular Binding Terminology from Biomedical Text.” In
Proceedings of the AMIA 1999 Symposium, 127-31.
Shannon, Claude E. 1948. “A Mathematical Theory of Communication.”
Bell System Technical Journal 27 (3): 379-423.
Sharma, Deepik. 2010. “Stemming Algorithms: A Comparative Study and
Their Analysis.” International Journal of Applied Information Systems
4 (3). New York: Foundation of Computer Science: 7-12.
Smith, Andrew E., and Michael S. Humphreys. 2006. “Evaluation of
Unsupervised Semantic Mapping of Natural Language with
Leximancer Concept Mapping.” Behavior Research Methods 38 (2):
262-79.
Swartz, Aaron. 2013. Aaron Swartz’s A Programmable Web: An
Unfinished Work. Morgan & Claypool.
Tesnière, Lucien. 1959. Éléments de Syntaxe Structurale. Paris:
Klincksieck.
Witschel, Hans Friedrich. 2005. “Terminology Extraction and Automatic
Indexing - Comparison and Qualitative Evaluation of Methods.” In
Proceedings of Terminology and Knowledge (Tke), 1-12.
Wu, Ho Chung, Robert Wing Pong Luk, Kam Fai Wong, and Kui Lam
Kwok. 2008. “Interpreting TF-IDF Weights as Making Relevance
Decisions.” ACM Transactions on Information Systems (TOIS) 26 (3):
1-37.
Yoshida, M., Y. Sakamoto, H. Takenaga, S. Ide, N. Oyama, T. Kobayashi,
and Y. Kamada. n.d. “Rotation Drive and Momentum Transport with
Electron Cyclotron Heating in Tokamak Plasmas.” Phys. Rev. Lett.
103 (6). American Physical Society.
Appendix A: Rules
The enumerated rules listed below were originally published in Bhat et al.
(2015).
1. Forming roots:
a) Use all roots in singular form except where plural form is used
more frequently.
b) Avoid using special characters (such as ’ : _ - = / \) as a part of
a root.
c) Avoid the use of modifiers as roots.
d) Use abbreviations only when they are widely accepted across
many related disciplines and when they are unambiguous in
their meaning. See Rule 5 for exceptions when
acronyms are embedded in a super root. Use uppercase for all
acronyms except for atomic symbols.
e) For similar expressions choose a shorter equivalent as a root.
2. Forming super roots:
A super root is formed when the roots involved do not have a
preferred discriminating power and semantics to serve as node names
of a data-graph or as RDF elements except in special circumstances.
a) Super roots are concatenated by an underscore to indicate their
compound semantics and their ability to be parsed into individual
roots only under unusual conditions. If a super root is comprised
of roots that are not specific when considered individually, then
refer to tethered roots (see Rule 3).
b) When a root of a super root functions like a hierarchical
classifier to another root then also include the classified root in
the super root so that automated parsers can recognize the
hierarchy. To order roots within a super root, unless there
already exists a well-accepted alternate convention, use Rule 6.
3. Forming tethered roots:
a) Create tethered roots when a root is a qualifier of another root and
the semantics of any root on its own may not be of interest in a
database or data repository search. Tethered roots are formed to
indicate that the roots involved need be considered collectively,
rather than individually, in order to derive their semantics. For this
reason, roots in a tethered root are written contiguously to avoid
inadvertent separation by automated methods. Since tethered
roots are comprised of qualifier and qualified roots, following a
general convention of root-based construction of English
language words, we use their intrinsic qualifier-qualified
relationship to order their roots.
b) A root may appear in more than one tethered root.
Tethered roots may also provide a way to avoid the use of stop words
in a compounded root. That is, move the word from the right of a stop
word to the left, drop the stop word, and place the qualifier before the
qualified.
4. Forming terms from roots: Terms are formed by concatenating two or
more roots, super roots or tethered roots using a hyphen (-) so that
automated methods may re-generate their roots when necessary. We
suggest ordering the roots of a term by classifier-classified relationships
(see Rule 6), which is also a general convention in English, as in
police dog or technical paper, unless there is a different well-accepted
convention.
5. Avoiding ambiguities and redundancies:
a) Avoid using ambiguous acronyms. Instead clarify their meaning
by qualifying them with a classifier ‘root’ to form a super root or
a tethered root or use the complete phrase.
b) Avoid the inclusion of redundant words in a term.
6. Ordering roots in a term (classifier-classified rule): Roots (super root,
tethered root and root) within a term are organized by a left to right,
semantic top-down, classifier-classified hierarchy. In general,
classifier and classified roots are expected to have one-to-many
relationships where, in a rules-based approach, for example, the root
alloy is a classifier for many materials. Rule 16 deals with instances
where a relationship is not obvious or when a relationship changes
over time due to the addition of new terms. In short, a hierarchy is not
absolute but rather, it varies with the number of relevant use-cases.
a) One way to identify classifier and classified roots in a term is to
arrange the terms with an embedded hierarchical top-down,
level-based classifier (for each ‘classifier’ term there exist several
possibilities of ‘classified’ terms) statement with a hyphen
between classifier and classified terms (e.g.
MODELINGSOFTWARE-VASP, MODELINGSOFTWARE-ABINIT). On
sorting these terms, classified roots appear as the fast varying
strings (VASP, ABINIT) and their classifier roots appear as the
slow varying term (MODELINGSOFTWARE). Automated methods
may use this feature to develop hierarchical data models that can
be presented as data-graphs or RDF or used for auto-complete to
select terms for reliable search results.
b) When a classifier-classified relationship does not exist among the
roots, place them in an alphabetical order.
7. Creating roots and terms with similar, multiple, or complex meanings:
Following Rule 1, use a shorter root for words with similar meaning
whenever possible. A root embedded in a term can help automated
methods, such as co-word analysis, natural language processing, and
text-mining, to identify related semantic classes. To facilitate this
process, it is recommended: a) to limit the use of synonymous roots;
b) if necessary, to clarify the semantics of a root by appending it with a
classifier-root.
8. Reusing terms to create compounded terms: Create terms by
combining roots so that terms have clear semantics. Avoid terms that
are broad and general in meaning. Create terms that can serve as
‘semantic expressions’ in use-cases. A rule of thumb is to attempt to
form terms with three roots and, if needed, combine between two and
five terms to form suitable semantic expressions.
9. Creating compounded terms that identify a group of objects in the
terminology: Compounded terms serve as ‘use-cases’ defining semantic
expressions of terms and they are formed by concatenating two or
more terms using a colon (:) as a special delimiting character.
Compounded terms that are overly specific are unlikely to be reused.
It is advised to limit the number of terms in a compounded term to
between two and five terms. Compounded terms may point to
persistent identifiers (PIDs), such as DOIs (Digital Object Identifiers),
for query purposes. Compounded terms may be used by database
providers or repository administrators to cluster, identify, and display
related items using messages like ‘related to items that you have
viewed’.
a) Use classifier-classified hierarchical Rule 6 to decide the order of
terms in a compounded term.
b) When creating compounded terms, give importance to
‘use case-on-demand’ hierarchies, which are case-based rather than fixed
schema-based hierarchies. Order a term so that a term to the left
has a one-to-many relationship with the term to its right.
10. Provide a reference to any paper that supports the use of the new
term(s) you are creating. The reference may serve as a ‘definition’ of
the term and may also demonstrate use of the term within a context.
11. Design for readability of compounded terms: Use uppercase for the
first letter of a term and use lowercase for all the rest unless a root is
a short form or a symbol.
12. Provide usage statistics for terms: For each term in a database or
repository, store its usage statistics for users to inspect, along with the
terms. These frequencies may allow a user to avoid terms that are used
infrequently.
13. Provide semantic context of terms and compounded terms: In the
database, also keep and display a bibliographic reference and/or DOI
to illustrate the use and semantics of the term. This reference may also
be used as the basis to build use-case-specific compounded terms or
segments of data-graphs.
14. Identify new terms introduced by users as well as flag terms if no
documentation is provided (see Rule 10).
15. Allow the creation of dialects: Terms that do not follow the rules may
also be created as local dialects when necessary. Dialects may
facilitate a gradual evolution of rule-based terminology and the rules
in a crowd-sourced environment.
16. Curate and validate terminology and compounded terms on a regular
basis: Dialects are important components of the proposed method for
terminology building. Therefore, accepting or removing dialects as
terminology must be facilitated by public resource providers who act
as caretakers. Redefining super roots, tethered roots and classifier-
classified relationships among roots are all important steps of the
evolution process of the proposed term building effort. Database
developers and repository administrators need to have an established
mechanism for regular updates to support a smooth evolution process.
Frequency of usage and the semantic context of terms are useful
factors to monitor in such an evolution process.
17. Apply new technologies that have been adopted widely: Explore
whether new data technologies may require the rules to be updated.
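The classifier-classified ordering rule lends itself to a short illustration. The sketch below assumes terms are hyphen-delimited with classifiers on the left, and nests roots from left to right so that the slow-varying classifier roots sit above the fast-varying classified roots. The function names are hypothetical, and the term ALLOY-STEEL is an invented example rather than one drawn from the published rules.

```python
def roots_of(term):
    """Split a term into roots on hyphens; super roots keep their underscores."""
    return term.split("-")

def classifier_tree(terms):
    """Nest roots left to right so classifiers sit above what they classify."""
    tree = {}
    for term in sorted(terms):
        node = tree
        for root in roots_of(term):
            node = node.setdefault(root, {})
    return tree

def completions(tree, prefix_roots):
    """List the roots that may follow a given sequence of classifier roots."""
    node = tree
    for root in prefix_roots:
        node = node.get(root, {})
    return sorted(node)

# ALLOY-STEEL is a made-up term used only to give the tree a second branch.
terms = ["MODELINGSOFTWARE-VASP", "MODELINGSOFTWARE-ABINIT", "ALLOY-STEEL"]
tree = classifier_tree(terms)
suggestions = completions(tree, ["MODELINGSOFTWARE"])
```

A nested dictionary is used here only because it is the simplest structure that supports both the hierarchy and auto-complete lookups; a data-graph or RDF representation would carry the same information.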
Appendix B: Noun Phrase Syntax and Semantics
One of the key components of any automated terminology
generation system is to find and represent important concepts. Typically,
the source of information that leads to these representations is a series of
natural language texts, such as a corpus of scientific articles. For our root-
and rule-based approach, representing these natural language descriptions
is dependent on an effective model of semantics, which we discuss in this
appendix. A complete model of natural language semantics (i.e. a perfect
representation of meaning) is most likely beyond the scope of contemporary
linguistics and computer science, so we will focus on a limited subset of
semantics — namely, the compositional semantics of noun phrases. For the
purposes of this paper, we consider a noun phrase to consist of a noun in
addition to all of its modifiers and any determiners (such as an, the, this,
and numbers). For example, the green vase is a noun phrase consisting of
an article (the), a modifier (the adjective green), and a noun (vase). Some
of the theory discussed in this section will apply to verb phrases and other
syntactic categories, but noun phrases are ideal in that they are relatively
concise and in that they are often clear representations of key concepts and
commonly appear as headwords or phrases in technical glossaries.
A very simple model of noun phrase semantics would be to simply
represent words as themselves or their lemma-forms. For example, the
words leaf and leaves could both be represented as LEAF. For more complex
noun phrases, such as green leaf, this method becomes more problematic.
We could create a representation such as GREENLEAF, a term used
specifically for green leaves, but this model does not unambiguously reveal
that the meaning of GREENLEAF is related to the meanings of GREEN and
LEAF. For this reason, we also model the syntax of noun phrases. Syntax can
be represented in several ways, but two of the most common methods are
phrase structure grammar and dependency grammar. These models show
the relationships between words in a phrase, revealing for example that the
word green in the phrase green leaf is an adjective modifying the noun leaf.
Syntax and compositional semantics are related concepts; the interpretation
of a phrase is derived partially from its syntactic structure (i.e. the
organization of the words in a phrase or sentence) (Montague 1988), and so
using syntax as a proxy for semantics is reasonable.
Though syntax may relate to semantics, it is not a perfect stand-in.
There is not a one-to-one relationship between syntax and semantics —
multiple syntactic forms can have the same semantic meaning, and the same
syntactic form can have multiple meanings. Because of this, deriving
meaning purely from the way that words are put together can lead to
ambiguities. Consider the noun phrases in examples 1 through 4; though
these phrases have different syntactic structures and are composed of
different words, they have very similar meanings.
1) the tree’s red leaves
2) the tree’s leaves that are red
3) the leaves of the tree that are red
4) the red leaves of the tree
If we represent all of these phrases as the unordered set of
“important” words in each phrase, all of these phrases could be
represented as {tree, red, leaves}. However, unwanted phrases will also be
represented this way; for example, the red tree’s leaves will also be
represented as {tree, red, leaves}. Clearly some syntactic information is
necessary in order to create a sufficiently discriminating system. A simple
syntactic model, however, is too discriminating; the words in these four phrases are
ordered differently, are contained within different syntactic constituents,
and are different in grammatical form (tree occurs both in its base form and
in its possessive form tree's). Thus, we need to develop a model of syntax
and semantics that normalizes these differences while still separating them
from other phrases. We will begin by discussing how common models of
syntax, such as dependency grammar, can be used as a basis for generating
structured terms.
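The collision described above is easy to reproduce. The following sketch assumes a small hand-picked stop-word list and strips the possessive marker before comparing; both choices, and the function name, are illustrative assumptions.

```python
# Hand-picked stop words for this example only.
STOP_WORDS = {"the", "of", "that", "are"}

def content_word_set(phrase):
    """Reduce a phrase to its unordered set of content words."""
    words = phrase.lower().replace("'s", "").replace("’s", "").split()
    return frozenset(w for w in words if w not in STOP_WORDS)

targets = [
    "the tree's red leaves",
    "the tree's leaves that are red",
    "the leaves of the tree that are red",
    "the red leaves of the tree",
]
unwanted = "the red tree's leaves"

target_sets = {content_word_set(p) for p in targets}
collides = content_word_set(unwanted) in target_sets
```

All four target phrases reduce to the same set, but so does the unwanted phrase, in which red modifies tree rather than leaves. The set representation alone cannot tell them apart.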
B.1 Dependency Grammar
Modern Dependency Grammar (DG) was first described in Tesnière
(1959), and has since become one of the primary syntactic models used in
computational linguistics. The key concept in DG is, of course,
dependency, which is a one-to-one correspondence between morphemes in
which every morpheme is headed by some other morpheme. For the
purposes of this paper, we define a morpheme as the smallest unit of
language that has meaning, including simple unbound words such as leaf as
well as affixes such as the -s in trees. The head of a phrase carries that
phrase’s syntactic category (e.g. the phrase the green vase behaves as a noun
and is considered a noun phrase because it is headed by vase, which is a
noun on its own). In DG, the head of a sentence is the verb, and all other
morphemes are either direct or indirect dependencies of the main verb. The
dependencies of a morpheme are those headed by it. For example, in the
phrase the green vase, vase is the head of green and the, so the dependencies
of vase are green and the. If green had its own dependencies, these would
be indirect dependencies of vase.
DG can be represented graphically as a tree structure with the verb
as the parent node and dependencies represented as daughter nodes.
However, since we are dealing exclusively with noun phrases, we will use
a noun as the parent and adjectives, determiners, relative clauses, and other
nouns as dependencies¹.
The four noun phrases from examples (1) through (4) are
represented as dependency trees in Figures 10 through 13. Leaves plays the
role of the head because it is the primary (most general) concept represented
by the phrase.
leaves
    the
    of
        tree
            the
    that
        are
            red
Figure 10: Dependency representation of “the leaves of the tree that are red”
These trees represent both syntactic dependency and linear word
order. Each node in the tree represents a word; the dependencies of that
word are all daughter nodes. The root, at the top of the tree, is the head and
is not a dependency of any other words within the context of these phrasal
trees (namely, it does not qualify any other words). The word order can be
recovered through an in-order traversal (begin with the left-most node and
continue to the right).
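A minimal sketch of such a tree and its in-order traversal follows. It assumes that each node stores the dependents that precede the head separately from those that follow it, which is one simple way to keep linear order alongside dependency. The class and field names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    word: str
    left: list = field(default_factory=list)   # dependents that precede the head
    right: list = field(default_factory=list)  # dependents that follow the head

def in_order(node):
    """Yield the words of the subtree in their original linear order."""
    for child in node.left:
        yield from in_order(child)
    yield node.word
    for child in node.right:
        yield from in_order(child)

# "the leaves of the tree that are red", headed by "leaves"
tree = Node(
    "leaves",
    left=[Node("the")],
    right=[
        Node("of", right=[Node("tree", left=[Node("the")])]),
        Node("that", right=[Node("are", right=[Node("red")])]),
    ],
)
phrase = " ".join(in_order(tree))
```

Splitting dependents into left and right lists is a design choice made for this example; storing a word index on each node and sorting by it would work equally well.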
¹ According to Hudson (2004), among others, phrases such as “the tree’s leaves” and “the green vase” are
actually determiner phrases and not noun phrases, and are headed by determiners such as “the” or “a”. Though there
is significant evidence for this, and most modern generative syntacticians prefer this analysis, we continue to use the
noun phrase analysis throughout this paper due to its intuitive simplicity and its continued prevalence in
computational linguistics. Furthermore, because we will end up ignoring determiners (see Section B.3), the
distinction between determiner phrase and noun phrase analyses does not have an effect on the model we describe
here.
leaves
    the
    red
    of
        tree
            the
Figure 11: Dependency representation of “the red leaves of the tree”
leaf
    tree
    red
Figure 12: Normalized dependency representation of “the tree’s red leaf”
As noted above, the syntactic and lexical information depicted by
these dependency trees is not sufficient to show that these four noun phrases
are similar enough to represent the same concept. Structurally speaking,
these trees are very different; only the root node (leaves) has the same
position in every tree, and the remaining nodes occupy different positions
from one tree to the next. This suggests that we cannot use DG on its own to
show how closely these phrases are related. However, in the following
two sections we will show how we can adapt DG to construct a model of
semantics that can be used to normalize noun phrases.
leaf
    tree
    red
Figure 13: Normalized dependency representation of “the leaves of the tree
that are red”
B.2 The Semantics of Dependency
Since we are trying to build a model of semantics, it is necessary for
us to delve deeper into syntactic dependency in order to uncover the
underlying semantics of noun phrases.
Polguère and Mel’čuk (2009) describe syntactic and semantic
dependency as separate but related concepts. They describe syntactic
dependency as the bridge between the semantic form of a sentence (its
meaning) and its morphological form (the linear string representation of
morphemes that is spoken or written). That is, though syntactic
dependencies have additional structure and complexity not apparent in
semantic dependencies, the two structures are related. We may, in fact, be
able to use the syntactic structure of a phrase to estimate its semantic
structure, with additional understanding of the relationship between these
two theoretical entities.
Semantics can be described (to a certain extent) using predicate logic
(Montague 1988). In terms of dependencies, a predicate’s dependents are
its arguments. These arguments may themselves be predicates with
additional dependents, allowing for recursive structures and complex
sentences (Polguère and Mel’čuk 2009). This is an imperfect model for our
purposes because it does not precisely correspond to syntactic dependency
(which is, computationally speaking, easier to derive) and because it divides
all morphemes into relationships (predicates) and entities (arguments). In
semantics, it is unclear whether words such as a should be treated as
predicates, arguments, or something else; in some cases, for example, they
are treated as quantifiers (cf. Montague 1988). However, Polguère and
Mel’čuk (2009) state that all words in a particular phrase must be connected
in a semantic dependency structure, which allows for a more consistent
representation.
However, despite the disparity between syntactic and semantic
dependency described above, there are some commonalities that are
important to the development of the model described in this paper. One of
the functions of a predicate is to add specificity to its argument(s). In this
sense, we can finally observe a clear similarity between syntactic
dependencies and semantics: the dependencies of a morpheme almost
always add specificity to the meaning of that morpheme (green leaf is more
specific than just leaf).
Consider Figure 11. The phrase represented by this structure has the
meaning “leaves” on a very general level. On a more specific level, it is
clear that the leaves in question are possessed objects and that they are red.
The possession relationship is made more specific by the dependencies of
the ’s morpheme — the tree is the possessor of the leaf, and the tree is made
more specific through the article the, which shows that the tree in question
is a specific tree in the common ground between the speaker and the listener
(or between the writer and reader). We will refer to this as a semantic
specification relationship, because the parent node is made more specific
through its daughters.
Note that unlike semantic dependency as described in Polguère and
Mel’čuk (2009), the specificity relationship always travels downward in the
tree, and perfectly matches syntactic dependency. Semantic specificity is an
ideal semantic model for our system, despite the fact that we still cannot
directly explain the similarities between the four noun phrases in examples
1 through 4, since the trees in Figures 10 and 11 are still different. In order
to explain these similarities, we need to normalize our representations.
B.3 Normalizing Similar Structures
The primary differences between the four example phrases we have
been discussing so far are found in the presence or absence of function
morphemes — that is, grammatical units (such as the possessive morpheme
’s) that do not correspond to any real-world meaning, but instead serve
primarily to indicate grammatical relationships. Though these morphemes
do affect the semantics of the overall phrase, they are primarily used to
allow for different word-orders and slightly nuanced meanings. However,
we can recover most of the meaning of a noun phrase without any of the
function morphemes.
We create the four “collapsed” trees in Figure 14 through Figure 17
by removing all of the function morphemes and leaving only content
morphemes (grammatical units that do correspond to real-world meaning,
i.e. some particular concept, action, or trait).
leaf
    tree
    red
Figure 14: Collapsed representation of “the tree’s red leaves”
leaf
    red
    tree
Figure 15: Collapsed representation of “the tree’s leaves that are red”
leaf
    red
    tree
Figure 16: Collapsed representation of “the leaves of the tree that are red”
leaf
    red
    tree
Figure 17: Collapsed representation of “the red leaves of the tree”
At this point, the similarities between these four structures are quite
clear: each one has leaf (the head of the noun phrase) as the root, and two
dependencies: the words red and tree, albeit in different positions.
However, in our semantic model, linear order does not impact the meaning
of the noun phrase as a whole (only vertical order does). Thus, all four of
these structures produce the same meaning, namely that of a leaf specified
by both tree and red. Note that there is a trade-off to removing function
words: we do not know what type of relationship there is between leaf and
tree, only that a relationship exists, or that it is a generic relationship such
as type-of (e.g. a tree-type-of-leaf). Relationships such as synonymy (words
that are synonyms), meronymy (words that represent a part of another
concept), and possession are not detected. However, we accept this trade-
off for our purposes: we are looking for important concepts, not specific
referents.
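The collapsing step can be sketched as a small recursive function. The example assumes trees are written as (word, list of subtrees) pairs, that function morphemes are identified by a fixed list, and that lemmas come from a lookup table. The list, the table, and the function names are all illustrative assumptions rather than the components of an actual system.

```python
# Illustrative resources: a real system would use a fuller list and a lemmatizer.
FUNCTION_WORDS = {"the", "of", "that", "are", "'s"}
LEMMAS = {"leaves": "leaf"}

def collapse(tree):
    """Drop function words below the head, promoting their content dependents."""
    word, children = tree
    kept = []
    for child in children:
        kept.extend(_collapse_child(child))
    return (LEMMAS.get(word, word), kept)

def _collapse_child(tree):
    word, children = tree
    if word in FUNCTION_WORDS:
        # Remove this node and pass its surviving descendants up one level.
        promoted = []
        for child in children:
            promoted.extend(_collapse_child(child))
        return promoted
    return [collapse(tree)]

# "the tree's red leaves": the possessive 's heads "tree"
possessive = ("leaves", [("'s", [("tree", [("the", [])])]), ("red", [])])
# "the red leaves of the tree": the preposition "of" heads "tree"
prepositional = ("leaves", [("the", []), ("red", []), ("of", [("tree", [("the", [])])])])

collapsed_possessive = collapse(possessive)
collapsed_prepositional = collapse(prepositional)
```

Both results have leaf as the root with red and tree as its only dependents; they differ only in the left-to-right order of those dependents, which the model treats as irrelevant.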
B.4 Representing the Model
The model of semantics described above can be represented
graphically using tree structures such as those above, in Figure 11 through
Figure 14. To do this in such a way that semantically similar phrases are all
represented the same way, we need a consistent way to order daughter
nodes. The method that we choose is largely irrelevant, so long as the order
of daughter nodes is independent of the phrase’s original word order. A
trivial method is to order the nodes alphabetically; this is what we will do
for now, meaning that sentences 1 through 4 can all be represented using
Figure 18.
[Tree diagram: leaf at the root, with daughters red and tree]
Figure 18: Universal representation for sentences 1 through 4
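As a minimal sketch of the alphabetical ordering described above, the Python fragment below normalizes a collapsed tree so that the order of daughter nodes no longer depends on the original word order. The nested-tuple encoding and the function name canonical are assumptions chosen for brevity, not part of the model itself.

```python
def canonical(tree):
    """Return an order-independent form of a (word, daughters) tree.

    Daughters are sorted alphabetically at every level, so two trees that
    differ only in linear order map to the same value.
    """
    word, daughters = tree
    return (word, tuple(sorted(canonical(d) for d in daughters)))


# Collapsed trees in the linear orders shown in Figures 14 through 17.
fig14 = ("leaf", [("tree", []), ("red", [])])
fig15 = ("leaf", [("red", []), ("tree", [])])
fig16 = ("leaf", [("red", []), ("tree", [])])
fig17 = ("leaf", [("red", []), ("tree", [])])

forms = {canonical(t) for t in (fig14, fig15, fig16, fig17)}
print(len(forms))  # a single shared representation

# Vertical order still matters: making tree the root gives a different form.
inverted = ("tree", [("leaf", []), ("red", [])])
print(canonical(inverted) == canonical(fig14))
```

Any other ordering would serve equally well, provided it is computed from the nodes themselves rather than from their position in the phrase.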
However, this representation is not ideal. From a computational
standpoint, it is acceptable: it represents unambiguously all of the
information we need to know about a noun phrase, including the semantic
information that we want for generating terminologies. These structures can
be generated quickly and accurately using dependency parsers (Nivre
2003). However, they are not easily read by humans without linguistic
training. Reading a dependency tree requires an understanding of what a
dependency is, as well as how dependency trees are structured. For domain
taxonomies, data structures that are not human readable are undesirable.
Terminologies need to be read by humans as well as by machines, adding
an additional level of challenge to the problem of generating domain
terminologies.
We propose to solve this problem by building structured
compound nouns in such a way that they represent syntactic dependency
unambiguously while remaining human-readable. Linguistically speaking,
a compound noun is a noun composed of two or more other nouns, such as
dog house or airplane. A structured compound noun is a compound noun
formed through the application of regular, systematic rules. In English, the
way that two compound nouns are formed is not completely predictable.
Though dog house and bird house refer to houses for dogs and birds,
respectively, a fire house is not a house for fire in the same sense. Other
languages, however, have more productive compounding, meaning the
same way of creating a compound will have the same meaning in all cases.
In Sanskrit and German, for example, compound nouns often (but not
always) have predictable meanings based on their components and how
they are combined. In other words, there is a set of rules that determines the
meaning of a compound. We can adapt this idea to our semantic model in
order to create an easy-to-read representation that follows from patterns in
natural language. Because structured compound nouns are structured
representations of phrasal semantics, they can be used to represent terms in
a terminology. This produces a terminology of structured terms with
predictable meanings based on roots and a set of rules used to combine
them.
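To make the idea concrete, the following Python sketch renders a collapsed dependency tree as a single hyphenated compound. The combination rule used here is hypothetical and is not the rule set of the approach described in this paper, which this section does not spell out: dependents are placed before their head in alphabetical order, and a dependent that has dependents of its own is wrapped in parentheses so that the tree can be read back unambiguously.

```python
def compound(tree):
    """Render a (word, daughters) tree as one hyphenated compound.

    Hypothetical rule: sorted dependents precede the head; a dependent
    with its own dependents is parenthesized to keep the nesting explicit.
    """
    word, daughters = tree
    parts = []
    for daughter in sorted(daughters, key=lambda d: d[0]):
        text = compound(daughter)
        if daughter[1]:  # nested dependents: mark the group
            text = "(" + text + ")"
        parts.append(text)
    return "-".join(parts + [word])


flat = ("leaf", [("tree", []), ("red", [])])
nested = ("leaf", [("tree", [("oak", [])]), ("red", [])])

print(compound(flat))
print(compound(nested))
```

With this illustrative rule, a leaf specified by both red and tree is written the same way regardless of the word order of the source phrase, while a tree that is itself specified by another root remains visibly grouped.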
Our root- and rule-based approach does not use the same rules or
patterns as Sanskrit, German, or other languages with agglutinative noun
compounds, as the rules in these languages still have many of the issues
associated with natural language more generally, including ambiguities and
inconsistencies. However, the processes of compounding and composition
play a major role in root- and rule-based terminologies.
Bios
T. N. Bhat is a project leader at NIST and one of his recent goals is to
develop tools and techniques to enable archiving, searching and sharing
scientific information.
Jacob Collard is a PhD student in computational linguistics at Cornell
University, where he specializes in interpretable computational models of
natural language syntax and semantics and the application of formal
methods to natural language processing. Since 2014 he has also been
working with the National Institute of Standards and Technology on
information retrieval using formal linguistic methods together with
conventional natural language processing tools.
Eswaran Subrahmanian is a Research Professor at the Engineering
Research Accelerator and Engineering and Public Policy at Carnegie
Mellon University (CMU) and a Guest Researcher at the Software and
Systems Division at the National Institute of Standards and Technology. He is
a member of the Design Society and the Washington Academy of
Sciences, a Distinguished Scientist of the ACM and Fellow of AAAS.
John T Elliott is the group leader of Cell Systems Science Group at
NIST. He is currently developing quantitative microscopy techniques for
measuring cellular response in a variety of applications.
Ursula R Kattner is a project leader at the Thermodynamics and Kinetics
Group at NIST. Some of her current research interests are computational
thermodynamics, alloy phase diagram evaluations, metal-hydrogen
systems, solder alloy systems, and superalloy systems.
Carelyn E. Campbell is the group leader of Thermodynamics and
Kinetics Group at NIST. Some of the projects she is currently leading are:
Development of informatic tools and repositories for phase-based
materials property data, including thermodynamics, diffusion, molar
volume, and elastic properties.
Ram D. Sriram is currently the chief of the Software and Systems
Division, Information Technology Laboratory, at the National Institute of
Standards and Technology. Before joining the Software and Systems
Division at NIST, he was on the engineering faculty (1986-1994) at the
Massachusetts Institute of Technology (MIT), and was instrumental in
setting up the Intelligent Engineering Systems Laboratory.
Ira Monarch has investigated information design and process issues in
large-scale engineering programs, both military and industrial, for over
thirty years. At the Software Engineering Institute (SEI), he led and
participated in projects that developed and used text analytic tools
for uncovering patterns and identifying risks and failure conditions in
software design, architecture, development and maintenance.
Washington Academy of Sciences
1200 New York Avenue
Rm G119
Washington, DC 20005
Please fill in the blanks and send your application to the address above. We will
contact you as soon as your application has been reviewed by the Membership
Committee. Thank you for your interest in the Washington Academy of Sciences.
(Dr. Mrs. Mr. Ms)
Business Address
Home Address
Email
Phone
Cell Phone
preferred mailing address Type of membership
Business Home Regular Student
Schools of Higher Education attended Degrees
Present Occupation or Professional Position
Please list memberships in scientific societies — include office held
Instructions to Authors
Deadlines for quarterly submissions are:
Spring: February 1
Summer: May 1
Fall: August 1
Winter: November 1
Draft Manuscripts using a word processing program (such as
MSWord), not PDF. We do not accept PDF manuscripts.
Papers should be 6,000 words or fewer. If there are 7 or more graphics,
reduce the number of words by 500 for each graphic.
Include an abstract of 150-200 words.
Include a two to three sentence bio of the authors.
Graphics must be in greytone, and be easily resizable by the editors to
fit the Journal’s page size. Reference the graphic in the text.
Use endnotes or footnotes. The bibliography may be in a style
considered standard for the discipline or professional field represented
by the paper.
Submit papers as email attachments to the editor or to
wasjournal@washacadsci.org.
Include the author’s name, affiliation, and contact information —
including postal address. Membership in an Academy-affiliated society
may also be noted. It is not required.
Manuscripts are peer reviewed and become the property of the
Washington Academy of Sciences.
There are no page charges.
Manuscripts can be accepted by any of the Board of Discipline Editors.
Washington Academy of Sciences
Affiliated Institutions
National Institute for Standards & Technology (NIST)
Meadowlark Botanical Gardens
The John W. Kluge Center of the Library of Congress
Potomac Overlook Regional Park
Koshland Science Museum
American Registry of Pathology
Living Oceans Foundation
National Rural Electric Cooperative Association (NRECA)
Membership List
AKSYUK, VLADIMIR A. 605 Gatestone Mews, Gaithersburg MD 20878 (F)
ANTMAN, STUART University of Maryland, 2309 Mathematics Building, College Park MD
20742-4015 (EF)
APPETITI, EMANUELA Botany Center, The Huntington, 1151 Oxford Road, San Marino CA
91108 (LM)
APPLE, DAINA DRAVNIEKS PO Box 905, Benicia Cal 94510-0905 (M)
ARSEM, COLLINS 3144 Gracefield Rd Apt 117, Silver Spring MD 20904-5878 (EM)
ARVESON, PAUL T. 6902 Breezewood Terrace, Rockville MD 20852-4324 (F)
BARBOUR, LARRY L. Pequest Valley Farm, 585 Townsbury Rd, Great Meadows NJ 07838 (M)
BARWICK, W. ALLEN 13620 Maidstone Lane, Potomac MD 20854-1008 (F)
BECKER, EDWIN D. 339 Springvale Road, Great Falls Va 22066 (EF)
BEHLING, NORIKO 6517 Deidre Terrace, McLean VA 22101 (M)
BERLEANT, DANIEL 12473 Rivercrest Dr., Little Rock AR 72212 (M)
BERRY, JESSE F. 2601 Oakenshield Drive, Rockville MD 20854 (M)
BIONDO, SAMUEL J. 10144 Nightingale St., Gaithersburg MD 20882 (EF)
BOISVERT, RONALD F. Mail Stop 8910, National Institute of Standards and Technology (NIST),
100 Bureau Drive Gaithersburg MD 20899-8910 (F)
BOSSE, ANGELIQUE P 11700 Stonewood Lane, Rockville MD 20852 (F)
BRADY, KATHIE 4539 Metropolitan Court, Frederick MD 21704 (M)
BRISKMAN, ROBERT D. 61 Valerian Court, North Bethesda MD 20852 (EF)
BROWN, ELISE A.B. 6811 Nesbitt Place, Mclean VA 22101-2133 (LF)
BUFORD, MARILYN P.O. Box 171, Pattison TX 77466 (EF)
BULLARD, JEFFREY WAYNE 11 Marquis Drive, Gaithersburg MD 20878 (F)
BYRD, GENE GILBERT Box 1326, Tuscaloosa AL 35403 (M)
CAVINATO, TIZIANA FCC, 7932 Opossumtown Pike, Frederick MD 21702 (M)
CIORNEIU, BORIS 20069 Great Falls Forest Dr., Great Falls VA 22066 (M)
CLINE, THOMAS LYTTON 13708 Sherwood Forest Drive, Silver Spring MD 20904 (F)
COBLE, MICHAEL NIST, 100 Bureau Drive, MS 8314, Gaithersburg MD 20899-8314 (F)
COFFEY, TIMOTHY P. 976 Spencer Rd., McLean VA 22102 (F)
COLE, JAMES H. 9709 Katie Leigh Ct, Great Falls VA 22066-3800 (F)
COUSIN, CAROLYN E. 1903 Roxburg Court, Adelphi MD 20783 (F)
CROSS, SUE 9729 Cheshire Ridge Circle, Manassas Va 20110 (M)
CUPERO, JERRI ANNE 2860 Graham Road, Falls Church VA 22042 (F)
CURRIE, S.J., CHARLES L. (Rev.) Jesuit Community, Georgetown University, Washington DC
20057 (EF)
DANNER, DAVID L. 1364, Suite 101, Beverly Road, McLean VA 22101 (F)
DAVIS, ROBERT E. 1793 Rochester Street, Crofton MD 21114 (F)
DEAN, DONNA 367 Mound Builder Loop, Hedgesville WV 25427-7211 (EF)
DEDRICK, ROBERT L. 21 Green Pond Rd, Saranac Lake NY 12983 (EF)
DHARKAR, POORVA 263 Congressional Lane Apt 412, Rockville MD 20852 (M)
DONALDSON, JOHANNA B. 3020 North Edison Street, Arlington VA 22207 (EF)
DOYLE, ELIZABETH K 6705B Overton Circle Apt. 16 Frederick MD 21703 (M)
DURRANI, SAJJAD 17513 Lafayette Dr, Olney MD 20832 (EF)
EDINGER, STANLEY EVAN Apt #1016, 5801 Nicholson Lane, North Bethesda MD 20852 (EM)
EGENREIDER, JAMES A. 1615 North Cleveland Street, Arlington VA 22201 (LF)
EPHRATH, ARYE R. 5467 Ashleigh Rd., Fairfax VA 22030 (M)
ERICKSON, TERRELL A. 4806 Cherokee St., College Park MD 20740-1865 (M)
ETTER, PAUL C. 8612 Wintergreen Court, Unit 304, Odenton MD 21113 (F)
FASANELLI, FLORENCE 4711 Davenport Street, Washington DC 20016 (EF)
FASOLKA, MICHAEL J. NIST Material Measurement Laboratory, MS8300, 100 Bureau Dr.,
Gaithersburg MD 20809 (F)
FAULKNER, JOSEPH A. 2 Bay Drive, Lewes DE 19958 (EF)
FILLIBEN, JAMES JOHN NIST, 100 Bureau Dr., Stop 8980, Gaithersburg MD 20899-8980 (F)
FRASER, GERALD 5811 Cromwell Drive, Bethesda MD 20816 (M)
FREEMAN, ERNEST R. 5357 Strathmore Avenue, Kensington MD 20895-1160 (LEF)
FREHILL, LISA 1239 Vermont Ave NW #204, Washington DC 20005-3643 (M)
FROST, HOLLY C. 5740 Crownleigh Court, Burke VA 22015 (F)
GAGE, DOUGLAS W. XPM Technologies, 1020 N. Quincy Street, Apt 116, Arlington VA 22201-
4637 (M)
GARFINKEL, SIMSON L. 1186 N Utah Street, Arlington VA 22201 (M)
GAUNAURD, GUILLERMO C 4807 Macon Road, Rockville MD 20852-2348 (EF)
GHARAVI, HAMID National Institute of Standards and Technology (NIST), MS 8920,
Gaithersburg MD 20899-8920 (F)
GIBBON, JOROME 311 Pennsylvania Avenue, Falls Church VA 22046 (PF)
GIFFORD, PROSSER 59 Penzance Rd, Woods Hole MA 02543-1043 (F)
GLUCKMAN, ALBERT G. 18123 Homeland Drive, Olney MD 20832-1792 (EF)
GRAY, JOHN E. PO Box 489, Dahlgren VA 22448-0489 (M)
GRAY, MARY (Professor) Department of Mathematics, Statistics, and Computer Science,
American University, 4400 Massachusetts Avenue NW, Washington DC 20016-8050 (F)
GUIDOTTI, TEE L 2347 Ashmead Pl., NW, Washington DC 20009-1413 (M)
HACK, HARVEY 176, Via Dante, Arnold MD 21012-1315 (F)
HAIG, SJ, FRANK R. (Rev.) Loyola University Maryland, 4501 North Charles St, Baltimore MD
21210-2699 (EF)
HARDIS, JONATHAN E. 356 Chestertown St., Gaithersburg MD 20878-5724 (F)
HAYNES, ELIZABETH D. 7418 Spring Village Dr., Apt. CS 422, Springfield VA 22150-4931
(EM)
HAZAN, PAUL 14528 Chesterfield Rd, Rockville MD 20853 (F)
HEANEY, JAMES B. 6 Olivewood Ct, Greenbelt MD 20770 (M)
HIETALA, RONALD 6351 Waterway Drive, Falls Church VA 22044-1322 (M)
HOFFELD, J. TERRELL 11307 Ashley Drive, Rockville MD 20852-2403 (F)
HOLLAND, PH.D., MARK A. 201 Oakdale Rd., Salisbury MD 21801 (M)
HONIG, JOHN G. 7701 Glenmore Spring Way, Bethesda MD 20817 (LF)
HORLICK, JEFFREY 8 Duvall Lane, Gaithersburg MD 20877-1838 (F)
HORN, JOANNE 1408 Grouse Court, 118 N. Market Street, Suite 201 Frederick, MD 21701,
Frederick MD 21703 (M)
HOWARD, SETHANNE Apt 311, 7570 Monarch Mills Way, Columbia MD 21046 (LF)
HOWARD-PEEBLES, PATRICIA 5701 Virginia Parkway 2312, McKinney TX 75071 (EF)
IKOSSI, KIKI 6275 Gentle LN, Alexandria VA 22310 (F)
IZADJOO, MEISAM 13137 Clarksburg Square Road, Clarksburg, MD 20871 (M)
IZADJOO, MINA 15713 Thistlebridge Drive, Rockville MD 20853 (F)
JOHNSON, EDGAR M. 1384 Mission San Carlos Drive, Amelia Island FL 32034 (LF)
JOHNSON, GEORGE P. 3614 34th Street, N.W., Washington DC 20008 (EF)
JOHNSON, JEAN M. 3614 34th Street, N.W., Washington DC 20008 (EF)
JONG, SHUNG-CHANG 8892 Whitechurch Ct, Bristow VA 20136 (LF)
KAHN, ROBERT E. 909 Lynton Place, Mclean VA 22102 (F)
KAPETANAKOS, C.A. 4431 MacArthur Blvd, Washington DC 20007 (EF)
KARAM, LISA 8105 Plum Creek Drive, Gaithersburg MD 20882-4446 (F)
KEISER, BERNHARD E. 2046 Carrhill Road, Vienna VA 22181-2917 (LF)
KLINGSBERG, CYRUS Apt. L184, 500 E. Marylyn Ave, State College PA 16801-6225 (EF)
KLOPFENSTEIN, REX C. 4224 Worcester Dr., Fairfax VA 22032-1140 (LF)
KOWTHA, VIJAYANAND 8009 Craddock Road, Greenbelt MD 20770 (F)
KRUEGER, GERALD P. Krueger Ergonomics Consultants, 4105 Komes Court, Alexandria VA
22306-1252 (EF)
LABOV, JAY B. Keck Center Room 638, 500 Fifth Street, NW, Washington DC 20001 (F)
LAWSON, ROGER H. 10613 Steamboat Landing, Columbia MD 21044 (EF)
LEIBOWITZ, LAWRENCE M. 2905 Saintsbury Place, #217, Fairfax VA 22031-1164 (LF)
LEMKIN, PETER 148 Keeneland Circle, North Potomac MD 20878 (EM)
LESHUK, RICHARD 9004 Paddock Lane, Potomac MD 20854 (M)
LEWIS, DAVID C. 27 Bolling Circle, Palmyra VA 22963 (F)
LIBELO, LOUIS F. 9413 Bulls Run Parkway, Bethesda MD 20817 (LF)
LIDDLE, J ALEXANDER NIST, MS 6203, 100 Bureau Drive, Gaithersburg MD 20899-6200 (F)
LOCASCIO, LAURIE E National Institute of Standards and Technology, MS 1000, Gaithersburg
MD 20899 (F)
LONDON, MARILYN 3520 Nimitz Rd, Kensington MD 20895 (F)
LONGSTRETH, III, WALLACE I 8709 Humming Bird Court, Laurel MD 20723-1254 (EM)
LOOMIS, TOM H. W. 11502 Allview Dr., Beltsville MD 20705 (EM)
LOZIER, DANIEL W 5230 Sherier Place NW, Washington DC 20016 (F)
LUTZ, ROBERT J. 6031 Willow Glen Dr, Wilmington NC 28412 (EF)
LYONS, JOHN W. 7430 Woodville Road, Mt. Airy MD 21771 (EF)
MADHAVAN, GURUPRASAD 440 L St NW, Unit 1111, Washington DC 20001 (F)
MALCOM, SHIRLEY M. 12901 Wexford Park, Clarksville MD 21029-1401 (F)
MANDERSCHEID, RONALD W. 10837 Admirals Way, Potomac MD 20854-1232 (LF)
MANI, MAHESH 210 Summit Hall Rd, Gaithersburg MD 20877 (M)
MATHER, JOHN 3400 Rosemary Lane, Hyattsville MD 20782 (F)
MCGRATTAN, KEVIN B. 11512 Brandy Hall Lane, Gaithersburg MD 20878 (F)
MCNEELY, CONNIE L. School of Public Policy, George Mason University, 3351 Fairfax Dr Stop
3B1, Arlington VA 22201 (M)
MENZER, ROBERT E. 90 Highpoint Dr, Gulf Breeze FL 32561-4014 (EF)
MESSINA, CARLA G. 9800 Marquette Drive, Bethesda MD 20817 (EF)
METAILIE, GEORGES C. 18 Rue Liancourt, 75014 Paris , FRANCE (F)
MIGLER, KALMAN B. NIST, 100 Bureau Drive, Stop 8542, Gaithersburg, MD 20899 (F)
MILLER, JAY H. 8924 Ridge Place, Bethesda MD 20817-3364 (M)
MILLER H, ROBERT D. The Catholic University of America, 10918 Dresden Drive, Beltsville
MD 20705 (M)
MORRIS, JOSEPH PO Box 3005, Oakton VA 22124-9005 (M)
MORRIS, P.E., ALAN 4550 N. Park Ave. #104, Chevy Chase MD 20815 (EF)
MOUNTAIN, RAYMOND D. 701 King Farm Blvd #327, Rockville MD 20850 (F)
MUELLER, TROY J. 42476 Londontown Terrace, South Riding Va 20152 (M)
MUMMA, MICHAEL J. 210 Glen Oban Drive, Arnold MD 21012 (F)
MURDOCH, WALLACE P. 65 Magaw Avenue, Carlisle PA 17015 (EF)
NEUBAUER, WERNER G. Apt 349, 7820 Walking Horse Circle, Germantown TN 38138 (EF)
NOE, ADRIANNE 9504 Colesville Road, Silver Spring MD 20901 (F)
O'HARE, JOHN J. 108 Rutland Blvd, West Palm Beach FL 33405-5057 (EF)
OHRINGER, LEE 5014 Rodman Road, Bethesda MD 20816 (EF)
OTT, WILLIAM R 19125 N. Pike Creek Place, Montgomery Village MD 20886 (EF)
PARR, ALBERT C 2656 SW Eastwood Avenue, Gresham OR 97080-9477 (F)
PAULONIS, JOHN J P.O. Box 703, Mohegan Lake NY 10547 (M)
PAZ, ELVIRA L. 172 Cook Hill Road, Wallingford CT 06492 (LEF)
PERSILY, ANDREW K NIST, Mailstop 8630, 100 Bureau Drive, Gaithersburg MD 20899 (F)
PICKHOLTZ, RAYMOND L. 3613 Glenbrook Road, Fairfax VA 22031-3210 (EF)
PLESNIAK, MICHAEL W. 1400 Laurel Dr., Accokeek MD 20607 (F)
POLAVARAPU, MURTY 10416 Hunter Ridge Dr., Oakton VA 22124 (LF)
POLINSKI, ROMUALD Prof, Doctor of Sciences (Economics), Ul. Generala Bora 39/87, 03-982
WARSZAWA 131 , Poland (M)
PRZYTYCKI, JOZEF H. (Prof.) 10005 Broad St, Bethesda MD 20814 (F)
PYKE, JR, THOMAS N. 4887 N. 35th Road, Arlington VA 22207 (EF)
RANSOM, BARBARA 3117 8th North, Arlington VA 22201 (M)
REGLI, WILLIAM Department of Computer Science, Institute for Systems Research, Clark School
of Engineering, 2173 A.V. Williams Building, 8223 Paint Branch Drive, University of
Maryland, College Park MD 20742 (F)
REISCHAUER, ROBERT 5509 Mohican Rd, Bethesda MD 20816 (EF)
RICKER, RICHARD 12809 Talley Ln, Darnestown MD 20878-6108 (F)
RIDGELL, MARY P.O. Box 133, 48073 Mattapany Road, St. Mary's City MD 20686-0133 (LM)
ROBERTS, SUSAN Ocean Studies Board, Keck 607, National Research Council, 500 Fifth Street,
NW, Washington DC 20001 (F)
ROGERS, KENNETH 355 Fellowship Circle, Gaithersburg MD 20877 (LM)
ROOD, SALLY A PO Box 12093, Arlington VA 22219 (F)
ROSENBLATT, JOAN R. 701 King Farm Blvd, Apt 630, Rockville MD 20850 (EF)
SANDERS, JAY 7850 Westmont Lane, McLean VA 22102 (F)
SAUBERMAN, P.E., HARRY R 8810 Sandy Ridge Ct., Fairfax VA 22031 (M)
SCHMEIDLER, NEAL F. 7218 Hadlow Drive, Springfield VA 22152 (F)
SELKIRK, WILLIAM 2423 Wynfield Ct, Frederick MD 21702 (M)
SENKEVITCH, EMILEE 1015 Columbine Drive, Apt 2B, Frederick MD 21701 (M)
SERPAN, CHARLES Z. 5510 Bradley Blvd, Bethesda MD 20814 (EM)
SEVERINSKY, ALEX J. 4707 Foxhall Cres NW, Washington DC 20007-1064 (EM)
SHAFRIN, ELAINE G. 8100 Connecticut Ave NW Apt 1014, Washington DC 20815-2817 (EF)
SHROPSHIRE, JR, W. Apt. 426, 300 Westminster Canterbury Dr., Winchester VA 22603 (LF)
SIMMS, JAMES ROBERT (Mr.) 9405 Elizabeth Ct., Fulton MD 20759 (M)
SLUZKI, CARLOS 5302 Sherier Pl NW, Washington DC 20016 (F)
SMITH, THOMAS E 3148 Gracefield Rd Apt 215, Silver Spring MD 20904-5863 (LF)
SNIECKUS, MARY 1700, Dublin Dr., Silver Spring MD 20902 (F)
SODERBERG, DAVID L. 403 West Side Dr. Apt. 102, Gaithersburg MD 20878 (EM)
SOLAND, RICHARD M. 2516 Arizona Av Apt 6, Santa Monica CA 90404-1426 (LF)
SPARGO, WILLIAM J. 9610 Cedar Lane, Bethesda MD 20814 (F)
STAVELEY, JUDY 880 Laval Drive, Sykesville MD 21784 (M)
STERN, KURT H. 103 Grant Avenue, Takoma Park MD 20912-4328 (EF)
STIEF, LOUIS J. 332 N St., SW., Washington DC 20024-2904 (EF)
STILES, MARK D. 11506 Taber Street, Silver Spring MD 20902 (F)
STOMBLER, ROBIN Auburn Health Strategies, 3519 South Four Mile Run Dr., Arlington VA
22206 (M)
SUBRAHMANIAN, ESWARAN 4740 Connecticut Avenue, Apt #815, Washington DC 20008
(LM)
TEICH, ALBERT H. PO Box 309, Garrett Park MD 20896 (EF)
THEOFANOS, MARY FRANCES 7241 Antares Drive, Gaithersburg MD 20879 (M)
THOMPSON, CHRISTIAN F. 278 Palm Island Way, Ponte Vedra FL 32081 (LF)
TIMASHEV, SVIATOSLAV (SLAVA) A. 3306 Potterton Dr., Falls Church VA 22044-1603 (F)
TORAIN HI, DAVID S 1313 Summerfield Drive, Herndon VA 20170 (M)
TOUWAIDE, ALAIN Botany Center, The Huntington, 1151 Oxford Road, San Marino CA 91108
(LF)
TROXLER, G.W. PO Box 1144, Chincoteague VA 23336-9144 (F)
UBELAKER, DOUGLAS H. Dept. of Anthropology, National Museum of Natural History,
Smithsonian Institution, Washington DC 20560-0112 (F)
UMPLEBY, STUART (Professor) Apt 1207, 4141 N Henderson Rd, Arlington VA 22203 (F)
VARADI, PETER F. Apartment 1606W, 4620 North Park Avenue, Chevy Chase MD 20815-7507
(EF)
VAVRICK, DANIEL J. 10314 Kupperton Court, Fredricksburg VA 22408 (F)
VOAS, JEFFREY 8210 Crestwood Heights Drive, Apartment 720, McLean VA 22102 (M)
VOORHEES, ELLEN 100 Bureau Dr., Stop 8940, Gaithersburg MD 20899-8940 (F)
WALDMANN, THOMAS A. 3910 Rickover Road, Silver Spring MD 20902 (F)
WANG, Y. CLAIRE 140 Charles Street, Apt 22D, New York NY 10014 (M)
WEBB, RALPH E. 21-P Ridge Road, Greenbelt MD 20770 (EF)
WEISS, ARMAND B. 6516 Truman Lane, Falls Church VA 22043 (LF)
WERGIN, WILLIAM P. 1 Arch Place #322, Gaithersburg MD 20878 (EF)
WHITE, CARTER 12160 Forest Hill Rd, Waynesboro PA 17268 (EF)
WIESE, WOLFGANG L. 8229 Stone Trail Drive, Bethesda MD 20817 (EF)
WILLIAMS, CARL 2272 Dunster Lane, Potomac MD 29854 (F)
WILLIAMS, E. EUGENE Dept. of Biological Sciences, Salisbury University, 1101 Camden Ave,
Salisbury MD 21801 (M)
WILLIAMS, JACK 6022 Hardwick Place, Falls Church VA 22041 (F)
Delegates to the Washington Academy of Sciences
Representing Affiliated Scientific Societies
Acoustical Society of America: Paul Arveson
American/International Association of Dental Research: J. Terrell Hoffeld
American Assoc. of Physics Teachers, Chesapeake Section: Frank R. Haig, S. J.
American Astronomical Society: Sethanne Howard
American Fisheries Society: Lee Benaka
American Institute of Aeronautics and Astronautics: David W. Brandt
American Institute of Mining, Metallurgy & Exploration: E. Lee Bray
American Meteorological Society: Vacant
American Nuclear Society: Charles Martin
American Phytopathological Society: Vacant
American Society for Cybernetics: Stuart Umpleby
American Society for Microbiology: Vacant
American Society of Civil Engineers: Vacant
American Society of Mechanical Engineers: Daniel J. Vavrick
American Society of Plant Physiology: Mark Holland
Anthropological Society of Washington: Vacant
ASM International: Toni Marechaux
Association for Women in Science: Jodi Wesemann
Association for Computing Machinery: Vacant
Association for Science, Technology, and Innovation: F. Douglas Witherspoon
Association of Information Technology Professionals: Vacant
Biological Society of Washington: Vacant
Botanical Society of Washington: Chris Puttock
Capital Area Food Protection Association: Keith Lempel
Chemical Society of Washington: Vacant
District of Columbia Institute of Chemists: Vacant
District of Columbia Psychology Association: Vacant
Eastern Sociological Society: Ronald W. Manderscheid
Electrochemical Society: Vacant
Entomological Society of Washington: Vacant
Geological Society of Washington: Jurate Landwehr
Historical Society of Washington DC: Vacant
Human Factors and Ergonomics Society: Gerald Krueger
Institute of Electrical and Electronics Engineers, Washington Section: Richard Hill
Institute of Food Technologies, Washington DC Section: Taylor Wallace
Institute of Industrial Engineers, National Capital Chapter: Neal F. Schmeidler
International Association for Dental Research, American Section: Christopher Fox
International Society for the Systems Sciences: Vacant
International Society of Automation, Baltimore Washington Section: Richard Sommerfield
Instrument Society of America: Hank Hegner
Marine Technology Society: Jake Sobin
Maryland Native Plant Society: Vacant
Mathematical Association of America, Maryland-District of Columbia-Virginia Section: John Hamman
Medical Society of the District of Columbia: Julian Craig
National Capital Area Skeptics: Vacant
National Capital Astronomers: Jay H. Miller
National Geographic Society: Vacant
Optical Society of America, National Capital Section: Jim Heaney
Pest Science Society of America: Vacant
Philosophical Society of Washington: Larry S. Millstein
Society for Experimental Biology and Medicine: Vacant
Society of American Foresters, National Capital Society: Marilyn Buford
Society of American Military Engineers, Washington DC Post: Vacant
Society of Manufacturing Engineers, Washington DC Chapter: Vacant
Society of Mining, Metallurgy, and Exploration, Inc., Washington DC Section: E. Lee Bray
Soil and Water Conservation Society, National Capital Chapter: Erika Larsen
Technology Transfer Society, Washington Area Chapter: Richard Leshuk
Virginia Native Plant Society, Potowmack Chapter: Alan Ford
Washington DC Chapter of the Institute for Operations Research and the Management Sciences (WINFORMS): Meagan Pitluck-Schmitt
Washington Evolutionary Systems Society: Vacant
Washington History of Science Club: Albert G. Gluckman
Washington Paint Technology Group: Vacant
Washington Society of Engineers: Alvin Reiner
Washington Society for the History of Medicine: Alain Touwaide
Washington Statistical Society: Michael P. Cohen
World Future Society, National Capital Region Chapter: Jim Honig
Washington Academy of Sciences
Room GL117
1200 New York Ave. NW
Washington, DC 20005
Return Postage Guaranteed
NONPROFIT ORG
US POSTAGE PAID
MERRIFIELD VA 22081
PERMIT# 888