Volume 104

Number 4

Winter 2018

Journal of the

WASHINGTON

ACADEMY OF SCIENCES

Editor's Comments   S. Howard

Administrative Vice President Report   T. Longstreth

Determine Bullet Trajectory Reconstruction   L. Chance

Reusable Models of Manufacturing Processes   M. Mani et al.

Generating Domain Terminologies   J. Collard et al.

Affiliated Societies and Delegates

ISSN 0043-0439 Issued Quarterly at Washington DC

Washington Academy of Sciences

Founded in 1898

BOARD OF MANAGERS

Elected Officers

President

Mina Izadjoo

President Elect

Judy Staveley

Treasurer

Ronald Hietala

Secretary

Lynnette Madsen

Vice President, Administration

Terry Longstreth

Vice President, Membership

Ram Sriram

Vice President, Junior Academy

Paul Arveson

Vice President, Affiliated Societies

Gene Williams

Members at Large

Michael Cohen

Frank Haig, S.J.

Mahesh Mani

Kathe C. Brady

Elizabeth Doyle

Past President

Sue Cross

AFFILIATED SOCIETY DELEGATES

Shown on back cover

Editor of the Journal

Sethanne Howard

Journal of the Washington Academy of

Sciences (ISSN 0043-0439)

Published by the Washington Academy of

Sciences

email: wasjournal@washacadsci.org

website: www.washacadsci.org

The Journal of the Washington Academy

of Sciences

The Journal is the official organ of the

Academy. It publishes articles on science

policy, the history of science, critical reviews,

original science research, proceedings of

scholarly meetings of its Affiliated Societies,

and other items of interest to its members. It

is published quarterly. The last issue of the

year contains a directory of the current

membership of the Academy.

Subscription Rates

Members, fellows, and life members in good

standing receive the Journal free of charge.

Subscriptions are available on a calendar year

basis, payable in advance. Payment must be

made in US currency at the following rates.

US and Canada $30.00

Other Countries $35.00

Single Copies (when available) $15.00

Claims for Missing Issues

Claims must be received within 65 days of

mailing. Claims will not be allowed if non-

delivery was the result of failure to notify the

Academy of a change of address.

Notification of Change of Address

Address changes should be sent promptly to

the Academy Office. Notification should

contain both old and new addresses and zip

codes.

Postmaster:

Send address changes to WAS, Rm GL117,

1200 New York Ave. NW

Washington, DC 20005

Academy Office

Washington Academy of Sciences

Room GL117

1200 New York Ave. NW

Washington, DC 20005

Phone: (202) 326-8975

EDITOR’S COMMENTS

Presenting the 2018 Winter issue of the Journal of the Washington Academy

of Sciences.

For this issue we have our first (since I can remember) letter to the

editor. I encourage people to write letters to the editor. Please send email

(wasjournal@washacadsci.org) comments on papers, suggestions for

articles, and ideas for what you would like to see in the Journal.

We start with a column by one of our Board members: the

Administrative Vice President. This is a good addition to the Journal and

informative as well. Perhaps such columns can become a regular part of the

Journal.

To follow is a book review of Accessory to War: The Unspoken

Alliance between Astrophysics and the Military, by Neil deGrasse

Tyson and Avis Lang.

Next up is a student paper by Lydia Chance from Frederick

Community College. We encourage student papers and help the student to

learn about writing a scientific paper.

Then come two multi-author papers: one on computer search engines for

natural language documents, the other on reusable models of manufacturing

processes.

Every winter we print a list of members and addresses. Please check

to see that you are listed correctly. The Academy covers the greater

Washington DC area including parts of Virginia and Maryland. Most of our

members live in Maryland.

The Journal is the official organ of the Academy. Please consider

sending in technical papers, review studies, announcements, and book

reviews.

We are a peer-reviewed journal and need volunteer reviewers. If you

would like to be on our reviewer list, please send email to the above address

and include your specialty.

Sethanne Howard

Washington Academy of Sciences

Journal of the Washington Academy of Sciences

Editor Sethanne Howard showard@washacadsci.org

Board of Discipline Editors

The Journal of the Washington Academy of Sciences has an 11-member

Board of Discipline Editors representing many scientific and technical

fields. The members of the Board of Discipline Editors are affiliated with a

variety of scientific institutions in the Washington area and beyond —

government agencies such as the National Institute of Standards and

Technology (NIST); universities such as Georgetown; and professional

associations such as the Institute of Electrical and Electronics Engineers

(IEEE).

Anthropology                     Emanuela Appetiti     eappetiti@hotmail.com
Astronomy                        Sethanne Howard       sethanneh@msn.com
Biology/Biophysics               Eugenie Mielczarek    mielczar@physics.gmu.edu
Botany                           Mark Holland          maholland@salisbury.edu
Chemistry                        Deana Jaber           djaber@marymount.edu
Environmental Natural Sciences   Terrell Erickson      terrell.erickson1@wdc.nsda.gov
Health                           Robin Stombler        rstombler@auburnstrat.com
History of Medicine              Alain Touwaide        atouwaide@hotmail.com
Operations Research              Michael Katehakis     mnk@rci.rutgers.edu
Science Education                Jim Egenrieder        jim@deepwater.org
Systems Science                  Elizabeth Corona      elizabethcorona@gmail.com

Letters to the Editor

From: Jeff Bullard, Fellow, WAS

I was disappointed in reading the article contributed by C. Sluzki entitled “The

Impact of Authoritarian Regimes”, in Volume 104, Issue 3, pp. 11-18. That

article contains, among other disturbing passages, the following paragraph

at the top of p. 14:

These are worrisome times. Far right, ethnic-nationalist, populist, racist,

sexist, anti-immigrants (sic), anti-abortion rights, anti-ecological, anti-free

speech, post-facts (post-truth!), authoritarian candidates and governments

are gaining strength world-wide. We are facing a world being

progressively seized by charismatic leaders who may not yet be tyrants with

a simplified polarizing discourse capable of perpetrating enormous

evil. And, even while many of these ideologies didn't triumph electorally---

as happened in some European countries---the effect of their rise has been

that majority of the center parties have moved several inches toward social

intolerance, as a way of capturing a portion of the electorate attracted by

those polarizing discourses.

I am troubled that the author introduced this kind of explicit, inflammatory,

and highly subjective political bias which compromises the veracity of the

rest of the article. Even a cursory survey of modern world history

demonstrates that authoritarian regimes, the ostensible subject of the paper,

do not arise from one particular political ideology as the author asserts. Are

there no far-left regimes that trouble the author? Are far-right governments

the only ones that are anti-free speech or anti-ecological? Are there no far-

left regimes to be found with a tinge of authoritarianism, or is the author

simply untroubled by far-left tactics? Both here and in his earlier “even

more personal vignette” on p. 13, the author reveals a significant bias that

would seriously undermine any attempt at analysis (if there had been any

actual scientific analysis) in the article. I hope I am not the only reader who

thinks that Dr. Sluzki’s article is unfitting content for a journal committed

to scientific discourse instead of sensationalistic political opinions.

Response: Carlos E. Sluzki, MD, Fellow, WAS

I truly appreciate Dr. Bullard’s comments: Criticism is more generous and

constructive than silence!

Dr. Bullard is right in his first point: I could, and perhaps should, have

omitted the words “right wing” from my article (or at least added a footnote

making “also left-wing dictatorships” explicit). It may have then avoided

the assumption that, by focusing on right-wing hegemonies, I was

condoning left-wing ones. I do not. In fact, I agree that, to a greater or lesser

degree, the over-inclusive epithets “ethnic-nationalist, racist, sexist, anti-

immigrant, anti-ecological, anti-free speech, post-fact, authoritarian” may

describe traits of both ends of the political spectrum. While not justifying

my omission, I explain it by the fact that during these past few years there

hasn’t been, to my knowledge, any upsurge of left-wing political extremism

that fits those attributes (with the possible exception of the political scenario

of a couple of former USSR republics, and a few governments in the process

of collapsing, such as Venezuela). In contrast, there has been a notable

expansion of right-wing¹ populism² both in the Americas (the U.S.A.,

Brazil) and in Europe (noticeably Austria, Belgium, Denmark, France, Italy,

Norway, Switzerland,³ and in a more extreme fashion, Hungary and

Poland). Not all of these movements are in control of their country’s

government (the exception being the last two mentioned), but they have

grown remarkably, and dangerously (bringing once again into this discourse

an opinion, albeit one fed by the lessons of history and shared by many; e.g.,

Wodak, 2015).

¹ Right-wing: Defined as an ideology that accepts and supports a system of social

hierarchy or social inequality, with a strong anti-immigrant rhetoric and, broadly

speaking, supporting curtailment of the role of the state, and supporting a neoliberal economy.

Carlisle, R.P. (2005) Encyclopedia of politics: the left and the right, Volume 2.

University of Michigan; Sage Reference. pp. 693 & 721.

² Populism is described by the Cambridge Dictionary as ‘political ideas and activities that

are intended to get the support of ordinary people by giving them what they want’. It

includes the usage label ‘mainly disapproving’.

https://www.cam.ac.uk/news/populism-revealed-as-2017-word-of-the-year-by-cambridge-university-press

³ Datasets: Austrian Legislative Election; Swiss Federal Election, 2011; Norwegian

Parliamentary Election, 2013; Belgian Federal Election, 2011; Danish General

Election, 2011. In European Election Database. Web 6 Nov. 2013. &

https://en.wikipedia.org/wiki/2018_Italian_general_election

The other issue brought forth by Dr. Bullard, namely, whether a scientific

journal should tolerate opinions, is another matter. It echoes a spurious

territorial dispute about the legitimacy of the use of the label “science”

between mathematic-based “hard” and socio-behavioral (and philosophy,

and history, and...) “softer” disciplines, and between science and the

common language (see, e.g., Bertrand Russell, 1958). Should we erect tall,

beautiful walls between scientific fields, arguing the impurity or

dangerousness of our neighbors, or assume that there are gray zones

between provinces of the field of sciences where rigor and imagination

combine in fuzzy ways, to everybody’s benefit? “If scientific values

recognize a plurality of perspective, freedom of expression and political

negotiation beyond the alliances of the powerful, they would fit with the

values of liberal democracy. But the banner of ‘scientific values’ could

equally be raised by an authoritarian technocracy, in which tacit and

indigenous knowledge is marginalized.” (Hulme, 2009, p.702)

Science is not “out there,” untouched by the values of the scientist and

his/her times. Scientific journals can and should have values visible in their

pages.

REFERENCES

Russell, B. (1958): “The Divorce between Science and ‘Culture’”. An address

delivered on receiving the Kalinga Prize for the Popularization of Science at

UNESCO Headquarters on 28 January 1958. Transcribed in The UNESCO

Courier Dec.2001, p.33.

Hulme, M. (2009): “What does applying ‘scientific values’ mean in reality?”

Nature 458:702.

Wodak, R. (2015): The politics of fear: What right-wing populist discourses

mean. London & Los Angeles, Sage.

Comments from the Vice President for Administrative Affairs

Terry Longstreth

VP Administration

Washington Academy of Sciences

AAAS Building

1200 New York Ave NW

Suite GL117

Washington, DC 20005

202-326-8975

admin@washacadsci.org

In January of 2016, I became Vice President for Administrative Affairs

for the Washington Academy of Sciences. I spent the first year of my

incumbency trying to learn the job and to understand how I would fit

into the operational framework of the academy. Although I started with a

review of the WAS Bylaws, the Academy has a longstanding annual

business cycle and I was inserted into it at about its midpoint. So while I

studied the Bylaws, I had to, perforce, keep the wheels turning (while I

pumped up the tires, so to speak).

To paraphrase Article I of the Bylaws, the two primary purposes of the

Academy are to

• promote the interests of science (small ‘s’, i.e., not the magazine,

although we are grateful to the AAAS and that magazine for

their support) in Washington D.C. and its environs, and

• provide for information sharing and cooperative activities

among the members and affiliated societies of the Academy.

Both purposes are only indirectly influenced by our current

operational environment, as reflected in the tools and procedures (and

the people, volunteers all) we have at our disposal for orienting the WAS

to achieve the goals implied by those purposes.

Furthermore, it has become clear over time that the operational

environment is anachronistic and not particularly responsive to changes

in the Washington science community. I’m in no position to direct or

steer the WAS (and it’s not my job to do so), but I do hope to make the

WAS Administration and its actions more visible to DC science in

general as well as to our membership and affiliates. In the process,

perhaps the WAS will become more engaged and engaging to the

science-related organizations and individuals in our local

Metropolitan Statistical Area.¹

To get started toward this goal, and as titular operations director both

of the Academy and of the Journal of the WAS, it seemed appropriate

that I try my hand writing a column for the Journal. My current intention

is to do this every quarter, but there’s no telling how that intention may

swerve over time. Other options are manifold: this space could be used

for guest essays, or perhaps other members of the Board of Managers

(“the BOM”) will offer contributions. Certainly, the Journal itself would

welcome offerings from the WAS membership. Ultimately this may lead

nowhere, but I hope not.

My job is described in the Bylaws under Article III. To summarize

that Article:

The Admin VP

• is 3rd in rank in the Board of Managers, after the President and

President-elect, and presides over Board of Managers meetings

when the President and President-elect are unavailable;

• manages the business office and is responsible for business

operations of the Academy and the Journal;

• oversees the Office Manager and Editorial Advisory Committee;

• absent someone specifically appointed to the role, acts as

Archivist to maintain the historical records of the Academy.

That’s pretty much what the Bylaws say, and the BOM tries to

follow those rules. Over time, the world has changed since these Bylaws

were written, and there are some rather obvious problems with my list

above.

1. We don’t have an(other) office manager. For the nonce, I’m it.

2. Similarly, we don’t have an Editorial Advisory committee. Our

Journal Editor, Sethanne Howard, advises herself (and does a

fantastic job). Moreover, if we had an EA committee, I’m not

sure what purpose it would serve, except to give me something

else to do. However, the committee would be useful as a backup

pool of editorial assistants in the event of a resurgence of Journal

submissions.

3. Academy and Journal Operations aren’t actually tied that closely

together except through their respective finances, which are

coordinated between me and the Treasurer.

¹ https://www.bls.gov/regions/mid-atlantic/data/xg-tables/ro3 x95 12.htm for the

Bureau of Labor Statistics.

The Office Manager’s role, then, is primarily that of an inward-facing

office, with data management responsibilities for keeping track of the

business cycle (subscriptions and membership data) and, to a lesser

degree, the publishing cycle (e.g. receiving and retaining Journal

overprints).

Outward-facing data dissemination (and related data management)

responsibilities are carried out by several individuals. Currently, our

Journal Editor, Sethanne Howard, prepares the quarterly editions of the

Journal of the WAS and produces our email-based newsletters and

announcements. The Webmaster (a role currently filled by Paul

Arveson, who also serves as the VP of the Junior Academy) controls

our web content and administers the washacadsci.org email domain.

Finally, our social media presence is, at the moment, the responsibility

of our President Elect, Judy Staveley, with guidance and assistance from

Paul Arveson.

The Journal Editor is acknowledged in the Bylaws, but there is no

mention of the social media, email administration, or Webmaster roles.

Since all of these duties and responsibilities are generally

undocumented, the person in each role must depend upon word of mouth

(from anonymous sources, mostly old timers and former officers of the

BOM) and ‘what feels right’. Ultimately, they must decide for

themselves how to discharge those duties and meet their responsibilities.

So, where is all of this heading? This year, the Washington Academy

of Sciences is 120 years old. As it has aged, it has also evolved and must

continue to do so. It’s old news that the worldwide adoption of

electronic technologies means that disruptive changes to enterprise

business models have challenged organizations of all sizes and

intentions. Our business cycle (the annual cycle of the Academy) begins

each May with the turnover of the new BOM, led by a new President

and President Elect. Each of the other officers may remain in office

indefinitely, subject to their continuing to appear on the annual ballot.

It’s that property of incumbency that allows the presiding officers to be

replaced without sacrificing continuity of knowledge and understanding

of the Academy’s processes.

Establishing and documenting how the Academy can deal with our

changing world is a responsibility we all share. As Admin VP I am

responsible for coordinating the data and office management processes

of the Academy and for projecting how those processes are documented

and shared with the Academy membership. I must also be a collector of

insights into the changes the Academy must undergo to remain relevant

and useful for the DC science community. I find that the one day a week

that I can support this office doesn’t allow much time for an Enterprise

Architecture effort. Such an effort would, I believe, be the expected,

contemporary strategy for an organization to address issues of

transformation in the face of disruptive change. So, I invite all readers of

this Journal in the DC area with free time to travel downtown to

correspond with me about supporting either the Office Management or

Enterprise Architecture efforts. I welcome any suggestions from anyone

as to how best to deal with the situation I’ve described (or provide a

better understanding of the current status of the WAS). My email is:

admin@washacadsci.org

The preceding summary has focused on the current work of the

Administration function as it relates to Office Management functions.

I’ve not said anything about the Archive responsibilities. As a member

of an ISO committee responsible for Digital Archive standards (ISO

14721, and related ISO 16363 and 16919) it’s embarrassing that I’ve not

spent more time on this aspect of the Admin job. My only excuse is that

my ISO focus area (Digital Archives) doesn’t really include the WAS

archives, which are mostly hardcopy. However, if anyone out there has

access to a system to convert paper documents to PDF files, I’d like to

talk to you, too.

Over the course of the next year, I plan to write more about how

the Admin functions are executed and how they support and complement

the activities of the WAS.

Terry Longstreth (AKA Wallace Isaac Longstreth, III)

Book Review

Accessory to War: The Unspoken Alliance between

Astrophysics and the Military

Authors:

Neil deGrasse Tyson and Avis Lang

Norton, 2018. ISBN 978-0-393-06444-5

The popular conception of astronomy and astrophysics is as an “ivory

tower” pursuit. Further understanding of the universe and the processes in

it is considered exploration for its own sake. The authors show that this

is not the case at all. Here is a summary of some areas which are described

in much more detail in the book.

By going back to the pretelescopic era of naked eye astronomy, the

book describes how, in the royal courts of Middle Ages Europe, the

astronomer was also an astrologer. Horoscopes were cast to determine the

proper dates and times for multiple activities, including war. Accurate

predictions of the planets, Sun, and Moon were crucial in casting

horoscopes. The development of improved planetary predictions by famous

astronomers such as Copernicus, Tycho Brahe, and Kepler had at their root

the practical motivation to cast better horoscopes. This role independently

originated in separate ancient civilizations such as China, India, and even

the Mayan civilization of Central America. Astronomers separated

themselves from astrology as it became clear that the stars and planets were

so far away as to have little effect on Earth-bound life. One quibble this

reviewer has is that the physical reason for this abandonment was not clearly

explained in the book. The Sun and Moon are exceptions to this lack of

influence, through the non-astrological effects of tides and seasons. One

characteristic of science is “reproducibility”, which astrology lacks: Venus in

European astrology was the goddess of love, while the Mayans thought of it

as a terrible god of war.

Beyond astrology, astronomy played a crucial role in the age of

colonial empires, from the Renaissance to the 20th century. Columbus’

application of the discovery of the spherical shape of the Earth to the

conquest of the New World is well described in the book. But Columbus

was not the first. The spherical shape was well known to the ancient Greeks

and used in determination of latitude angle from the equator even by the

Vikings via angles of stars and the sun above the horizon. The hard problem

was determination of longitude, solved in the 18th century by the

development of accurate ship-board clocks. Local time found by

astronomical observations was compared to the time at, say, Greenwich,

England, as preserved by the clock. It was now possible to discover the

locations of new lands to settle or conquer and return home afterward. Soon

there were observatories in every major port to study the stars and, more

importantly, accurately determine time. Perhaps this gave astronomers

alternative employment to casting horoscopes! The book has an excellent

description of the other indispensable tool of sailors, the compass. World-

wide observations of deviations of the compass from true north were made

by astronomers such as Edmund Halley. The book makes clear that for

better or worse (trade of new products or the slave trade), astronomy played

a crucial role in the age of sea-based empires.
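
The clock method described above reduces to simple arithmetic: the Earth turns 360 degrees in 24 hours, so each hour of difference between local solar time and Greenwich time corresponds to 15 degrees of longitude. The following Python sketch illustrates that relationship only. The function name and the sample readings are hypothetical, chosen for illustration, and are not taken from the book.

```python
def longitude_from_times(local_solar_hours: float, greenwich_hours: float) -> float:
    """Estimate longitude in degrees (east positive, west negative).

    local_solar_hours: local time found from astronomical observation (0-24).
    greenwich_hours:   time at Greenwich as kept by the ship-board clock (0-24).
    Both inputs are illustrative assumptions, not values from the book.
    """
    degrees_per_hour = 360.0 / 24.0  # the Earth rotates 15 degrees per hour
    diff = local_solar_hours - greenwich_hours
    # Wrap the difference into [-12, 12) hours so the result lies in [-180, 180).
    diff = (diff + 12.0) % 24.0 - 12.0
    return diff * degrees_per_hour


# Hypothetical reading: the Sun is at local noon while the clock shows 17:00
# Greenwich time, so the ship lies west of Greenwich.
print(longitude_from_times(12.0, 17.0))
```

A clock error of four minutes would shift the result by one degree, which is why the accuracy of the ship-board clock mattered so much.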

Then turning to the telescope, one thinks of great astronomical

discoveries such as Galileo’s first great observations of craters on the Moon

or the satellites of Jupiter. However, this book makes a detailed case that

the telescope was seen as an instrument for war from the very start.

Galileo himself promoted its use to identify, for example, distant enemy

ships. Beyond simple optical telescopes, a crescendo is described of

refinements resulting in instruments today that would be unrecognizable as

telescopes to an old-fashioned optical astronomer. Today telescopes use not

only the invisible electromagnetic wavelengths of UV, infrared, and radio,

but even gravitational waves from merging black holes and neutrinos from

the cores of active galaxies.

The latter part of the book is a very detailed description of what I

shall call the modern day weaponization of space. Expenditures for these

hidden activities are very large compared to the much better known

scientific explorations. Thankfully, the 1963 Limited Nuclear Test Ban

Treaty has led to the exclusion of nuclear weapons in space thus far. Related

to the Test Ban Treaty, in an interesting transfer from the military to

astronomy, a military satellite detected gamma rays thought to be from

treaty breaking nuclear tests. There was great concern until astronomers

were able to verify that the rays were not from the Earth but from other

galaxies. Thus was born a new area of astronomy.

Despite the Test Ban Treaty, other types of non-nuclear hostile

weapons such as hunter killer satellites have been tested by the United

States and China and are now being developed by other countries. The

recent talk of a United States space force is merely a combination under one

command of many already existing efforts. Today, a worrisome impact

threat to orbiting satellites is “space junk”: debris from exploded satellites

or other space activities. Today, we are so dependent on satellites from

communication to GPS navigation that a flood of debris and attacks from

even a non-nuclear space war would, in the words of the book’s authors,

“be terrifying.” Today, worry about these terrible effects has resulted in a

stalemate of sorts. In addition to diplomacy, the authors hope that education

and better scientific understanding may avert a terrible future.

As a counterpoint to this book’s theme, recent astronomical

research has revealed beauty and a story which the general public seems to

value for itself with no military benefit. An example of the beauty revealed

is the famous “Pillars of Creation” photograph by the Hubble Space

Telescope of glowing gas clouds and forming stars beautiful even to those

who do not know what is happening in the photo. As a result of such images

a successful campaign was launched to keep the Space Telescope in

operation. Another scientific trend of no foreseeable military benefit is the

revelation that there is a story connecting us personally to a sequence of

events reaching to the origin of the universe. For example, the iron in our

blood was created in exploding supernovas, and the hydrogen in the water

in our blood was created in the Big Bang itself.

In a final note, this book is very detailed with footnotes making up

a significant portion of the book. Probably it should be read in smaller

chapter-by-chapter doses rather than straight through. There is a trap (into

which this reviewer has fallen personally) of having extensive knowledge

of a subject which is all presented in an overwhelming manner. Although

one of the co-authors is an editor, this book needed a good editor to create

a version emphasizing the most important facets in a more digestible form

for the lay reader. I would recommend this book to a layperson who is

already well read in astronomy or space science.

Gene G. Byrd, Professor Emeritus, University of Alabama

Analyzing the Accuracy and Effectiveness of the EVI-

PAQ Trajectory Laser to Determine Bullet Trajectory

Reconstruction

Lydia Chance

Frederick Community College

Abstract

By following the written protocol on the use of a trajectory laser pointer,

we weighed its benefits when applying to a crime scene investigation. We

followed the standard protocol listed in the EVI-PAQ Trajectory kit and

demonstrated finding the angles of trajectory of blood droplets and/or

bullet holes. There are several methods to find an angle of trajectory, such

as using protrusion rods or a string alongside a protractor. Evidence

gathered is only as useful as the photographs taken to document it.

Therefore, ensuring that all photos taken are clear is a necessity. This

experiment in recreating a crime scene emphasizes the usefulness of the

trajectory laser and provides an in-depth review on its use in criminal

justice settings.

Introduction

ALTHOUGH THERE ARE ACCURATE WAYS to reconstruct the pathway a

bullet took through the air upon firing, I demonstrate the accuracy and easy

maneuverability of modern equipment such as a trajectory laser over the use

of string and protrusion rods alone. Implementing this modern method of

visualization in a crime scene recreation provides an invaluable experience

by showing the precision of the current techniques being used by crime

scene investigators of today. Using the EVI-PAQ Trajectory Kit one can

determine the point of origin of a shooter based on the angle of a bullet hole.

This laser kit contains several methods to determine the angle at which a

bullet struck a surface.

The reconstruction of bullet trajectory is often the last step in

recreating a crime scene, but that does not make it any less important than

collecting other forms of evidence. There are crime scene investigators who

work specifically in ballistics and specialize in interpreting the data

gathered from the trajectories and then speculating where the shot originated.

Often, the bullet trajectory will tell where a shooter was standing, and it is

reliable within the first 50 yards of travel for the bullet without having to

account for other variables such as gravity, air resistance, and yaw.
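
To give a rough sense of why a straight-line reconstruction is treated as reliable over short distances, the sketch below estimates how far gravity alone would pull a bullet below a straight line. It is a simplified illustration and not part of the EVI-PAQ protocol: air resistance and yaw are ignored, and the muzzle speed of 350 m/s is an assumed value chosen only for the example.

```python
import math

G = 9.81          # gravitational acceleration in m/s^2
YARD_M = 0.9144   # metres per yard

ASSUMED_SPEED_M_S = 350.0  # hypothetical muzzle speed, not taken from the article


def gravity_drop_m(distance_m: float, speed_m_s: float) -> float:
    """Drop below the straight line of departure, ignoring drag and yaw."""
    flight_time_s = distance_m / speed_m_s
    return 0.5 * G * flight_time_s ** 2


def angular_deviation_deg(distance_m: float, speed_m_s: float) -> float:
    """Angle between the straight line and the line to the dropped impact point."""
    drop = gravity_drop_m(distance_m, speed_m_s)
    return math.degrees(math.atan2(drop, distance_m))


if __name__ == "__main__":
    for yards in (10, 25, 50, 100):
        d = yards * YARD_M
        drop_cm = gravity_drop_m(d, ASSUMED_SPEED_M_S) * 100
        dev = angular_deviation_deg(d, ASSUMED_SPEED_M_S)
        print(f"{yards:>4} yd: drop {drop_cm:.1f} cm, deviation {dev:.2f} deg")
```

Under these assumptions the drop grows with the square of the distance, which is one reason longer shots call for the additional corrections mentioned above.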

This evaluation consists of three phases: setting up the trajectory

laser using the components of the EVI-PAQ kit; demonstrating the ease of

use; and applying the equipment to determine the trajectory of a bullet hole.

Methods

1. Acquire the Materials

1.1. The EVI-PAQ Kit mandates the use of the following materials and

equipment. The kit contains several methods for testing the

trajectory of a bullet hole; however, the laser will be used for this

experiment.

1.1.1. Trajectory rod kit with trajectory laser pointer

1.1.2. Protrusion rods

1.1.3. Protractor or angle finder

1.1.4. Reflective card

1.1.5. Camera with adjustable aperture and exposure

1.1.5.1. One may require the use of photographic fog if the

area used for the laser is not dark enough or if there are

not enough small particles off which the laser could

reflect in midair.

1.1.6. Tripod

1.1.7. Wood board prepped with bullet holes

1.1.7.1. Bullet holes must be wide enough for the protrusion

rod; .22 caliber bullets may be too small

2. Photographing the Scene and Bullet Holes Before the Trajectory System

is Placed

2.1. Photographs of the scene must be taken before any obstruction

contaminates the crime scene.

2.2. Photographs of the bullet holes taken from each side with a scale

must be acquired.

2.2.1. Consider photographing the entire affected area if the bullet

holes are spread across the same surface, then take the

close-up images with the scale.

2.2.2. Photographs should be taken from all angles.

2.2.2.1. Photograph the bullet holes from an angle

perpendicular to the hole, parallel to the surface on the

horizontal, and parallel to the surface from above if space

permits.

3. Preparing the EVI-PAQ Trajectory Laser and Protrusion Rods

3.1. The laser fastens to the end of the protrusion rod by screwing the

threaded end piece into the end of the rod. If necessary, one may

tighten a fastener to the laser and the rod to ensure stability.

3.2. Place the board with the bullet holes upright so that it is supported

on a flat surface.

3.2.1. The angle of the bullet hole in the board should lead the laser

to a point of origin that is non-reflective to avoid error in the

calculation of the angle.

3.3. Set the protrusion rod into the first bullet hole and push through

until the rod rests on the flat surface and its balance stabilizes.

3.4. Fasten the protrusion rod to the surface of the board if necessary.

3.5. Photograph the protrusion rod, after it is stabilized, from several

angles.

4. Lighting and Camera Settings

4.1. The lights in the room must be shut off so that the laser beam can

be seen most clearly once it is turned on. The only exception is an

outdoor scene, which can instead be photographed at night.

4.2. The camera should be prepared to take the photographs of the laser.

4.2.1. Use a long exposure to ensure the light of the laser is shown

clearly

4.2.1.1. An exposure may last up to three minutes to gather

the largest amount of light possible for clear photographs.

4.2.2. The tripod should be placed to capture the entirety of the

laser’s path.

4.2.2.1. The laser can project up to 5,000 feet but other

aspects of trajectory must be considered for any distance

greater than 50 yards and must be addressed during

calculations of an origin point.

4.2.3. The timer on the camera may be set to take the photograph

so that pressing the capture button does not shake the

camera.

4.2.3.1. Two seconds should give ample time for the camera

to steady itself on the tripod after the button is pressed.

5. Turning on the Laser and Documenting Angles

5.1. Turn on the laser

5.2. Hit the capture button on the camera

5.3. Wait until the exposure stops before moving any aspect of the

scene.

5.3.1. Moving the camera while the shutter is still capturing the

light will result in a blurred photograph containing a light trail

of the laser beam, and the photograph will have to be retaken.

5.4. Use a protractor to measure the angles

5.4.1. To produce an angle of trajectory, there must be two points

from which to measure. The entrance and exit can be used to

measure the angle of impact in thick materials.

5.4.1.1. The bullet hole may also aid in producing the angle.

5.4.2. The laser will point to the third point necessary for

determining the point of origin or may pass through the origin

and continue on if the scene is large enough.
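
The two-point idea in step 5.4.1 can also be expressed numerically. The sketch below is a hedged illustration and not part of the kit's protocol: it assumes the entrance and exit holes have been measured as coordinates in a hypothetical frame (x across the board, y into the board, z upward), and the name `trajectory_angles` is introduced here only for the example.

```python
import math
from typing import Tuple

# Hypothetical coordinate frame: x across the board, y into the board, z upward.
Point = Tuple[float, float, float]


def trajectory_angles(entry: Point, exit_: Point) -> Tuple[float, float]:
    """Return (azimuth, elevation) in degrees for the line entry -> exit.

    Azimuth is 0 when the bullet travels straight into the board and grows
    toward +x. Elevation is negative when the bullet travels downward.
    """
    dx = exit_[0] - entry[0]
    dy = exit_[1] - entry[1]
    dz = exit_[2] - entry[2]
    if dx == 0 and dy == 0 and dz == 0:
        raise ValueError("entry and exit points must differ")
    horizontal = math.hypot(dx, dy)
    azimuth = math.degrees(math.atan2(dx, dy))
    elevation = math.degrees(math.atan2(dz, horizontal))
    return azimuth, elevation


if __name__ == "__main__":
    # Made-up measurements in metres for a 2 cm thick board.
    az, el = trajectory_angles((0.10, 0.00, 1.20), (0.11, 0.02, 1.19))
    print(f"azimuth {az:.1f} deg, elevation {el:.1f} deg")
```

A protractor reading taken along the protrusion rod should agree with angles computed this way, within the measurement error of the two hole positions.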

Figures 1 through 8 illustrate the various steps taken for documentation

and the methods applied to reconstruct the bullet trajectory.

Figure 1 The wood board is photographed from multiple angles with a scale

(white strip) to document the bullet holes.

Figure 2 The protrusion rod is suspended in the bullet hole with the trajectory laser

attached to the end.

Figure 3 The end of the rod has a bullet tip mounted and it is placed

into the wood board to begin the angle recreation process.

Figure 4 The camera is placed to capture the entire length of the laser's path.

Figure 5 The second cluster of holes had three entry and exit holes.

Figure 6 Bullet hole C was measured from multiple angles.

Figure 7 The third grouping had two bullet holes.

Figure 8 The bullet holes within the third cluster were too small for the

protrusion rod to pass through, so the angle of trajectory must be measured

with string and protrusion rods thinner than the ones in the kit.

Conclusion

With a trajectory laser kit, the point of origin is easier to

visualize, and the angle of trajectory can be measured. There was an

instance when the .22 ammunition hole did not allow the protrusion rod to

pass through; therefore, using the trajectory laser was not possible for this

example. Using a protrusion rod connected to the trajectory laser makes it

easier to measure the angle of trajectory from the wood and allows for clear

documentation of the angle with a protractor. Although the trajectory laser

is useful over long distances, it is often difficult to see outdoors or in well-lit

areas. The stability of the trajectory laser in the photograph depends on how the

user presses the camera button, which often causes wobbling as the exposure

starts. If the room is dark enough, the exposure of the camera can

be adjusted to let in the correct amount of light and still capture the green of

the laser’s light reflected off a white reflective card, showing the position of the

laser in the scene. There is no way to determine exactly where the shooter

was standing because the laser beam runs from the endpoint of the bullet

hole to whichever hard surface it next contacts. Further speculation

allows crime scene analysts to determine the ultimate position of the shooter

by accounting for all the information gathered in the crime scene

reconstruction.
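
One way to picture that further speculation is to follow the laser line backward from the bullet hole and ask where it reaches a plausible muzzle height. The sketch below is only an illustration under stated assumptions: the coordinate frame (x across, y along the room, z upward), the function name `back_project`, and the muzzle heights are all hypothetical, and a real reconstruction would weigh other evidence from the scene as well.

```python
from typing import Optional, Tuple

Vec = Tuple[float, float, float]  # hypothetical frame: x across, y along the room, z up


def back_project(hole: Vec, travel_dir: Vec, muzzle_height: float) -> Optional[Vec]:
    """Follow the line backward from the hole until it reaches muzzle_height.

    travel_dir is the direction in which the bullet was moving. Returns None
    when the line never reaches that height behind the hole (for example, a
    perfectly level shot, where height alone cannot fix the distance).
    """
    dx, dy, dz = travel_dir
    if dz == 0:
        return None
    t = (hole[2] - muzzle_height) / dz   # steps taken backward along the line
    if t < 0:
        return None                      # that height lies ahead of the hole
    return (hole[0] - t * dx, hole[1] - t * dy, muzzle_height)


if __name__ == "__main__":
    hole = (0.0, 0.0, 1.0)          # made-up hole position in metres
    direction = (0.0, 1.0, -0.1)    # made-up travel direction, slightly downward
    for height in (1.3, 1.5, 1.7):  # assumed range of muzzle heights
        print(height, back_project(hole, direction, height))
```

Running the function over a range of assumed heights gives a segment of candidate positions along the line rather than a single point, which matches the caution expressed above.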

References

Saferstein, R. (2016). Forensic Science from the Crime Scene to the Crime

Lab. Hoboken, NJ: Pearson Education.

Staveley, J. (2015). An Introduction to Forensic Science BI 130

Laboratory Manual. Sagamore Beach, MA: Academx Publishing

Services, Inc.

Tomboc, Ricardo. Personal communication. San Bernardino Police

Department Identification Bureau, Identification Technician II.

Bio

Lydia Chance is a full-time student at Frederick Community College and

is currently majoring in general studies. She graduated from Middletown

High School and began attending classes at FCC in the fall of 2018. While

taking forensic biology as an honors course, Lydia was mentored by Dr.

Judy Staveley for her individual project featuring the use of a trajectory

laser kit. After earning an AA through the Honors College at FCC, she

plans to major in forensic psychology.

Reusable Models of Manufacturing Processes for

Discrete, Batch, and Continuous Production

Mahesh Mani¹, K.C. Morris², Kevin W. Lyons², William Z. Bernstein²

¹Allegheny Science and Technology, ²NIST

Abstract

This article explores the new ASTM E3012-16 International Standard Guide

for Characterizing Environmental Aspects of Manufacturing Processes, its

application and potential impact in the manufacturing industry. The standard

provides guidance for industries to examine unit manufacturing processes,

capture characteristics in terms of how they impact the environment, and

explore opportunities to be efficient and sustainable in their operations. The

standard further encourages formal representations for consistent and effective

deployment of manufacturing tools and reuse of data and information models

for automated analysis.

Introduction

TO REMAIN COMPETITIVE manufacturers today seek to improve

productivity while maintaining quality and meeting sustainability

objectives. With the manufacturing sector consuming a large percentage of

our national resources, smart manufacturing and sustainable manufacturing

implementations through process optimization hold tremendous potential

for improvement¹,²,³,⁴. Being cognizant of the production improvement

opportunities is key to success. But where do we start? Starting at the

process level poses an opportunity — an opportunity to improve process

performance through the meticulous understanding of selected processes.

1 Mani, M., Madan, J., Lee, J. H., Lyons, K. W., & Gupta, S. K. (2013). Review on

Sustainability Characterization for Manufacturing Processes. National Institute of

Standards and Technology, Gaithersburg, MD, Report No. NISTIR 7913

2 Haapala, K.R., Zhao, F., Camelio, J., Sutherland, J.W., Skerlos, S.J., Dornfeld, D.A.,

Jawahir, I.S., Clarens, A.F. and Rickli, J.L., 2013. A review of engineering research in

sustainable manufacturing. Journal of Manufacturing Science and Engineering, 135(4),

p.041013

3 Stephan Mohr, Ken Somers, Steven Swartz, and Helga Vanthournout, Manufacturing

resource productivity, McKinsey Quarterly, June 2012.

4 https://itif.org/publications/2018/11/28/innovation-agenda-deep-decarbonization-bridging-gaps-federal-energy-rdd

Eventually, these individual opportunities can be harnessed at a systems

level where multiple manufacturing processes work in concert.

Characterization of process-level activities can empower better engineering

at higher levels of manufacturing automation and control. These control

levels are described in the widely acknowledged enterprise-to-control

system hierarchy (ISA 95⁵). In addition, the ISO 14000 family⁶ of

environmental management standards is useful for developing a

management approach to sustainability and retroactively comparing the

impacts of comparable products. However, specific guidance for

manufacturers to characterize individual processes and identify

opportunities for improvement can be an added advantage. To provide such

guidance for industries to examine basic manufacturing processes (a.k.a.

unit manufacturing processes), ASTM International⁷ issued a set of

standards, including E2979-18⁸, E2986-18⁹, E2987-18¹⁰, E3012-16¹¹, and

E3096-18¹². These standard guidelines help manufacturers scrutinize and

capture the characteristics of individual processes in terms of how they

impact the environment, and look for opportunities to be more sustainable

in their operations.

This article specifically explores the new ASTM E3012-16

International Standard Guide for Characterizing Environmental Aspects of

Manufacturing Processes¹³ and its consideration for use with discrete,

batch, and continuous production. The standard provides guidance for

industries to examine unit manufacturing processes, capture the

5 https://www.isa.org/isa95/

6 https://www.iso.org/iso-14001-environmental-management.html

7 https://www.astm.org/

8 ASTM International (2018). E2979-18: Standard Classification for Discarded Materials

from Manufacturing Facility and Associated Support Facilities.

9 ASTM International (2018). E2986-18: Standard Guide for Evaluation of

Environmental Aspects of Sustainability of Manufacturing Processes.

10 ASTM International (2018). E2987/E2987M-18: Standard Terminology for

Sustainable Manufacturing.

11 ASTM International (2016). E3012-16: Standard Guide for Characterizing

Environmental Aspects of Manufacturing Processes.

12 ASTM International (2018). E3096-18: Standard Guide for Definition, Selection, and

Organization of Key Performance Indicators for Environmental Aspects of

Manufacturing Processes.

13 https://www.astm.org/E3012-16.htm

characteristics of those processes in terms of how they impact the

environment, and look for opportunities to be more sustainable in their

operations and improve their efficiency. The standard also encourages

standard representations for consistent and effective deployment of

manufacturing tools and reuse of data and information models.

Current Gaps and Potential for Standards

Several workshops¹⁴,¹⁵ facilitated by the National Institute of

Standards and Technology (NIST) across the U.S. have reiterated the

viewpoint that gaps exist in terms of measurement capabilities to connect

sustainable manufacturing practices with the promotion of resource

efficiency. Today’s practices for sustainability-related analysis for products

do not explicitly account for individual manufacturing processes. Current

practices fall short in promoting a science-based understanding of

individual processes critical for their performance improvement and

decision making¹⁶,¹⁷. Formal methods for collection and consolidation of

sustainability-related information on manufacturing processes are lacking.

The measurement science—including methods for process

description, performance metrics, and a corresponding information base for

unit manufacturing processes—will allow for a more consistent evaluation

of sustainability performance across manufacturing systems. Providing the

science in the form of best practices is a goal for the ASTM International

standard.

14 M. M. Smullin; K. R. Haapala; M. Mani; K.C. Morris. ‘Using industry focus groups

review to identify Challenges in sustainable assessment theory and practice.’ ASME

International Design and Engineering Technical Conferences & Computers and

Information in Engineering Conference, Charlotte 2016

15 W.Z. Bernstein et al., 2018. ‘Research directions for an open unit manufacturing

process repository: A collaborative vision,’ Manufacturing Letters, 15 (B), pp.71-75

16 M. Mani, Madan, J., Lee, J. H., Lyons, K. W., & Gupta, S. K. (2014). Sustainability

characterization for manufacturing processes. International Journal of Production Research,

52(20), 5895-5912.

17 Duflou, J.R., Sutherland, J.W., Dornfeld, D., Herrmann, C., Jeswiet, J., Kara, S., Hauschild, M.

and Kellens, K., 2012. Towards energy and resource efficient manufacturing: A processes and

systems approach. CIRP Annals-Manufacturing Technology, 61(2), pp.587-609

ASTM International Standards on Sustainability

ASTM International is a global leader in the development of

voluntary consensus standards. ASTM International formed the E60.13

Subcommittee on Sustainable Manufacturing to guide industry in best

practices to inform sustainability-related decisions. More information on

the standards published through this committee can be accessed from the

committee website¹⁸. The E60.13 E3012-16 standard defines a

methodology to develop unit manufacturing process (UMP) information

models. The standard contributes to the measurement science needed to

quantify sustainable manufacturing practices to the benefit of industrial

competitiveness. Standard methods for describing the environmental

choices that a manufacturer makes allow them to improve their practices

and to differentiate themselves from the competition.

Application of the standard benefits manufacturing practices in two

ways. First, it raises consciousness about manufacturing processes, their

environmental impacts, and opportunities for their improvement. The goal

of applying the standard is to improve the environmental aspects of the

process through the definition of key performance indicators specific to an

individual process addressing potential enterprise level goals. Establishing

that rigor sets the stage for better informed decision making and, as a

result, leads to more sustainable systems. Secondly, the use of standard

practices and formal representation methods poises manufacturers for

transition into scientific modeling and production planning.

The new ASTM standard provides guidance to help manufacturers

effectively understand processes, capture process characteristics in terms of

environmental impact, and identify opportunities for improvement.

Characteristics of a process imply descriptions of what goes into and out of

the process, how the process transforms its inputs to outputs, and what types

of information are used in the transformation. The standard format defined in ASTM E3012-

16 provides a basis for ensuring that a specific set of details are defined and

that they are covered in a consistent manner. See Figure 1. In this way, the

standard offers a method to generate reusable constructs (UMP information

models) that provide a structured way of both understanding and specifying

18 https://www.astm.org/COMMIT/SUBCOMMIT/E6013.htm

unit manufacturing processes. Such constructs presented in an abstract and

precise manner can be parameterized and reused in different application

contexts like information processing, simulation, and analysis. The standard

makes for better comparisons, increased reuse, and, in the end, more reliable

results.

[Figure 1 diagram: in the physical world, a unit manufacturing process takes inputs

(energy, material and consumables, outside factors, disturbances), uses resources

(equipment, tooling, fixtures, human, software) and product and process information

(specifications, instructions, control programs, plans, KPIs, safety documentation),

and through a transformation produces outputs (product and waste). Graphical and

formal representations link it to the digital world: communication, optimization,

simulation / design of experiments, and life cycle assessment.]

Figure 1. Overview of the significance and use of this standard. UMPs store digital

representations of physical manufacturing assets and systems to enable engineering

analysis, e.g., optimization, simulation, and life cycle assessments.
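
As a hedged illustration of what a digital representation of a unit manufacturing process might look like, the sketch below groups the elements named in Figure 1 (inputs, resources, transformation, outputs, and product and process information) into a small data structure and derives one example key performance indicator. The class, field names, milling process, and numbers are all hypothetical; this does not reproduce the information model defined in ASTM E3012-16.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Quantities = Dict[str, float]


@dataclass
class UnitManufacturingProcess:
    """Toy container mirroring the element groups named in Figure 1."""
    name: str
    inputs: Quantities                       # e.g. energy, material, consumables
    resources: List[str]                     # e.g. equipment, tooling, fixtures
    transformation: Callable[[Quantities], Quantities]
    product_process_info: Dict[str, str] = field(default_factory=dict)

    def outputs(self) -> Quantities:
        """Apply the transformation to the inputs to obtain product and waste."""
        return self.transformation(self.inputs)

    def energy_per_kg_product(self) -> float:
        """Example KPI: energy consumed per kilogram of product."""
        return self.inputs["energy_kwh"] / self.outputs()["product_kg"]


def milling_transformation(inputs: Quantities) -> Quantities:
    # Hypothetical rule: 90 % of the material becomes product, the rest is waste.
    product = inputs["material_kg"] * 0.9
    return {"product_kg": product, "waste_kg": inputs["material_kg"] - product}


milling = UnitManufacturingProcess(
    name="milling (hypothetical)",
    inputs={"energy_kwh": 12.0, "material_kg": 10.0},
    resources=["milling machine", "cutting tool", "fixture", "operator"],
    transformation=milling_transformation,
    product_process_info={"process specification": "placeholder"},
)

if __name__ == "__main__":
    print(milling.outputs())
    print(round(milling.energy_per_kg_product(), 3), "kWh per kg of product")
```

Keeping the transformation as a separate function is one possible design choice: the same container can then be reused with different process equations, which is in the spirit of the reusable models discussed in this article.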

Potential Impact

ASTM E3012-16 is a good starting point for creating reusable

descriptions of manufacturing processes that will ultimately realize process

analytics and tool integration. In addition to systematic characterizations of

processes, the formal representations for those characterizations support the

direct use of the information within a variety of applications. The most basic

application is to support effective communication by ensuring consistency

and completeness. More advanced applications include computational

analytics and comparison of performance information. The formal

information model described in the guide facilitates new software tool

development to link manufacturing information and analytics for

calculating environmental performance measures. Further, the standard

format paves the way for more specific software tools supporting the

development and extension of standardized data and information bases such

as Life Cycle Inventory (LCI). LCI data is extensively used in life cycle

assessments (LCA), part of the ISO 14000 family of standards. The top-down

approach of the ISO 14000 family and the bottom-up measurement

approach from ASTM standards are complementary.

Formally defined UMP models can cater to different users'

information needs from a variety of perspectives. For example, using the standard,

• a variety of stakeholders, e.g., plant managers, process engineers,

technicians and operators, can better understand and communicate

manufacturing processes through consistent and tailored views of

the model;

• manufacturing engineers can develop system models from the unit

manufacturing processes by linking them together to characterize

specific production plans for discrete, batch, or continuous

production;

• systems integrators can use models of manufacturing processes to

understand material and information flows, and

• manufacturers can capture their own data for LCA-based

environmental assessments by developing data sets representing the

environmental impacts of their unit processes, complementing and

sharpening LCI data sets.
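
The idea of linking unit manufacturing processes into a system model can be sketched in a few lines. The example below is an assumption-laden illustration, not a method prescribed by the standard: each hypothetical process is reduced to a yield fraction and an energy intensity, and the output material of one process becomes the input of the next. All names and numbers are invented for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Flow:
    material_kg: float        # material currently moving through the line
    energy_kwh: float = 0.0   # cumulative energy used so far
    waste_kg: float = 0.0     # cumulative waste produced so far


@dataclass
class UnitProcess:
    name: str
    yield_fraction: float      # share of incoming material leaving as product
    energy_kwh_per_kg: float   # energy per kg of incoming material

    def apply(self, flow: Flow) -> Flow:
        product = flow.material_kg * self.yield_fraction
        return Flow(
            material_kg=product,
            energy_kwh=flow.energy_kwh + flow.material_kg * self.energy_kwh_per_kg,
            waste_kg=flow.waste_kg + (flow.material_kg - product),
        )


def run_line(processes: List[UnitProcess], feed_kg: float) -> Flow:
    """Chain the processes so each one consumes the previous one's product."""
    flow = Flow(material_kg=feed_kg)
    for process in processes:
        flow = process.apply(flow)
    return flow


if __name__ == "__main__":
    line = [
        UnitProcess("cutting (hypothetical)", yield_fraction=0.5, energy_kwh_per_kg=1.0),
        UnitProcess("finishing (hypothetical)", yield_fraction=0.5, energy_kwh_per_kg=2.0),
    ]
    print(run_line(line, feed_kg=100.0))
```

Because every process exposes the same `apply` interface, the order or number of processes in the line can be changed without touching the aggregation code, which is the kind of reuse the bullet points above describe.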

In a related work, the authors explored the use of the standard with

three use cases in the pulp and paper industry. The case studies showed the

utility of the draft standard as a guideline for composing data to characterize

manufacturing processes. The data, besides being useful for descriptive

purposes, was used in a simulation model to assess sustainability of the

manufacturing system.¹⁹,²⁰

Scope of the Current Standard and Beyond

Leveraging unit process models is by no means a new idea to

continuous process industries, such as the Chemical Industry. For nearly a

century, mathematical representations of “unit operations,” such as

19 Mani, M., Larborn, J., Johansson, B., Lyons, K., & Morris, KC. (2016). Standard

representations for sustainability characterization of industrial processes. Journal of

Manufacturing Science and Engineering

20 Rebouillat, L., Barletta, I., Johansson, B., Mani, M., Bernstein, W.Z., Morris, K.C. and

Lyons, K.W., 2016. Understanding sustainability data through unit manufacturing

process representations: a case study on stone production. Procedia CIRP, 57, pp.686-

691

filtration, evaporation, humidification and distillation, have been derived

for controlling both small-scale plants and industrial installations.²¹

Considering the longevity of the unit process-based approaches in Chemical

Engineering²², the authors envision its direct relevance to the process

industry and beyond. The hope is that the formal characterization of

UMPs across diverse industries will enhance existing analysis

frameworks, for example by improving the precision of life cycle assessment, a

method that is still burdened with significant uncertainty.²³ ASTM E3012-

16 is designed to be relevant across different production types, including

discrete, batch, and continuous. The standard provides a fundamental

representation to support unit manufacturing processes in all of these

production settings. Characterizing the bounds of each unit manufacturing

process drives insight into each process’s functional characteristics.

The current standard is a first step to facilitate studies of existing

processes and to make those studies more accessible in the future. It can

serve as the basis for the development of production system models to better

understand process flows and interactions between and across different

processes. A repository of UMP models can be used for planning both

retrofits of existing facilities and new facilities. Designs for new facilities are

almost always based on prior experience with operating processes, and

realistic models should prove useful, especially for verification and

validation activities.

The perceived scientific benefits to manufacturers from application

of the standard include reduced operational costs, improved prediction of

product costs, improved schedule, maximization of manufacturing

resources, improved control of product quality, and incorporation of best

practices. Modeling individual manufacturing processes facilitates the

generation of quantifiable evidence that improvements are being made. The

standard provides a uniform and repeatable way for more practitioners to

reap these benefits.

21 Walker, W.H., Lewis, W.K. and McAdams, W.H., 1923. Principles of chemical

engineering. London: McGraw-Hill Publishing Co

22 Turton, R., Bailie, R.C., Whiting, W.B. and Shaeiwitz, J.A., 2008. Analysis, synthesis

and design of chemical processes. Pearson Education.

23 Jacquemin, L., Pontalier, P.Y. and Sablayrolles, C., 2012. Life cycle assessment (LCA)

applied to the process industry: a review. The International Journal of Life Cycle

Assessment, 17(8), pp. 1028-1041

The standard will be of interest to software providers across

industries interested in providing analysis and modeling/simulation

solutions to manufacturers. The standard format promotes information

exchange and communication through digitalization of manufacturing

assets for decision making purposes. Moving forward, with contribution

from industries, future standards can encompass a broader set of processes

and functionalities using ASTM E3012-16 as a platform on which to build.

Further, the creation of a repository of models should reduce modeling time

and improve model verification and validation activities. The creation of a

repository of models also provides a forum for industries to come up with

best practices and target sets of UMP models for common processes as

reference data.²⁴

Future Work and Conclusions

Because the standard is relatively new and was co-developed by supportive

manufacturers, the ASTM task group is now seeking more participation

from across industries, especially SMEs, to demonstrate and further

improve the standards. The standard has already received some attention,

and efforts are underway to spread the word. Much of the vision for the

work will require further research and future standards based on real-world

experience²⁵. UMP-focused industrial case studies are of interest to the task

group. NIST has already hosted two competitions, and will host a third, to

apply the standard to existing process models.²⁶ These resulted in a diverse

set of models and focused attention within the educational world. To realize

the promise of reusing such models and automating analytics and system

integration for manufacturing, significant research challenges remain,

including advancements in the following areas:

• Knowledge and understanding of UMP modeling. This includes

novel formal representations and methodologies, more accurate or

specialized metrics, metric representations that support cascading to

24 W. Z. Bernstein; M. Mani; K. W. Lyons; K.C. Morris; B. Johansson. ‘An Open Web-

Based Repository for Capturing Manufacturing Process Information.’ ASME

International and Design and Engineering Technical Conferences & Computers and

Information in Engineering Conference, Charlotte 2016

25 W.Z. Bernstein et al., 2018. ‘Research directions for an open unit manufacturing

process repository: A collaborative vision,’ Manufacturing Letters, 15 (B), pp.71-75

26 https://www.nist.gov/news-events/events/2018/01/ramp-reusable-abstractions-manufacturing-processes

higher production levels, or exploration of variations for families of

UMP models.

• Standards supporting model reuse. This includes automated

methods that allow linking of UMP models into systems, facilitating

system composition through naming conventions or other methods,

generalization that unifies a collection of processes, or standards-based

methods for integration with applications.

• Techniques for development and validation of UMP models. This

includes demonstration of validation techniques for the effectiveness

and accuracy of the UMP models or techniques for producing useful

derivatives of UMP models or creative methods for mining

documentary model descriptions into formal representations.

As more groups apply the standard in their domains, the shared

experience will provide a basis on which to further understand

standardization needs and opportunities. Formal methods for acquiring and

exchanging information about manufacturing processes will lead to

consistent characterizations and help establish a collection for reusable

models. Standardized methods will ensure effective communication of

computational analytics and sharing of sustainability performance data.

NIST is also looking for manufacturers to collaborate on pre-pilot projects

to contribute to the collection of use cases for the standard. In conclusion,

the use of a reusable standard format should result in models suitable for

automated inclusion in a system analysis, such as a system simulation model

or an optimization program.

Bios

Mahesh Mani is a Senior Technology Adviser with Allegheny Science and

Technology supporting the Advanced Manufacturing Office of the

Department of Energy. His research interests include smart, sustainable and

additive manufacturing.

KC Morris leads a group at the National Institute of Standards and

Technology focused on standards to infuse smart technologies into the

manufacturing sector while ensuring that new practices lead to more

competitive and sustainable manufacturing. Currently, KC is on detail to

the US House of Representatives serving as an ASME Manufacturing

Fellow.

Kevin W. Lyons recently retired from the National Institute of Standards

and Technology. His research interests include sustainable manufacturing,

nano manufacturing, design, process modeling, assembly, virtual assembly,

and additive manufacturing technologies.

William Z. Bernstein is a research engineer at the National Institute of

Standards and Technology. Dr. Bernstein currently leads the Product

Lifecycle Data Exploration and Visualization project. His research interests

include advanced visualization, information modeling, and sustainable

manufacturing.

Generating Domain Terminologies using Root- and

Rule-Based Terms¹

Jacob Collard¹, T. N. Bhat², Eswaran Subrahmanian³,⁴, Ram D. Sriram³,

John T. Elliot², Ursula R. Kattner², Carelyn E. Campbell², Ira Monarch⁴

Independent Consultant, Ithaca, New York¹,

Materials Measurement Laboratory, National Institute of Standards and Technology,

Gaithersburg, MD²,

Information Technology Laboratory, National Institute of Standards and Technology,

Gaithersburg, MD³,

Carnegie Mellon University, Pittsburgh, PA⁴,

Independent Consultant, Pittsburgh, PA⁵

Abstract

Motivated by the need for flexible, intuitive, reusable, and normalized

terminology for guiding search and building ontologies, we present a general

approach for generating sets of such terminologies from natural language

documents. The terms that this approach generates are root- and rule-based terms,

generated by a series of rules designed to be flexible, to evolve, and, perhaps most

important, to protect against ambiguity and standardize semantically similar but

syntactically distinct phrases to a normal form. This approach combines several

linguistic and computational methods that can be automated with the help of

training sets to quickly and consistently extract normalized terms. We discuss how

this can be extended as natural language technologies improve, and how the

strategy applies to common use-cases such as search, document entry and

archiving, and identifying, tracking, and predicting scientific and technological

trends.

1. Introduction

1.1 Terminologies and Semantic Technologies

SERVICES AND APPLICATIONS ON THE WORLD-WIDE WEB, as well as

standards defined by the World Wide Web Consortium (W3C), the primary

standards organization for the web, have been integrating semantic

technologies into the Internet since 2001 (Koivunen and Miller 2001).

These technologies have the goal of improving the interoperability of

¹ Commercial products are identified in this article to adequately specify the material.

This does not imply recommendation or endorsement by the National Institute of

Standards and Technology, nor does it imply the materials identified are necessarily the

best available for the purpose.

Special thanks to Sarala Padi for assistance in compiling and presenting this document.

Winter 2018

applications on the rapidly growing Internet and creating a comprehensive

network of data that goes beyond the unstructured documents that made up

previous generations of the web (Berners-Lee, Hendler, and Lassila 2001;

Feigenbaum et al. 2007; Swartz 2013). These semantic technologies can be

used to protect against ambiguity and reduce semantically similar but

syntactically distinct phrases to normalized forms. Semantically normalized

forms allow users to more easily interact with data and developers to reuse

data across applications. However, most of these technologies rely on data

that have been annotated with semantic information. Data that were not

designed for use on the semantic web most likely do not include this

information and are therefore more difficult to integrate. For example,

scientific research papers are typically text- and graphics-based documents

designed to be read and processed by humans. These documents do not

usually contain semantic markup, meaning that search engines may not be

able to take advantage of such advances in data technologies.

Another major issue in semantic computing is the representation of

domain-specific semantics. Different scientific and academic disciplines, as

well as other spheres of communication such as conversation, business

interactions, and literature, all have overlapping vocabulary. However, the

same words often have different meanings depending on the domain. In the

sciences, each field (and often each subfield) has its own terminology that

is not used in other disciplines, or that conflicts with the language of more

general-purpose communication. For example, in general use, the word

fluid typically includes liquids, but not gases. However, in physics, gas and

liquid are both hyponyms of fluid. Semantic technologies may assume that

two annotations with the same name have the same semantics, when this is

not necessarily the case.

Generally speaking, this issue stems from the problem of

coordination described by Clark and Wilkes-Gibbs (1986). Any participant

in a system of communication is typically missing some of the knowledge

held by other participants. Because of this knowledge gap, participants may

not understand one another unless they are using a shared knowledge system

and a shared vocabulary. Clark and Wilkes-Gibbs (1986) describe how

speakers establish a common ground, collaborating to ensure that

participants know one another’s strengths and limitations. Interactions

involving computers are also systems of communication, and must also

coordinate in order to ensure that all applications are communicating

properly. Establishing common ground across fields is supported by

standardizing terminology and having hyponymic and other semantic

relations structured for use by humans and machines.

Issues of coordination are relevant in many fields, particularly when

it comes to data re-use and interoperability. For example, The Minerals,

Metals, & Materials Society (TMS) describes many gaps and limitations in

current materials science standards; one of these gaps is an “insufficient

number of open data repositories,” referring to repositories containing data

that can be used by many applications, with the stipulation that data not only

be available, but also be re-usable (“Modeling Across Scales” 2015). This

is impossible without some means of coordination and standardization of

terminology. TMS recommends developing initiatives to aid in

coordination, for example by engaging “a multidisciplinary group of

researchers to define terminology and build bridges across disciplines.”

Multidisciplinary coordination is a necessary part of improving data

reusability, but support of such coordination is lacking.

Our goal in this paper is to describe a general system that is capable

of automatically creating standardized terminologies that will be useful for

developing domain ontologies and other data structures to fill this gap. A

domain ontology is a collection of concepts (represented as terms) and

relations between them that correspond to knowledge about a particular

family of topics. Our system will take into account potential issues in

terminology generation, including the disambiguation and normalization of

terms in a domain. In many cases, a single term can be expressed in many

different ways in natural language. For example, in mathematics the phrase

“without loss of generality” has a technical meaning; however, a researcher

might also write “without any loss of generality” or “without losing

generality.” All three variants refer to the same technical phrase and should

be treated together in a standard terminology. Terminologies also face

issues relating to polysemy, syntactic ambiguity, context sensitivity, and

noisy data.

The key component of our system is a representation of terminology

that takes advantage of the compositional nature of natural language

semantics by converting natural language phrases into consistently

structured terms. This representation overcomes issues of syntactic

variation by normalizing different syntactic structures based on their

compositional semantics. That is, we represent synonymous phrases in the

same way despite differences in surface realization. To help ensure

consistent terminology generation, our system uses a set of rules to restrict

and guide the formation of these normalized terms. Because this system is

based on a modular set of rules that combine smaller components of phrases

into standardized terms, we refer to it as a root- and rule-based method

for terminology generation.

Our root- and rule-based approach is motivated more by linguistic

than statistical models. This is necessary for the construction of rule-based

terms, which are dependent on consistent structures and a representation of

the underlying linguistic form. Our rule-based model ultimately relies on

the way that words come together syntactically in order to form phrases.

The meanings of these phrases are, in most cases, related to the meanings

of their component parts (i.e. the individual words). The way that words

compose to form more complex meanings is detailed in research such as

Montague (1988), though the underlying principles ultimately extend back

to Frege (1884). Through an understanding of syntax as modeled in

linguistics, it is possible to formalize the compositionality and therefore

normalize synonymous phrases, despite significant differences in form.

[Figure 1 diagram. Components shown: Key Phrases, Key Phrase Extractor, Tethered Root Generator, Super Root Generator, Term Generator]

Figure 1. Terminology Generation

Within the context of our root- and rule-based system, a term is a

representation of a concept within a domain (and may cover a number of

words and/or phrases); a collection of terms describing the same domain

makes up a terminology. Our system also defines roots, which are smaller

components that come together to make up terms; that is, a single term

is made up of one or more roots. A terminology is distinct from an ontology

in that an ontology additionally defines the relationships between concepts,

though the concepts in an ontology are typically represented by terms of

some sort. A rule is a codified process used to generate, restrict, or

normalize terms in our root- and rule-based approach. This paper will

describe the linguistic and theoretical motivations behind these rules, which

are introduced in Bhat et al. (2015) as specifically applied to materials

science. In Section 2, we describe the theory behind the syntactic

normalization that allows for the automatic construction of root- and rule-

based terms. This is followed by Section 3, where we describe how to use

features of natural language syntax in combination with additional rules to

create root- and rule-based terms. We then describe how terms can be

extracted from natural language texts in Section 4. One of the main features

of the root- and rule-based approach is that it is easily extensible and

adaptable to new and different use-cases, as we discuss in Section 5. Lastly,

we describe how our system ties in with the challenge described above of

creating terminologies that are robust to the complexities of natural

language and to the needs of users in Section 6.

1.2 Use-Cases and Architecture

In developing this strategy, we have considered four very general

use-cases, each of which corresponds with an interface that allows users and

administrators to interact with the terminology generation system. These

will be discussed in greater detail in Section 6. We also discuss ontology

generation as an extension of terminologies generated from the root- and

rule-based method.

• Document Entry: Users should be able to upload documents, have

terms extracted from these documents, make changes to the suggested

terms, and have the terms added to the terminology in the database.

• Document Retrieval: Users should be able to construct a query and

receive a list of documents matching the terms in the query.

• Curation: Curators (who may be dedicated administrators or user

volunteers in crowd-sourced systems) should be able to make changes

to the terminology.

• Rule Changes: Curators should be able to make changes to the way

the system generates terms, as technologies change. Changes also

reflect the way that people are using the terminology and systems that

have become de facto standards.

• Ontology Generation: Users should be able to use a set of terms as

the basis for a domain ontology, which extends a terminology by

providing additional semantic relationships between terms.

With these in mind, we have outlined a root- and rule-based terminology

system in Figure 1. This figure shows the general process which will be

explained in this paper. Through this process, a set of rules is used to

extract a series of key phrases from a corpus and convert them into root-

based terms (three types of roots are described in this paper: roots, tethered

roots, and super-roots; see Section 3).

2. Theory

Generating a terminology requires identifying salient concepts in a

corpus and constructing representations of those concepts. There are many

different approaches to identifying salient concepts, that is, to finding words

and/or phrases in the text that stand out. In many cases, the identification of

salient concepts produces a list of words and phrases taken directly from the

text. A terminology generation system then needs to convert words and

phrases into a format which enhances the potential for humans and

machines to use the terminology for various practical applications. As with

key phrase extraction, researchers and developers have used a variety of

methods to convert words and phrases into terms representative of key

concepts (Witschel 2005).

2.1 Previous Research

Key phrase selection is a major area of study in the field of

information extraction (Witschel 2005). Many methods of key phrase

extraction rely on two components: a unithood metric and a termhood

metric. A unithood metric determines the particular types of words and

phrases that qualify as key phrase candidates. Units found by a unithood

metric may or may not be relevant enough to qualify as key phrases, but can

be used to restrict the set of words that must be compared for relevance. For

example, a unithood metric may identify all noun phrases in a corpus, so

that only noun phrases are considered as potential terms. Not all of the noun

phrases selected will become terms, but only noun phrases will be extracted.

More complex unithood metrics are also possible. Frantzi, Ananiadou, and

Tsujii (1998), for example, consider the following unithood metrics in the

evaluation of their C-Value and NC-Value algorithms, which are algorithms

for key phrase extraction²:

² In this representation Noun, Adj, and Prep are patterns matching parts of speech (noun,

adjective, and preposition, respectively). Parentheses group patterns together, and two

patterns separated by a pipe (|) produce a new pattern which matches either of its

components. The asterisk (*) produces a pattern that matches the previous expression

zero or more times. The plus sign (+) is similar, but matches the previous expression

• Noun+ Noun

• (Adj|Noun)+ Noun

• ((Adj|Noun)+ | ((Adj|Noun)* (Noun Prep)?)(Adj|Noun)*) Noun
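
The three filters listed here can be tried out with ordinary regular expressions once a phrase has been part-of-speech tagged. The following is a minimal sketch, assuming the tagger has already reduced each word to one of the coarse tags Noun, Adj, or Prep; the single-letter encoding, the filter names, and the function `matches_filter` are hypothetical choices made for illustration and are not taken from Frantzi, Ananiadou, and Tsujii (1998).

```python
import re

# Encode each coarse tag as one letter so that a filter becomes an
# ordinary regular expression over a string: N = Noun, A = Adj, P = Prep.
TAG_CODES = {"Noun": "N", "Adj": "A", "Prep": "P"}

# The three filters, in the order in which they are listed in the text.
FILTERS = {
    "noun_sequence": re.compile(r"N+N"),
    "adj_noun_sequence": re.compile(r"[AN]+N"),
    "with_preposition": re.compile(r"([AN]+|[AN]*(NP)?[AN]*)N"),
}


def matches_filter(tags, filter_name):
    """Return True if the whole tag sequence matches the named filter."""
    # Tags outside the three known categories are encoded as "?",
    # which none of the filters accept.
    encoded = "".join(TAG_CODES.get(tag, "?") for tag in tags)
    return FILTERS[filter_name].fullmatch(encoded) is not None


# "tree leaves" is tagged Noun Noun; "leaves of tree" is tagged Noun Prep Noun.
print(matches_filter(["Noun", "Noun"], "noun_sequence"))
print(matches_filter(["Noun", "Prep", "Noun"], "with_preposition"))
```

A candidate that passes a filter is only a unit; whether it becomes a term is decided separately by the termhood metric.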

A typical automatic term recognition algorithm may then identify

which of the selected units are relevant using a termhood metric. A

termhood metric may be based on statistical or linguistic features; Frantzi,

Ananiadou, and Tsujii (1998) use word frequency to identify nested terms

(candidates which occur within other candidates) and the surrounding

context of a term. These are used together with a mathematical formula to

assign a score to each candidate. In this way, they are able to select for

particular types of terms. The features used by Frantzi, Ananiadou, and

Tsujii (1998) are by no means exhaustive; Proux et al. (1998) and

Rindflesch, Hunter, and Aronson (1999), for example, make use of

linguistic information such as part-of-speech tags to improve their termhood

metric.
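
To make the idea of a frequency-based termhood score concrete, the sketch below computes a simplified version of the C-Value score as it is commonly stated for Frantzi, Ananiadou, and Tsujii (1998): the frequency of a candidate is reduced by the average frequency of the longer candidates that contain it, and the result is weighted by the logarithm of the candidate's length. This is an illustration under stated assumptions rather than a reimplementation of their algorithm; the function names and the example frequencies are hypothetical, and the contextual information used by NC-Value is omitted.

```python
import math


def contains(longer, shorter):
    """True if the word tuple `shorter` occurs contiguously inside `longer`."""
    n = len(shorter)
    return any(longer[i:i + n] == shorter for i in range(len(longer) - n + 1))


def c_value(candidate, frequencies):
    """Score one candidate (a tuple of words) against all candidate frequencies."""
    freq = frequencies[candidate]
    # Longer candidates in which this candidate is nested.
    containers = [
        other for other in frequencies
        if len(other) > len(candidate) and contains(other, candidate)
    ]
    if containers:
        # Discount occurrences that are explained by the longer candidates.
        freq -= sum(frequencies[c] for c in containers) / len(containers)
    # The log factor favors longer candidates; note that it is zero for
    # single-word candidates, so this score only ranks multi-word units.
    return math.log2(len(candidate)) * freq


# Made-up frequencies for two candidates, one nested in the other.
frequencies = {
    ("heat", "capacity"): 10,
    ("specific", "heat", "capacity"): 4,
}
for candidate in frequencies:
    print(candidate, round(c_value(candidate, frequencies), 2))
```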

We do not present any new methods for automatic term recognition,

nor do we make any judgment as to the “best” contemporary method.

However, because the model of term generation and normalization that we

describe is dependent on automatic term recognition, we do discuss it

briefly. In theory, our system can be used with any automatic term

recognition algorithm, though multi-word terms, such as those recognized

by the C/NC-Value algorithm (Frantzi, Ananiadou, and Tsujii 1998), are

optimal for the root- and rule-based method as hierarchical relationships can

be inferred from these complex terms.

Once terms have been selected, we generate normalized concepts

rather than using natural language phrases. Natural language phrases have

many disadvantages, including ambiguity and synonymy, as discussed

previously. One method of normalizing concepts is to automatically group

words into clusters, as in Liu et al. (2012). For example, the phrases

“monthly expense,” “personal insurance product,” “core product,”

“voluntary benefit,” and “personal insurance” may all be clustered to form

a single concept representation related to insurance. In this way, word

choice among different authors and contexts is normalized — words whose

appearance is positively correlated are grouped together. The disadvantage

one or more times, while the question mark (?) matches the previous expression zero or

one times.

of clusters is that they are difficult to label — other than appearing as the set

of words in a given cluster, they are not human-readable.

Other approaches to normalization may involve various other

statistical and natural-language processing techniques, such as those of Park, Byrd,

and Boguraev (2002), who use a combination of stop word removal,

lemmatization (normalization of different forms of a word, e.g., people and

person or colors and color), and abbreviation detection. There are various

components of a key phrase that may not be desirable in terminology

generation. Inflectional morphology (grammatical affixes such as -s and -ing, which

provide grammatical information within the context of a natural

language sentence) does not usually differentiate between technical terms.

A terminology should not usually extract both heat capacity and heat

capacities; these are probably both instances of the same term. Certain

function words, including articles such as a, an, and the, may also be

unhelpful. These issues can be dealt with through lemmatization and stop

word removal, both of which are well-known problems with many proposed

solutions in the field of natural language processing (Park, Byrd, and

Boguraev 2002). However, even assuming that we can lemmatize phrases

and remove stop words, there may still be undesirable redundancies in an

automatically generated terminology, as there are many ways to express the

same thing by using different syntactic structures.
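
The limitation described in this paragraph can be seen in a few lines of code. The sketch below is a toy illustration, not the method of Park, Byrd, and Boguraev (2002): the stop word list and the lemma table are small hand-written assumptions standing in for a real stop list and lemmatizer, and `normalize_words` is a hypothetical helper.

```python
# Hand-written stand-ins for a real stop word list and lemmatizer.
STOP_WORDS = {"a", "an", "the", "of", "that", "are"}
LEMMAS = {
    "capacities": "capacity",
    "leaves": "leaf",
    "colors": "color",
    "people": "person",
}


def normalize_words(phrase):
    """Lower-case a phrase, drop stop words, and lemmatize what remains."""
    words = phrase.lower().split()
    return [LEMMAS.get(word, word) for word in words if word not in STOP_WORDS]


# Inflectional variants collapse to the same word list ...
print(normalize_words("heat capacities") == normalize_words("the heat capacity"))

# ... but two phrasings of the same concept still differ in word order.
print(normalize_words("the red tree leaves"))
print(normalize_words("the red leaves of the tree"))
```

The last two calls return the same three words in different orders, which is the redundancy that syntactic normalization is meant to remove.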

Consider, for example, the syntactically distinct phrases the red tree

leaves and the red leaves of the tree. In many contexts, these phrases have

approximately the same meaning: they both refer to leaves which belong

to or grow on a tree and are red in color. When dealing with phrases such as

these, a terminology extraction algorithm should be able to reduce

redundancy by converting both of these terms into a normalized syntactic

pattern. A great deal of theoretical and applied linguistic and computational

research has been done regarding the determination of a sentence or

phrase’s syntactic structure. By applying dependency parsing to

terminology extraction, it becomes possible to normalize the syntactic

structure of phrases. While many other terminology generation algorithms

work with phrases, one of the main benefits of our approach is the ability to

normalize surface-level differences in the structure of these phrases.

2.2 Dependency Grammar

As mentioned above, we use dependency parsing to normalize the

syntactic structure of phrases. A dependency parser is a tool for syntactic

analysis that produces a dependency tree, which is defined in terms of the

relationship between a phrasal head and the rest of the phrase (Tesnière

1959). The phrasal head carries the syntactic category of the phrase — e.g.,

a noun phrase is headed by a noun, a verb phrase by a verb, etc.

Furthermore, a dependency represents some semantic information, as the

head of a phrase is typically specified by the rest of the phrase. For example,

the phrase the tree’s red leaves (a noun phrase) is headed by the word

leaves. The word leaves on its own is quite general; it could refer to the

leaves of any plant whatsoever. However, because of the rest of the phrase,

we know that the leaves in question are red and that they belong to or come

from a tree.

Dependency syntax can be represented using a tree structure such as

that in Figure 2. Modern parsing technologies, such as the Stanford Parser

(Manning et al. 2014) and MaltParser (Hall 2006), are capable of

automatically generating dependency trees from text. With the help of these

tools we can take advantage of the semantic information provided by

syntactic structures and normalize the syntax of phrases to generate terms.

[Figure 2 diagram: dependency tree with head "leaves" and dependents "tree’s" and "red"; "the" depends on "tree’s"]

Figure 2: Dependency representation of the tree’s red leaves

3. Representing Root- and Rule-Based Terms

3.1 Guidelines for Terminology Representation

Because the goal of providing a domain terminology is to produce a

list of formal concepts in the domain, it is necessary that generated terms be

both unambiguous and relevant to the domain. If terms are ambiguous, then

the terminology will be inaccurate. If terms are irrelevant, even if their

semantics are correctly represented, the terminology will not be of any

practical use. We have defined the following criteria to describe useful

terms for domain terminologies, based on rules 1 to 10 in Bhat et al. (2015)

(reproduced in Section 7). These criteria should generally apply to any

terminology generation schema.

• Terms should be human-readable and machine-friendly. All terms

should be based on natural language.

• The same term representation should always identify the same

concept within a terminology; similarly, two terms with different

representations should represent different concepts (see Bhat et al.

(2015); rules 5, 7, and 8).

• The meaning and form of a term should be predictable from smaller

parts. This predictability must be applicable to both humans and

machines (rules 8, 9, and 10).

• The form of a term should be predictable; that is, given a particular

meaning, it should be possible to derive a compositional name for a

term with that meaning (rules 2, 3, and 6).

• Given a term’s compositionality, both humans and machines should

be able to identify semantic relationships.

• Terms should be intuitive enough that both humans and machines can

identify existing semantic relationships between them.

• Terms should be mutable enough that new terms with related

semantics can be generated (rules 3, 4, and 8).

• Only terms representing discriminating concepts should be generated

(rules 2 and 8).

• Terms representing highly specific instances and individuals should

generally not be generated; terms should be reusable in many use-cases

(rule 9).

Generally speaking, terms will represent a hierarchy, with some

concepts being more specific than others. Thus, there is no concrete

definition for “too general” or “too specific” as applied to a terminology;

we are simply looking for terms that are useful at the level of specification

provided by the domain, keeping in mind that terms should be usable for

data representation, sharing, and analysis. The domain terminology should

be representative of the input corpus.

3.2 Normalized Dependency Trees

The term representation that we have developed takes advantage of

common features of natural language to create a human-readable schema

that maintains the stipulations given above (though building more complex

structures, such as ontologies, is also dependent on other components, such

as the term selection strategies discussed in Section 4 and Section 8).

The theoretical framework for our term representation is the

syntactic structure of phrases and dependency grammar, as discussed in

subsection 2.2. Though dependency trees are not easily human-readable,

they can be converted into human-readable forms, due to one important

feature: given a dependency parse, it is not the order of the nodes which

determines the meaning of the phrase — all isomorphic trees have the same

meaning. The hierarchical structure of a dependency tree represents all of

the semantic information that our strategy relies on; the linear order of the

daughter nodes represents only the surface ordering of morphemes and does

not on its own contain any semantic information. That is, the trees shown in

Figure 3 and Figure 4 are semantically very similar, despite differences in

node order. Automatic dependency parsers, such as MaltParser (Hall 2006),

will produce similar trees, though we have filtered function words from

these in order to improve normalization.

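[Figure 3 diagram: tree with head "leaf" and dependents "tree" and "red", in that order]

Figure 3: Collapsed representation of the leaves of the tree that are red

[Figure 4 diagram: tree with head "leaf" and dependents "red" and "tree", in that order]

Figure 4: Collapsed representation of the red leaves of the tree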

Because of this fact we can re-order the nodes in a set of trees such

that all trees follow the same pattern. In this way trees that are different, but

represent the same concept, will generally be normalized to the same

structure. For example, if both Figure 3 and Figure 4 were changed such

that all nodes were exclusively left-branching (i.e., in a form such that the

daughters of a node all appear to the left of the node), they would be

identical. This creates a normalized dependency tree that we use to create

normalized representations of terminology.

Creating these normalized representations from key phrases is a

three-step process: first, a dependency parser creates a dependency tree for

each input phrase. Second, a filter removes all function words and other stop

words such as prepositions from the dependency trees. Third, each

dependency tree is made entirely left-branching.
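
The second and third steps can be sketched on hand-built trees. In the example below the first step is assumed to have been done already by a dependency parser and a lemmatizer, so the input trees are written out by hand and are only plausible parses. The `Node` class, the stop word list, and the use of alphabetical order as the single fixed ordering of dependents are assumptions made for this illustration; the text requires only that every tree be rearranged into the same pattern.

```python
from dataclasses import dataclass, field

STOP_WORDS = {"a", "an", "the", "of", "that", "are"}


@dataclass
class Node:
    word: str
    children: list = field(default_factory=list)


def drop_stop_words(node):
    """Step 2: remove stop-word nodes, re-attaching their dependents upward."""
    kept = []
    for child in node.children:
        cleaned = drop_stop_words(child)
        if cleaned.word in STOP_WORDS:
            kept.extend(cleaned.children)
        else:
            kept.append(cleaned)
    return Node(node.word, kept)


def signature(node):
    """A sortable description of a subtree, used to order sibling nodes."""
    return (node.word, tuple(signature(child) for child in node.children))


def canonical(node):
    """Step 3 (simplified): put the dependents of every node in one fixed order."""
    children = sorted((canonical(child) for child in node.children), key=signature)
    return Node(node.word, children)


def normalize(tree):
    return canonical(drop_stop_words(tree))


# Hand-built stand-ins for parser output (step 1), already lemmatized.
leaves_of_tree_that_are_red = Node("leaf", [
    Node("the"),
    Node("tree", [Node("of"), Node("the")]),
    Node("red", [Node("that"), Node("are")]),
])
red_leaves_of_tree = Node("leaf", [
    Node("the"),
    Node("red"),
    Node("tree", [Node("of"), Node("the")]),
])

print(normalize(leaves_of_tree_that_are_red) == normalize(red_leaves_of_tree))
```

Because the comparison ignores the original order of sibling nodes, the two phrasings end up with the same normalized tree.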

Though these normalized trees are good representations of phrasal

semantics, they are not easily understood by human users of a terminology.

Understanding these trees requires an understanding of dependency

grammar, a potentially non-intuitive concept. Instead, our strategy converts

these normalized trees into human-readable forms using a number of

concrete rules. These new representations are linear and can be stored as

simple strings of characters.

For example, the trees in Figure 3 and Figure 4 can be linearized by

first converting the trees into normalized form (Figure 5). Once we have

normalized the syntax, we can convert the structure into a linear format,

such as RED-TREE_LEAF. This format contains the same information as the

tree structure and is easily interpreted by English speakers and by

computers. For English speakers the linear form corresponds to a standard

English phrase with the same meaning (the red tree leaf). For computers the

underscore (‘_’) indicates that the two final roots (tree and leaf) compose

first, followed by red, which is equivalent to the structural information

shown in the tree. We have not yet discussed exactly how this linear format

is reached from the dependency tree; this, including the semantics of the

hyphen and underscore delimiters, will be explained in the following

section.

[Figure 5 diagram: tree with head "leaf" and dependents "red" and "tree"]

Figure 5: Normalized tree for RED-TREE_LEAF

3.2.2 Roots and Terms

Just as the above structural representations depend on the principle

of compositionality (Frege 1884) and the formulations of compositional

semantics (Montague 1988), the linear term representation that we have

developed facilitates breaking terms apart into smaller components. Unlike

in natural language, we primarily use compounding, combining individual

“words” into larger terms. The individual meaningful components of a term

are called roots, and cannot usually be broken down into further meaningful

parts. A root should correspond to a single meaningful word such as tree,

electron, or computational. Roots can be combined in various ways to create

structured terms, which are semantically complex structures whose

meaning can be easily determined from their component parts. This is based

on rules 1, 2, and 3, as well as the specialized terminology in Bhat et al.

(2015).

Roots can combine in different ways; each method of combination

is represented textually by a unique delimiter, including the underscore

(‘_’), the hyphen (‘-’), and the colon (‘:’). This delimiter notation is

extensible and replaceable (completely different sets of delimiters may be

used); new delimiters can be added to express new relationships between

roots. The hyphen is a general delimiter used for combining roots into terms.

Depending on the use-case of the terminology, the hyphen may have a

slightly different meaning, but it should usually be used to add specificity

to a root through the addition of a second root. For example, the term TREE-LEAF

is composed of the roots TREE and LEAF. The hyphen indicates that the

root LEAF (a very general term) is made more specific through the addition

of the root TREE, which indicates that the term as a whole represents a

specific type of LEAF: namely, the leaf of a tree, rather than the leaf of a

bush or other plant. Forms such as this can be easily derived from syntactic

structure: both the construction of terms from roots and the structure of

syntactic dependency indicate the modifier-head relationship between two

roots. In dependency structures, the dependents of a head are its modifiers,

just as, in a term, a root (the head) is modified by the preceding root.

The interpretation of terms with three or more roots could be

ambiguous. However, we impose a left-branching dependency syntax on all

terms, meaning that the roots in a term compose from left to right. For

example, the term OAK-TREE-LEAF refers to the leaf of an OAK-TREE, and

OAK-TREE refers to a tree of type oak. The first semantic operation is the

composition of OAK and TREE. This is followed by the composition of OAK-

TREE (as a single concept) and LEAF.

Combining roots with more complex structures requires additional

delimiters. For example, as described above, the term RED-TREE-LEAF refers

to the leaf of a red tree, not to the red leaf of a tree — that is, in this term, the

tree is red but the leaf is not. The English phrase “red tree leaf” is ambiguous

in a way that the term RED-TREE_LEAF is not. This is why, if we are trying

to create a term with the interpretation the red leaf of a tree (the dependency

structure in Figure 5), it cannot be represented using hyphens alone. The

second delimiter, the underscore (‘_’) has higher precedence in the

compositional order of operations — composition of roots delimited by

underscores occurs before the composition of roots delimited by hyphens.

The latter has lower precedence in decomposition. Two roots delimited by

an underscore are called “super-roots” and allow for terms that cannot be

expressed solely by hyphens. For example, “the red leaf of a tree” (Figure

5) can be represented as the term RED-TREE_LEAF. TREE_LEAF is a super-root

and thus combines first: the term represents the leaf of a tree. The super-root

then combines with RED, specifying that the TREE_LEAF is of the color

red.
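
The order of composition described for hyphens and underscores can be made explicit with a small parser. The sketch below is illustrative only: the nested `(modifier, head)` tuples and the function names are assumptions chosen for the example, and terms containing other delimiters are not handled.

```python
from functools import reduce


def compose(parts):
    """Fold parts from left to right into nested (modifier, head) pairs."""
    return reduce(lambda modifier, head: (modifier, head), parts)


def parse_term(term):
    """Recover the composition order of a term built with '-' and '_'."""
    # Underscores bind first, so each hyphen-separated chunk is composed
    # into a super-root before the chunks themselves are combined.
    super_roots = [compose(chunk.split("_")) for chunk in term.split("-")]
    return compose(super_roots)


print(parse_term("OAK-TREE-LEAF"))  # OAK and TREE compose first, then LEAF
print(parse_term("RED-TREE-LEAF"))  # the leaf of a red tree
print(parse_term("RED-TREE_LEAF"))  # the red leaf of a tree
```

In each pair the second element is the head, so RED-TREE-LEAF and RED-TREE_LEAF receive different structures even though they contain the same roots in the same order.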

We define two additional methods for combining roots, though more

methods can be added based on use-case (see Figure 5). The first is to

combine the roots without any delimiter at all, referred to as creating a

“tethered root.” For example, if TREE and LEAF were to be combined into a

tethered root, they would become TREELEAF. The purpose of tethered roots

is to create roots that are composed of multiple words in English, but which

have meaning only as a whole phrase. If the components of a tethered root

are represented as roots in a term, the meaning of the term will not follow

from its component parts. Set phrases such as “gray area” (referring to a

situation that does not easily fit into preexisting categories) are strong

candidates for tethered roots. More generally, tethered roots are useful when

the component parts are not usable the same way in other terms. For

example, if GRAY-AREA is treated as a term, there should be other terms of

the form GRAY-X where “gray” has the same meaning as in GRAY-AREA.

However, because this is not possible, it is preferable to create a tethered

root: GRAYAREA.

The last delimiter we describe in this paper is the colon (‘:’). Two or

more terms can be combined into compounded terms with this delimiter.

A compounded term represents a high-level semantic cluster, though the

usage of these terms is dependent on the use-case. While most terms

represent concepts, compounded terms can also represent relationships

between those concepts. For example, because APPLE-TREE and

RASPBERRY-PLANT both refer to plants that bear fruit, the compounded term

APPLE-TREE:RASPBERRY-PLANT could represent this fact about the two

component terms. The exact type of relationship is not specified by the

compounded term, which describes only a very general semantic connection

between its components.

All of these combinations can be generated automatically from a list

of key phrases using dependency parsing and a training set. Roots are

usually equivalent to nodes in a dependency structure. Because of this, they

can be combined into super-roots and into terms by examining an

automatically generated dependency tree, removing unimportant words,

and performing automatic lemmatization. Creating tethered roots and

compounded terms cannot be done with dependency trees alone, and

requires the use of a training set. Tethered roots are formed when splitting

the term does not provide any reusable information. This can be measured

using statistics such as term frequency-inverse document frequency (a

measure of a word's importance in a document relative to its overall

frequency) (Wu et al. 2008). Compounded terms can be identified using

measurements of co-occurrence frequency, which identify semantic

relationships between terms (Kostoff 1993). Because terms are

unambiguous, and different relationships between roots are represented by

different delimiters, a machine can also easily break down a term into its

component parts, just as it can build up a term based on the relationships

between the components (Table 1).

Table 1. Summary of term syntax

Root Type (delimiter)    Description                   Example
Term (-)                 Composite Concept             OAK-TREE
Root                     Single Concept                TREE
Super-Root (_)           High-precedence Composite     TREE_LEAF
Tethered Root            Multi-word Single Concept     GRAYAREA
Compounded Term (:)      Two related terms             APPLE-TREE:RASPBERRY-PLANT
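The decomposition summarized in Table 1 can be illustrated with a minimal sketch. The function names `parse_term` and `build_term` are hypothetical and are not part of any reference implementation; the sketch assumes only the three delimiters described above (':', '-', and '_') and treats a tethered root as a single undivided root.

```python
def parse_term(term: str) -> list[list[list[str]]]:
    """Split a structured term into its parts.

    The result has one entry per ':'-separated term; each entry is a list
    of '-'-separated components; each component is a list of roots joined
    by '_' (a plain root is a one-element list).
    """
    return [
        [component.split("_") for component in single.split("-")]
        for single in term.split(":")
    ]


def build_term(parsed: list[list[list[str]]]) -> str:
    """Inverse of parse_term: rebuild the term string from its parts."""
    return ":".join(
        "-".join("_".join(roots) for roots in components)
        for components in parsed
    )


print(parse_term("RED-TREE_LEAF"))   # [[['RED'], ['TREE', 'LEAF']]]
print(parse_term("GRAYAREA"))        # [[['GRAYAREA']]]
```

Because each delimiter has exactly one role, splitting and rejoining are exact inverses of each other in this sketch.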

4. Key Phrase Extraction for Root- and Rule-Based Terms

In Sections 2 and 3 we have discussed how to generate structured

terms using a root- and rule-based approach taking advantage of syntactic,

semantic, and statistical cues. Though structured terms are useful

representations of concepts in a terminology, and though natural language

processing and other statistical tools can convert key natural language

phrases into structured terms, we have not yet discussed how these key

phrases are selected. It is possible, of course, to manually select key phrases

to be used in a terminology. In some cases, this is unavoidable, as there will

always be disagreement as to what constitutes an important term within a

domain, but it is helpful if at least some of the work can be done with an

automated system. The automated system may be helpful in providing an

empirical basis for coming to agreement.

The study of automatic terminology extraction is a major area in the

field of information extraction (Witschel 2005). Most methods of

terminology extraction rely on two components: a unithood metric and a

termhood metric. A unithood metric determines the particular types of

words and phrases that make potential candidates for terms. A unit is not

necessarily the final representation of a term, nor are all units relevant

enough to be treated as terms. For example, a unithood metric might

consider all noun phrases (such as “the red leaf of a tree”) in a corpus to be

valid term candidates. The task of a termhood metric is to determine which

of the candidate terms are important enough within a document to be a part

of the terminology. Together, a unithood metric and a termhood metric can

extract all of the salient words and/or phrases from a document.

Most terminology extraction methods combine statistical and formal

methods. Unithood metrics are often based partially on linguistic features

such as part of speech. For example, it is uncommon to include isolated

prepositional phrases in a unithood metric (though prepositional phrases

that are included in other phrasal categories may be included). Termhood

metrics usually analyze the frequency of a term in a document with respect

to its frequency in a collection of documents in order to determine the extent

to which the term represents the content of the document. However,

termhood metrics can also take into account linguistic features; for example,

nouns with Greek or Latin endings such as -itis or -scopy may be more likely

to be technical terms in certain domains (Witschel 2005).
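As an illustration of a frequency-based termhood metric, the sketch below scores a candidate phrase by its frequency in one document weighted against the number of documents that contain it. This is a plain TF-IDF variant chosen for brevity, not the metric of any particular extraction algorithm; the function name `termhood` and the representation of each document as a list of candidate phrases are assumptions.

```python
import math
from collections import Counter


def termhood(phrase: str, document: list[str], corpus: list[list[str]]) -> float:
    """TF-IDF score of `phrase` in `document` relative to `corpus`.

    Each document is assumed to be a list of candidate phrases that a
    unithood metric has already selected.
    """
    if not document:
        return 0.0
    tf = Counter(document)[phrase] / len(document)
    containing = sum(1 for doc in corpus if phrase in doc)
    if containing == 0:
        return 0.0
    idf = math.log(len(corpus) / containing)
    return tf * idf


corpus = [
    ["elastic constant", "elastic constant", "value"],
    ["value", "range"],
    ["value", "observation"],
]
# "value" occurs in every document, so it scores zero; "elastic constant"
# is concentrated in the first document and scores higher.
print(termhood("elastic constant", corpus[0], corpus))
print(termhood("value", corpus[0], corpus))
```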

The proposed methods of syntactic analysis described in Section 2

can take as input a series of phrases extracted from a document or corpus

and convert them into structured terms. However, this algorithm is sensitive

to the terminology extraction techniques used, as it is dependent on the

interface between syntax and semantics. If the input phrases are too short to

provide semantic clarity, the output terminology will be too general. If the

input phrases are too long (with respect to the number of words), the output

terminology will be too specific. Many terminology extraction algorithms

only extract single morphemes or words — that is, they output terms such as

“solar” or “photovoltaic” (Witschel 2005). However, our proposed system

prefers units with two to five content words, as longer or shorter terms will

tend to be either too general or too specific for most use-cases. Longer terms

are more specific, and will often introduce nuances that are not necessary in

domain terminologies.
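The preference for units of two to five content words could be applied as a simple filter over candidate phrases. In the sketch below, the stopword list and the names `content_words` and `is_preferred_unit` are illustrative assumptions; a real system would more likely use part-of-speech information to decide which words carry content.

```python
# A small, hypothetical list of function words; not exhaustive.
STOPWORDS = frozenset(
    {"the", "a", "an", "of", "in", "to", "and", "with", "at", "from"}
)


def content_words(phrase: str) -> list[str]:
    """Return the words of `phrase` that are not in the stopword list."""
    return [word for word in phrase.lower().split() if word not in STOPWORDS]


def is_preferred_unit(phrase: str, low: int = 2, high: int = 5) -> bool:
    """True if the phrase has between `low` and `high` content words."""
    return low <= len(content_words(phrase)) <= high


print(is_preferred_unit("the red leaf of a tree"))  # True (red, leaf, tree)
print(is_preferred_unit("solar"))                   # False (one content word)
```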

Bearing these restrictions in mind, there are still terminology

extraction algorithms that cater to the needs of structured terminologies.

Methods such as the NC-Value Algorithm (Frantzi, Ananiadou, and Tsujii
1998) are designed for extracting multi-word terms and can

easily be extended to favor two- to five-word phrases in order to generate

the most effective structured terms.

Though our proposed system is sensitive to terminology extraction,

the exact algorithm used is an implementation detail that can be changed

easily, as discussed in Section 5. Different use cases may choose different

terminology extraction algorithms, depending on their needs. The root- and

rule-based approach that we describe is not specific to any terminology

extraction algorithm, and the exact method can be customized according to

the use case.

The root- and rule-based approach we propose also provides a

significant advantage for terminology extraction. Because root-based terms

can easily be broken down into their component pieces, it is possible to

compare two terms and find similarities between them. Because of this, it

is possible to use previously generated normalized terms as hints for term

selection. For example, given that the term RED-TREE_LEAF is salient in a
corpus, the terms RED-TREE_BRANCH, GREEN-TREE_LEAF, and
RED-BUSH_LEAF are probably salient as well, as they share much of the same

information.
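One simple way to use existing terms as hints is to test whether a candidate has the same delimiter structure as a known salient term and differs from it in exactly one root. The sketch below is only one possible similarity test; the function names are hypothetical.

```python
import re


def roots(term: str) -> list[str]:
    """Split a term into its roots, ignoring which delimiter joins them."""
    return re.split(r"[-_]", term)


def delimiters(term: str) -> str:
    """Return only the sequence of '-' and '_' delimiters in the term."""
    return re.sub(r"[^-_]", "", term)


def differs_by_one_root(a: str, b: str) -> bool:
    """True if a and b share delimiter structure and differ in one root."""
    if delimiters(a) != delimiters(b):
        return False
    return sum(x != y for x, y in zip(roots(a), roots(b))) == 1


known = "RED-TREE_LEAF"
for candidate in ["RED-TREE_BRANCH", "GREEN-TREE_LEAF", "RED-BUSH_LEAF"]:
    print(candidate, differs_by_one_root(known, candidate))  # True each time
```

A candidate that passes this test is not guaranteed to be salient; the test only flags it for closer consideration.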

5. Extensibility

The previous sections describe the various methods that go into

terminology generation. The major components are salient phrase

extraction (Section 4) and converting key phrases to structured terms

(Section 2 and Section 3). However, using this model in a complete system

is more complex.

One of the primary benefits of this root- and rule-based approach is

the compositional form of terms. Based on this approach, it is possible to

build an extensible and modular system that can be adjusted to suit different

needs. In this section we describe how the system as a whole can be

configured through different modules and extensions, and how this is

enabled by the rule-based model.

Figure 1 shows the various processes involved in generating terms

from a corpus of documents. These processes interact with three different

types of data: the corpus, the set of key phrases, and the terminology (stored

in the database). The corpus may be either a large set of documents used to

initialize the terminology, or a smaller set of documents introduced through

a user interface, as discussed in Section 6. The set of key phrases are the

salient phrases extracted from the corpus; key phrases are natural language

phrases that have not yet been processed by the structuring methods

described in Sections 2 and 3. The terminology consists of any pre-generated
terms, which can be used as a training set. During the term

generation process, new terms are added to the existing terminology.

Working with these data requires many different tools and sub-processes:
key phrase extraction, tethered root generation, super root
generation, lemmatization, and term generation. A key feature of the design

is that these tools are not necessarily co-dependent, and can thus easily be

substituted depending on the needs of users or on advancements in the

technologies that constitute each component.
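The substitutability of components can be sketched as a pipeline whose stages are supplied as interchangeable functions. The class `TermPipeline` and the toy stages below are hypothetical stand-ins, not the reference implementation; tethered root and super root generation are omitted for brevity, and the toy lemmatizer merely strips a trailing "s".

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TermPipeline:
    """A pipeline whose stages can each be replaced independently."""

    extract_key_phrases: Callable[[str], list[str]]
    lemmatize: Callable[[str], str]
    generate_term: Callable[[list[str]], str]

    def run(self, document: str) -> list[str]:
        terms = []
        for phrase in self.extract_key_phrases(document):
            lemmas = [self.lemmatize(word) for word in phrase.split()]
            terms.append(self.generate_term(lemmas))
        return terms


# Toy stages, for illustration only: phrases are separated by ';',
# lemmatization strips trailing "s", and roots are joined with '-'.
pipeline = TermPipeline(
    extract_key_phrases=lambda doc: [p.strip() for p in doc.split(";") if p.strip()],
    lemmatize=lambda word: word.lower().rstrip("s"),
    generate_term=lambda lemmas: "-".join(lemma.upper() for lemma in lemmas),
)

print(pipeline.run("oak trees; copper crystals"))
```

Swapping in a different extractor or lemmatizer requires changing only the corresponding argument, which is the property the design aims for.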

5.1 Key Phrase Extractor

The key phrase extractor has the task of extracting salient phrases

from the corpus. As discussed in Section 4, there are many different

algorithms that can handle this task. As such, it is possible to entirely replace

the key phrase extractor that is used in a root- and rule-based terminology

generation system.

The key phrase extractor may also take advantage of any terms

already in the term database by treating them as a training set. Different

terminology extraction algorithms may use this training data in different

ways. For example, some applications may only wish to include very close

matches with preexisting terms, while others may choose to be more liberal

with key phrase extraction. This allows different use-cases to use the

preexisting terms as appropriate.

5.2 Tethered Root Generator

Tethered roots (see Section 3) may be generated based on two major

components of the terminology generation system: the set of key phrases,

and the set of preexisting terms. A tethered root generator may use statistical

models to determine the information content of a given root relative to the

set of all key phrases, or it may use other tethered roots in preexisting terms

as cues to generate tethered roots from the current corpus. Again, this

component can be customized according to the use-case. One possibility is

using Shannon entropy (Shannon 1948) to identify sequences that add less

information to a dataset when split up into multiple roots than they would if

a tethered root were used instead. For example, if OAK-TREE-LEAF provides

less information than OAKTREE-LEAF, then OAKTREE-LEAF could be used

instead. Ideally, a tethered root generator will consider not only how much

information is contained in each variant, but also whether the information

is misleading or inconsistent.
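One possible reading of the entropy-based approach is sketched below: compute the Shannon entropy of the roots that follow a given root across all key phrases. If a root is always followed by the same root, the entropy is zero, which suggests that splitting the pair yields no reusable information and that the pair is a candidate for a tethered root. The function name `successor_entropy` and the choice of this particular statistic are assumptions made for illustration.

```python
import math
from collections import Counter


def successor_entropy(first: str, phrases: list[list[str]]) -> float:
    """Shannon entropy (in bits) of the roots that follow `first`."""
    followers = Counter(
        phrase[i + 1]
        for phrase in phrases
        for i in range(len(phrase) - 1)
        if phrase[i] == first
    )
    total = sum(followers.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in followers.values())


phrases = [
    ["gray", "area"], ["gray", "area"],
    ["red", "leaf"], ["red", "tree"], ["red", "leaf"], ["red", "bush"],
]
print(successor_entropy("gray", phrases))  # zero: "gray" is only ever followed by "area"
print(successor_entropy("red", phrases))   # 1.5 bits: "red" combines with several roots
```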

5.3 Super Root Generator

Super root generation requires much of the same information as

tethered root generation, except that roots are more likely to make sense

when considered separately. Super root generation may also take advantage

of a dependency parser in order to determine super roots based on syntactic

structure.

5.4 Lemmatizer

To avoid creating unnecessary terms, roots are lemmatized so that
differences in grammatical form are not codified. There are many

different ways that words can be lemmatized, and lemmatization is a non-

trivial task in computational linguistics (Sharma 2010). One common

method is to use lexical databases such as WordNet (Fellbaum 1998), as in

our forthcoming reference implementation of root- and rule-based

terminology generation.

5.5 Term Generator

A term generator combines the roots that make up each key phrase

into a structured term based on the results of a dependency parser and the

methods described in Section 2 and Section 3. The rules in the rule-based

system we describe are not static and can be changed by users,

administrators, or developers when needed to improve the system’s

performance at its given task.

Altogether, these tools and sub-processes come together to form a

model of terminology generation that is customizable, takes advantage of

both linguistic and statistical facts, and is at the same time both machine-

and user-accessible.

5.6 Example

An example of the entire terminology generation process is shown

below, to illustrate how these components come together. This example

begins with the following document, taken from Overton Jr and Gaffney

(1955) (https://materialsdata.nist.gov/dspace/xmlui/handle/11256/79),

from which terms will be extracted.

The ultrasonic pulse technique has been used in conjunction with a

specially devised cryogenic technique to measure the velocities of

10-Mc/sec acoustic waves in copper single crystals in the range from

4.2K to 300K. The values and the temperature variations of the

elastic constants have been determined. The room temperature

elastic constants were found to agree well with those of other

experimental works. Fuchs’ theoretical c44 at 0K is 10 percent
larger than our observed value but his theoretical c11, c12, K and
(c11-c12) agree well with the observations. The isotropy,
(c11-c12)/2c44, was observed to remain practically constant from 4.2K to

180K, then to diminish gradually at higher temperatures. Some

general features of the temperature variations of elastic constants are

discussed.

A key phrase extractor then determines the most salient phrases in the

corpus and extracts them. Key phrases are shown in bold and underlined.

The ultrasonic pulse technique has been used in conjunction with

a specially devised cryogenic technique to measure the velocities of

10-Mc/sec acoustic waves in copper single crystals in the range

from 4.2K to 300K. The values and the temperature variations of

the elastic constants have been determined. The room temperature

elastic constants were found to agree well with those of other

experimental works. Fuchs’ theoretical c44 at 0K is 10 percent
larger than our observed value but his theoretical c11, c12, K and
(c11-c12) agree well with the observations. The isotropy,
(c11-c12)/2c44, was observed to remain practically constant from 4.2K to

180K, then to diminish gradually at higher temperatures. Some

general features of the temperature variations of elastic constants are

discussed.

Once these key phrases have been extracted, they need to be

converted into normalized root- and rule-based terms. This involves

parsing, lemmatizing, and structuring the phrases according to the methods

discussed in Sections 2 and 3.

For example, the phrase temperature variations of the elastic
constants should be parsed and converted into the following dependency
tree (Figure 6):

variations
    temperature
    of
        constants
            the
            elastic

Figure 6: Dependency representation of temperature variations of the elastic constants

The words are then lemmatized and the syntactic structure

normalized such that the tree is entirely left-branching. At this time

unimportant function words such as of are also removed. This results in
Figure 7.

Figure 7: Normalized representation of temperature variations of the elastic constants

Based on this structure, the system should generate the term
ELASTIC-CONSTANT-TEMPERATURE_VARIATION. Because variation has two
branches in Figure 6, one of them must be used to generate a super-root in
order to preserve unambiguity. This yields TEMPERATURE_VARIATION,
which can then compose normally to create the complete term
ELASTIC-CONSTANT-TEMPERATURE_VARIATION. The key phrases can all be put
through this same process, yielding ULTRASONIC-PULSE-TECHNIQUE,
COPPER-SINGLE_CRYSTAL, ELASTIC-CONSTANT-TEMPERATURE_VARIATION,
and ROOM-TEMPERATURE-ELASTIC_CONSTANT, which are inserted into the
database as valid terms. Some discrepancies may result from this strategy;
for example, the above terms include both the structures ELASTIC-CONSTANT
and ELASTIC_CONSTANT. Such discrepancies are resolved through the use of
a training set or manual curation. For example, if ELASTIC_CONSTANT is
found in the training set, the system can resolve this conflict by preferring that form.
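The conversion of the dependency tree in Figure 6 into a structured term can be sketched as follows. This is a simplified illustration under several assumptions: the `Node` class, the small function-word and lemma tables, and the rule that the first listed modifier forms the super-root with its head are all hypothetical choices, and a modifier that is itself composite would need additional handling.

```python
from dataclasses import dataclass, field

# Hypothetical lookup tables; a real system would use a lemmatizer and
# part-of-speech information instead.
FUNCTION_WORDS = {"of", "the", "a", "an", "in"}
LEMMAS = {"variations": "variation", "constants": "constant"}


@dataclass
class Node:
    word: str
    children: list = field(default_factory=list)


def content_children(node: Node) -> list:
    """Children of `node`, with function words removed and their own
    content-word descendants promoted in their place."""
    result = []
    for child in node.children:
        if child.word.lower() in FUNCTION_WORDS:
            result.extend(content_children(child))
        else:
            result.append(child)
    return result


def to_term(node: Node) -> str:
    """Render a dependency subtree as a head-final structured term."""
    head = LEMMAS.get(node.word.lower(), node.word.lower()).upper()
    modifiers = [to_term(child) for child in content_children(node)]
    if not modifiers:
        return head
    if len(modifiers) == 1:
        return f"{modifiers[0]}-{head}"
    # With two or more branches, the first modifier forms a super-root
    # with the head; the remaining modifiers are prefixed with '-'.
    closest, rest = modifiers[0], modifiers[1:]
    return "-".join(rest + [f"{closest}_{head}"])


tree = Node("variations", [
    Node("temperature"),
    Node("of", [Node("constants", [Node("the"), Node("elastic")])]),
])
print(to_term(tree))  # ELASTIC-CONSTANT-TEMPERATURE_VARIATION
```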

5.7 Performance and Usage of Root- and Rule-Based Terms

The previous sections demonstrate how root- and rule-based terms

can be constructed using phrases extracted from natural language texts.

However, we have yet to analyze the performance of root- and rule-based

terms as data structures. One of the major advantages of these structures is

that they are capable of being constructed automatically using linguistic and

statistical methods, but the structure of the terms themselves provides

additional performance and usability gains for many tasks.

The example given in Section 5.6 shows how a small set of root- and

rule-based terms can be generated from a single document. If this process

is repeated over a larger sample of documents, the result is a large

terminology representing concepts in a particular domain. In order to

analyze this terminology, we examine the data that are represented by each

term.

To begin with, root- and rule-based terms are typically human-readable
and understandable. The terms COPPER-SINGLE_CRYSTAL and
ULTRASONIC-PULSE-TECHNIQUE are fairly simple to understand, assuming
that the component parts are understood. Even without knowing what
ultrasonic means, it is still possible to gather that
ULTRASONIC-PULSE-TECHNIQUE refers to a technique involving ultrasonic pulses. To a human,
root- and rule-based terms may not always seem unambiguous:
ELASTIC_CONSTANT-TEMPERATURE_VARIATION may initially be perceived
as referring to something of constant temperature rather than to the
temperature variations of the elastic constant. However, the correct
interpretation is typically apparent in English-language terms, as English
tends to follow a head-final structure³ (which is imposed on all root- and
rule-based terms).

From a computational perspective, root- and rule-based terms are

syntactically unambiguous. The term ELASTIC_CONSTANT-

TEMPERATURE_VARIATION refers specifically to the temperature variation

of the elastic constant; it cannot refer to elastic variations of constant

temperature or to any other alternative referents. This has several

implications for root- and rule-based terms. Firstly, two root- and rule-based
terms which contain the same roots but have different structures can coexist.
The term ROOM-TEMPERATURE-ELASTIC_CONSTANT is distinct from the
term ROOM-TEMPERATURE-ELASTIC-CONSTANT: the former refers to the
elastic constant at room temperature, while the latter refers to a constant
relating to room temperature elastic (e.g. elastic material held at room
temperature). Because these two terms have distinct meanings and distinct
structures, both can be represented if necessary, and both can be generated
from natural language. Secondly, the structure of root- and rule-based terms
implies a larger semantic structure.

The two terms ROOM-TEMPERATURE-ELASTIC_CONSTANT and
ABSOLUTEZERO-ELASTIC_CONSTANT both contain the super-root
ELASTIC_CONSTANT and both refer to types of elastic constant, i.e. the
elastic constants at room temperature and the elastic constants at absolute
zero. However, other terms, such as
ELASTIC_CONSTANT-TEMPERATURE_VARIATION, also contain ELASTIC_CONSTANT but do not refer
to types of elastic constant. Instead, they refer to a type of temperature
variation. This can be determined from the structure of root- and rule-based
terms, allowing a computer to generate term hierarchies and semantic maps.
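The distinction drawn above can be checked mechanically, because the head of a term is always its final '-'-separated component. The helper names `head_of` and `is_type_of` in the following sketch are hypothetical.

```python
def head_of(term: str) -> str:
    """Return the final '-'-separated component, the head of the term."""
    return term.split("-")[-1]


def is_type_of(term: str, general: str) -> bool:
    """True if `term` is `general` with one or more modifiers prefixed."""
    return term != general and term.endswith("-" + general)


print(head_of("ROOM-TEMPERATURE-ELASTIC_CONSTANT"))
# ELASTIC_CONSTANT
print(head_of("ELASTIC_CONSTANT-TEMPERATURE_VARIATION"))
# TEMPERATURE_VARIATION
print(is_type_of("ABSOLUTEZERO-ELASTIC_CONSTANT", "ELASTIC_CONSTANT"))
# True
print(is_type_of("ELASTIC_CONSTANT-TEMPERATURE_VARIATION", "ELASTIC_CONSTANT"))
# False
```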

The structural and hierarchical nature of root- and rule-based terms
also allows for more powerful searches and analyses of data. Documents can
be indexed not just by the words that occur in them, but by the salient concepts
they discuss. This allows a document describing
ROOM-TEMPERATURE-ELASTIC_CONSTANT to be distinguished from one describing
ROOM-TEMPERATURE-ELASTIC-CONSTANT: even though the roots that make up
these terms are the same, the two documents are describing different
concepts. Thus, a user searching in a collection of documents can specify
whether they are hoping to find documents on
ROOM-TEMPERATURE-ELASTIC_CONSTANT or on ROOM-TEMPERATURE-ELASTIC-CONSTANT.

3 In a head-final structure, the head phrase follows its dependents. For example, the noun

in a noun phrase follows the adjectives which modify it.

Furthermore, because ROOM-TEMPERATURE-ELASTIC_CONSTANT is more

specific than just ELASTIC_CONSTANT, a user may also be able to search for

more general concepts as well. Similarly, given a general term such as
ELASTIC_CONSTANT, a system can determine more specific related terms
such as ROOM-TEMPERATURE-ELASTIC_CONSTANT and
ABSOLUTEZERO-ELASTIC_CONSTANT.

6. Applying Root- and Rule-Based Terminologies

In previous sections we have discussed how we propose to build

domain terminologies using a root- and rule-based approach. We have

described how an algorithm can convert phrases of natural language into

structured terms, how key phrases can be extracted, and how this system

can be extended and modularized. In this section, we discuss why the root-

and rule-based approach we have proposed facilitates the creation of useful

terminologies.

It is non-trivial to measure the correctness of a domain terminology.

Standard metrics such as precision (the percentage of the output answers

that are desirable) and recall (the percentage of all desired answers that are

actually contained in the output) may not accurately assess problems

without definite answers. Terminologies are only desirable insofar as they

represent sets of useful concepts relating to a domain; different uses may

lead to different notions of desirability. A terminology does not represent

every possible concept used in a domain — instead, it represents only those

concepts that are of appropriate specificity for practical purposes,

depending on the use case.
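For reference, precision and recall as defined above can be computed as in the following sketch, which assumes that a reference set of desired terms is available. As the text notes, such a reference set is itself use-dependent, so the resulting figures should be read with that caveat. The function name is hypothetical.

```python
def precision_recall(extracted: set[str], reference: set[str]) -> tuple[float, float]:
    """Return (precision, recall) of `extracted` against `reference`."""
    correct = len(extracted & reference)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


extracted = {"OAK-TREE", "RED-TREE_LEAF", "GRAYAREA", "TREE"}
reference = {"OAK-TREE", "RED-TREE_LEAF", "APPLE-TREE"}
precision, recall = precision_recall(extracted, reference)
print(precision)  # 0.5: two of the four extracted terms are desired
print(recall)     # two of the three desired terms were extracted
```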

This notion of practical validity does not have an objective definition

that can be directly measured. Instead, the use determines the validity of

terminology in a particular context. For example, a terminology that is based

entirely on taxonomy (a scheme of hierarchical categorization) may be

useful for some tasks, but for others, it may be desirable to represent

information about the properties of terms using a different scheme of

classification. In some cases, for example, it may be useful to know that RED

is a type of COLOR, while in others it will instead be useful to know that a

BALLOON has the property COLOR with value RED. The root- and rule-based
approach can be adapted to support new ways of classification.

More generally, terminologies need to support many types of
interaction with users on demand. For example, in addition to serving end-users,
a terminology needs to be easily maintained by administrators. The

implementation of a terminology may be very efficient on the user end,

obtaining the results of search queries very quickly, but be inefficient on the

administration end. This is often a detail of implementation, but can be

relevant to the formulation of the terminology model.

The root- and rule-based model that we have described is designed

to meet the needs of a variety of general use-cases and be configurable

enough to meet the needs of more specific situations. We have considered

four general use-cases in our proposal: document entry, document retrieval,

curation, and rule changes. These use cases assume a centralized database

containing the core terminology. The relationships between these use cases

and the data are shown in Figure 8. The terminology generator shown in Figure 8
is the same system shown in Figure 1. The desirable components
of a system such as this are described in Section 1.2.

Figure 8. A strategy for use-cases to create and manage root- and rule-based terminologies
(Components shown: User, Curators, Document Entry Interface, Document Retrieval Interface, Curation Interface, Rule Change Interface, Term Database, Term Generator, Initial Corpus.)

6.1 Document Entry Interface

We have proposed that a document entry interface should provide

functionality that allows a user to upload a document to the system. The

system should then determine the terms in the document that match pre-

existing terms, extract any new terms from the document, and allow the user

to edit these terms.

We propose that a user should be able to edit the results of document

entry by removing those terms that do not apply to the uploaded document,

or by adding new terms that the algorithm did not find.

A document entry interface is dependent on the ability to quickly

identify terms in a single document. That is, an automated system that

allows for single document entry must be robust to corpora containing few

documents. The terminology must be able to be built one document at a

time, with no dependency on large corpora. Our proposed root- and rule-based
system would take advantage of redundant methods of terminology

extraction in order to work with different-sized corpora. In addition to

extracting terms with a more general-purpose terminology extraction

algorithm (see Section 4), our proposal also takes advantage of the

structured nature of terms to find terms in new documents that are

structurally similar to previous terms. This often allows the root- and rule-

based method to identify useful new terms even without statistical evidence.

6.2 Document Retrieval Interface

In our proposed document search and retrieval interface, a user

should be able to enter one or more terms and receive a listing of documents

containing the given terms. In the simplest case the user simply inputs a list

of terms, and the system locates all of the documents in the database

containing the given terms. More complex search systems are also possible,

allowing for additional refinement of search criteria.

Structured terms improve the potential of search systems. The

compositional nature of terms in this model means that users can make

semantic searches, rather than simply searching for the presence of a
collection of unrelated words. That is, instead of searching for a document that
contains the words “red”, “tree”, and “leaf”, a user can easily identify the
exact concept in question and search for RED-TREE_LEAF. This allows the

user to make much more succinct and semantically rich search queries,

producing narrower and more relevant result sets.
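A search of this kind can be sketched with an index that maps each structured term to the documents containing it. The class `TermIndex`, its method names, and the `include_specific` option are hypothetical; the option uses the head-final structure of terms to also retrieve documents indexed under more specific variants of the query term.

```python
from collections import defaultdict


class TermIndex:
    """Maps structured terms to the identifiers of documents that use them."""

    def __init__(self) -> None:
        self._docs_by_term: dict[str, set[str]] = defaultdict(set)

    def add(self, doc_id: str, terms: list[str]) -> None:
        for term in terms:
            self._docs_by_term[term].add(doc_id)

    def search(self, term: str, include_specific: bool = False) -> set[str]:
        """Return documents indexed under `term`; optionally also under
        more specific terms formed by prefixing modifiers to `term`."""
        hits = set(self._docs_by_term.get(term, set()))
        if include_specific:
            for other, docs in self._docs_by_term.items():
                if other.endswith("-" + term):
                    hits |= docs
        return hits


index = TermIndex()
index.add("doc1", ["RED-TREE_LEAF"])
index.add("doc2", ["GREEN-TREE_LEAF"])
index.add("doc3", ["RED-LEAF_TREE"])

print(index.search("RED-TREE_LEAF"))                          # {'doc1'}
print(sorted(index.search("TREE_LEAF", include_specific=True)))  # ['doc1', 'doc2']
```

Note that doc3 contains the same three roots as doc1 but is not returned for either query, since its term has a different structure.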

The system by which terms are generated from natural language

phrases can also be used to improve the usability of user search interfaces.

Rather than requiring that users search directly for structured terms in the

database, which requires an understanding of the way that root- and rule-

based terms are formed, the interface can instead allow users to input

phrases of natural language. For example, if the user inputs “the red leaves

of a tree”, the interface can quickly generate suggested terms, such as
RED-TREE_LEAF, meaning that users only need a passive knowledge of term

structure in order to make advanced use of the interface, while still allowing

for unambiguous searches.

6.3 Curation Interface

A curation interface allows a team of curators to make changes to

the terminology. Curators may either be a select team of administrators, or

volunteer end-users. That is, it is possible in some cases to crowd-source

curation. Depending on the resources available for a particular system, it

may be desirable to have dedicated moderators or to allow end-users to

make their own changes to the database. We make no suggestions as to

which is more generally preferable, but instead describe a terminology

system that is able to handle both.

Curation may be partially dependent on search, as discussed
in Section 6.2, as curators need to be able to locate terms to change.

However, curators may benefit from more than a document retrieval system,

as they should be able to examine the complete structure of the terminology.

For example, curators may wish to view taxonomic relationships between

terms in order to ensure that the taxonomy is structured correctly. Curators

should be able to make changes to the structure, as well as to individual

terms in the terminology.

A root- and rule-based terminology interacts well with this sort of

curation interface. The semantic nature of terms can be used to determine
the taxonomic structure of the terminology implicitly: as discussed in
Section 2, modifiers (roots other than the head) add specificity, so
unmodified terms are hypernyms of their modified
variants. Additional relationships between terms are represented through

compounded terms (semantic clusters represented by terms that have been

combined using the delimiter ‘:’), allowing for graph-based visualizations,

such as that shown in Figure 9. A visualization such as this might be derived

from the following two terms:
ELECTRON-CYCLOTRON-CURRENT_DRIVE:NONINDUCTIVE-CURRENT_DRIVE and
ION-CYCLOTRON-CURRENT_DRIVE. By parsing these terms into roots, an algorithm or a user
can easily determine that both terms represent types of CURRENT_DRIVE,
and that ELECTRON-CYCLOTRON-CURRENT_DRIVE and
ION-CYCLOTRON-CURRENT_DRIVE are types of CYCLOTRON-CURRENT_DRIVE. Furthermore,
the user interface can show that there is some relationship between
NONINDUCTIVE-CURRENT_DRIVE and ELECTRON-CYCLOTRON-CURRENT_DRIVE.
This information is all determined from the structure of
these terms, and the composition of the roots.

CURRENT_DRIVE

CYCLOTRON-CURRENT_DRIVE    NONINDUCTIVE-CURRENT_DRIVE

ELECTRON-CYCLOTRON-CURRENT_DRIVE    ION-CYCLOTRON-CURRENT_DRIVE

Figure 9. Visualizing term relationships
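The relationships shown in Figure 9 can be derived from the two terms by a short procedure, sketched below. The function name `relationship_edges` is hypothetical, and the rule that dropping the leftmost modifier yields the next more general term is an assumption consistent with the figure rather than a rule stated elsewhere in the text.

```python
def relationship_edges(
    terms: list[str],
) -> tuple[set[tuple[str, str]], set[tuple[str, str]]]:
    """Return (is_a edges, related edges) implied by term structure."""
    is_a: set[tuple[str, str]] = set()
    related: set[tuple[str, str]] = set()
    for term in terms:
        parts = term.split(":")
        # Components of a compounded term share an unlabeled relationship.
        for i, left in enumerate(parts):
            for right in parts[i + 1:]:
                related.add((left, right))
        # Dropping the leftmost modifier yields the next more general term.
        for part in parts:
            child = part
            while "-" in child:
                parent = child.split("-", 1)[1]
                is_a.add((child, parent))
                child = parent
    return is_a, related


is_a, related = relationship_edges([
    "ELECTRON-CYCLOTRON-CURRENT_DRIVE:NONINDUCTIVE-CURRENT_DRIVE",
    "ION-CYCLOTRON-CURRENT_DRIVE",
])
for child, parent in sorted(is_a):
    print(f"{child} is a type of {parent}")
for left, right in sorted(related):
    print(f"{left} is related to {right}")
```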

The implicit and explicit relationships between terms in this model

allow for changes to terms without requiring manual restructuring of the

hierarchical structure. For example, if the system erroneously created the

term RED-LEAF_TREE instead of RED-TREE_LEAF, a curator could make this

change without needing to manually relocate the term to its proper

taxonomic position beneath LEAF. Instead, the curator only needs to change
the term RED-LEAF_TREE to RED-TREE_LEAF and the fact that
RED-TREE_LEAF is a type of LEAF can be automatically inferred.

6.4 Rule Change Interface

Rule change interfaces are desirable in many evolving situations.

Due to the constant changes in knowledge and vocabulary that occur in all

domains, it is sometimes necessary to update the way that the rules are used

to generate terms. This interface may be particularly useful during the

infancy of this technology. In our proposal, it may be desirable to create

new delimiters with new meanings, change the behavior of delimiters, or

change the way that terms are selected. In addition, it may become feasible

to make changes due to improvements in technologies. In our proposal, this

process does not require a complete restructuring of the system. Instead,

only an interface which allows for the replacement of individual

components of the system, as described in Section 5, is necessary.

6.5 Generating Ontologies

As discussed in Section 1, one of the motivations for a root- and

rule-based method of terminology generation is practical ontology

extraction for the semantic web as well as identifying, tracking, and

predicting changing trends. The root- and rule-based terminologies we have

discussed are well-adapted for use in ontologies. Though the methods we

describe do not immediately output a complete ontology, the structured

nature of terms in root- and rule-based terminologies means that a

terminology can easily be extended into the skeleton of an ontology. As

shown in Figure 9, some of the relationships between terms can be inferred

from term structure. These relationships are primarily taxonomic, but

compounded terms reveal additional connections that may inform the

structure of a complete ontology.

Some of the relationships that structured terms encode are

underspecified. For example, the term ELECTRON-CYCLOTRON-CURRENT_DRIVE:NONINDUCTIVE-CURRENT_DRIVE entails that there is a relationship between ELECTRON-CYCLOTRON-CURRENT_DRIVE and NONINDUCTIVE-CURRENT_DRIVE. However, it does not specify what that

relationship is. An ontology may need to supply this additional information,

but the presence of the relationship is already known. Because of this, a

root- and rule-based terminology provides many of the relationships

necessary for a domain ontology. A system could generate a simple

ontology by adding labels to these relationships and adding any relevant

edges that may have been left out in terminology generation.
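The step from compounded terms to unlabeled ontology edges can be sketched in a few lines of Python. This is a minimal illustration, not part of the system described here: the function name `unlabeled_edges`, the choice to link only adjacent terms of a compounded term, and the `None` placeholder standing in for a missing relationship label are all assumptions made for the example.

```python
def unlabeled_edges(compounded_terms):
    """Collect the term-to-term relationships implied by compounded terms.

    Each edge is (left_term, right_term, label). The label is None because
    the compounded term only tells us that a relationship exists, not what
    it is (assumption: a curator or a later labeling step fills it in).
    """
    edges = set()
    for compounded in compounded_terms:
        terms = compounded.split(":")  # colon delimits terms (Appendix A)
        # Assumption: only adjacent terms are linked to each other.
        for left, right in zip(terms, terms[1:]):
            edges.add((left, right, None))
    return edges


edges = unlabeled_edges(
    ["ELECTRON-CYCLOTRON-CURRENT_DRIVE:NONINDUCTIVE-CURRENT_DRIVE"]
)
for left, right, label in sorted(edges, key=lambda e: e[:2]):
    print(left, "--[", label, "]-->", right)
```

A labeling step, whether manual or crowd-sourced, would then replace each `None` with a relationship name.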

Ontologies are usually dependent on the terminologies that inform

them. By basing ontologies on root- and rule-based terminologies, it is

possible to create ontologies that are human- and machine-readable.

Because language, including the vocabulary used in most domains, is

constantly evolving, ontologies also need to evolve. Because root- and rule-

based terminologies are adaptable to new needs and to evolving vocabulary

as discussed in Section 5, they are superior to ad hoc terminologies for

constructing practical domain ontologies.

It is possible to partially automate the creation of domain ontologies

from terminologies. Statistical analyses such as co-word analysis (Kostoff

1993; Coulter et al. 1996; Coulter, Monarch, and Konda 1998) can suggest

relationships among the terms in a terminology, though they do not label these relationships. These methods posit relationships between terms that occur together within a fixed window. For example, if the terms ELECTRON-CYCLOTRON-CURRENT_DRIVE and NONINDUCTIVE-CURRENT_DRIVE occur

together more often than is expected based on the frequencies of the

individual terms, then it is likely that there is a semantic connection between

them. The above analyses do not automatically label these relationships, but

crowd-sourced methods can help to assign labels to common relationships

and create training sets for future automatic labeling techniques. Though

there is currently no fully-implemented application that extends root- and

rule-based terminologies into domain ontologies, a system that uses a root-

and rule-based approach as well as co-word analysis and crowd-sourced

labeling is currently in development. Moreover, the terminologies generated

by this approach can be used with current applications, both commercial

and freely available, to produce terminological networks much like the ones

produced by co-word analysis using tools such as Leximancer and Gephi

(Smith and Humphreys 2006; Bastian, Jacomy, and Heymann 2009).
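The core idea of the co-occurrence test described above, comparing how often two terms appear together with how often they would be expected to if they were independent, can be sketched as follows. This is a simplified illustration and not the specific indices used in the cited co-word studies: the function name `coword_scores`, the sliding-window counting scheme, and the plain observed/expected ratio are assumptions chosen for brevity.

```python
from collections import Counter
from itertools import combinations


def coword_scores(documents, window=5):
    """Score term pairs by observed / expected co-occurrence.

    documents: list of term sequences (each a list of strings).
    A score above 1 means the pair co-occurs within the window more often
    than the individual term frequencies alone would predict.
    """
    term_counts = Counter()
    pair_counts = Counter()
    n_windows = 0
    for tokens in documents:
        # Slide a fixed-size window over the sequence of terms.
        for start in range(max(1, len(tokens) - window + 1)):
            seen = set(tokens[start:start + window])
            n_windows += 1
            term_counts.update(seen)
            pair_counts.update(
                frozenset(pair) for pair in combinations(sorted(seen), 2)
            )
    scores = {}
    for pair, observed in pair_counts.items():
        first, second = pair
        # Expected count if the two terms occurred independently.
        expected = term_counts[first] * term_counts[second] / n_windows
        scores[pair] = observed / expected
    return scores


ECCD = "ELECTRON-CYCLOTRON-CURRENT_DRIVE"
NICD = "NONINDUCTIVE-CURRENT_DRIVE"
documents = [[ECCD, NICD], [ECCD, NICD], [ECCD, "TOKAMAK"], ["TOKAMAK", "PLASMA"]]
scores = coword_scores(documents, window=2)
print(scores[frozenset({ECCD, NICD})] > 1)       # True: co-occur more than expected
print(scores[frozenset({ECCD, "TOKAMAK"})] < 1)  # True: co-occur less than expected
```

Pairs with high scores are candidates for an edge in the ontology skeleton; the label of that edge still has to come from elsewhere.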

7. Conclusion

Our root- and rule-based approach presents several advantages for the

development of domain-based terminologies that are not available in semi-

structured models, while still maintaining both human- and machine-

readability. The primary advantage of root- and rule-based terms is that they

allow one to consistently and clearly represent important domain concepts.

Root- and rule-based terms are compositional, allowing for the division of

terms into their component parts for searching, selecting new terms, or

deriving relationships between terms.

Our proposed root- and rule-based model is also highly modular,

meaning that different users can easily adapt and maintain terminological

systems. Components may be updated with technological advances, the

introduction of new uses, or adaptations based on how systems are being

used in real-world situations. The model is also designed with many

important use-cases in mind, including search, document uploading, and

curation. This allows the system to be practical for the needs of different

users, for the administration, and for developers and scientists hoping to

expand upon a root- and rule-based system.

The model presented here is linguistically motivated, and follows

from many aspects of linguistic theory, including syntax, semantics, and

pragmatics, allowing it to connect on a fundamental level with the way that

humans actually use language, rather than with mathematical constructs

transparent only to machines. Like language itself, this model is

evolutionary, use-based, and compositional, designed with practical needs

rather than purely theoretical constructs in mind.

References

Berners-Lee, Tim, James Hendler, and Ora Lassila. 2001. “The Semantic

Web.” Scientific American.

Bhat, Talapady N. 2010. “Building Chemical Ontology for Semantic Web

Using Substructures Created by Chem-BLAST.” International Journal

on Semantic Web and Information Systems 6 (3): 22-37.

Bhat, Talapady N., Laura M. Bartolo, Ursula R. Kattner, Carelyn E.

Campbell, and John T. Elliott. 2015. “Strategy for Extensible,

Evolving Terminology for the Materials Genome Initiative Efforts.”

Journal of Materials, no. 8: 1866-75.

Clark, Herbert H., and Deanna Wilkes-Gibbs. 1986. “Referring as a

Collaborative Process.” Cognition, no. 22: 1-39.

Coulter, Neal, Ira Monarch, and Suresh Konda. 1998. “Software

Engineering as Seen Through Its Research Literature: A Study in Co-

Word Analysis.” Journal of the Association for Information Science

and Technology 49 (13). New York, NY: John Wiley & Sons, Inc.:

1206-23.

Coulter, Neal, Ira Monarch, Suresh Konda, and Marvin Carr. 1996. “An

Evolutionary Perspective of Software Engineering Research Through

Co-Word Analysis.” CMU/SEI-95-TR-019. Pittsburgh, PA: Software

Engineering Institute, Carnegie Mellon University.

Feigenbaum, Lee, Ivan Herman, Tonya Hongsermeier, Eric Neumann, and

Susie Stephens. 2007. “The Semantic Web in Action.” Scientific

American.

Fellbaum, Christiane, ed. 1998. WordNet: An Electronic Lexical

Database. Cambridge, MA: MIT Press.

Frantzi, Katerina T., Sophia Ananiadou, and Jun-ichi Tsuji. 1998. “The

C-Value/NC-Value Method of Automatic Recognition for Multi-Word

Terms.” In Proceedings of the Second European Conference on

Research and Advanced Technology for Digital Libraries, 585—604.

London: Springer-Verlag.

Frege, Gottlob. 1884. The Foundations of Arithmetic. 1980 ed.

Evanston, IL: Northwestern University Press.

Hall, Johan. 2006. “MaltParser: An Architecture for Labeled Inductive

Dependency Parsing.” Master’s thesis, Växjö University.

Hearst, Marti A. 1992. “Automatic Acquisition of Hyponyms from Large

Text Corpora.” In Proceedings of the 14th Conference on

Computational Linguistics, 2:539-45.

Hudson, Richard A. 2004. “Are Determiners Heads?” Functions of

Language 11 (1): 7-42.

Knechtel, Martin, and Rafael Peñaloza. 2010. “A Generic Approach for

Correcting Access Restrictions to a Consequence.” In 7th Extended

Semantic Web Conference. The Semantic Web: Research and

Applications.

Koivunen, Marja-Riitta, and Eric Miller. 2001. “W3C Semantic Web

Activity.” In Semantic Web Kick-Off in Finland: Vision, Technologies,

Research, and Applications, 27-44. Helsinki Institute for Information

Technology.

Kostoff, Ronald N. 1993. “Co-Word Analysis.” In Evaluating R&D

Impacts: Methods and Practice, 63—78. Springer.

Liu, Xueqing, Yangqiu Song, Shixia Liu, and Haixun Wang. 2012.

“Automatic Taxonomy Construction from Keywords.” In ACM

Conference on Knowledge Discovery and Data Mining (KDD 2012).

Beijing, China.

Manning, Christopher D., Mihai Surdeanu, John Bauer, Jenny Finkel,

Steven J. Bethard, and David McClosky. 2014. “The Stanford

CoreNLP Natural Language Processing Toolkit.” In Proceedings of

52nd Annual Meeting of the Association for Computational

Linguistics: System Demonstrations, 55—60.

Bastian, Mathieu, Mathieu Jacomy, and Sebastien Heymann. 2009. “Gephi: An

Open Source Software for Exploring and Manipulating Networks.” In

International AAAI Conference on Weblogs and Social Media, 361-62.

Association for the Advancement of Artificial Intelligence.

“Modeling Across Scales.” 2015. Warrendale, PA: The Minerals, Metals

& Materials Society.

Montague, Richard. 1988. “The Proper Treatment of Quantification in

Ordinary English.” In Philosophy, Language, and Artificial

Intelligence, 2:141—62. Studies in Cognitive Systems. Amsterdam:

Springer Netherlands.

Nivre, Joakim. 2003. “An Efficient Algorithm for Projective Dependency

Parsing.” In Proceedings of the 8th International Workshop on

Parsing Technologies (IWPT 03), 149-60. Nancy, France.

Overton Jr, W.C., and J. Gaffney. 1955. “Temperature Variation of the

Elastic Constants of Cubic Elements.” Phys. Rev. 98: 969-77.

Park, Youngja, Roy J. Byrd, and Branimir K Boguraev. 2002. “Automatic

Glossary Extraction: Beyond Terminology Identification.” In

Proceedings of the 19th International Conference on Computational

Linguistics (COLING ’02), 1:1-7.

Plant, Anne L., John T. Elliott, and Talapady N. Bhat. 2011. “New

Concepts for Building Vocabulary for Cell Image Ontologies.” BMC

Bioinformatics 12 (487).

Polguère, Alain, and Igor Mel’čuk, eds. 2009. Dependency in Linguistic

Description. Vol. 3. Studies in Language Companion Series. John

Benjamins.

Proux, Denys, Francois Rechenmann, Laurent Julliard, Violaine Pillet, and

Bernard Jacq. 1998. “Detecting Gene Symbols and Names in

Biological Texts: A First Step Toward Pertinent Information

Extraction.” Genome Informatics 9: 72-80.

Rindflesch, Thomas C., Lawrence Hunter, and Alan R. Aronson. 1999.

“Mining Molecular Binding Terminology from Biomedical Text.” In

Proceedings of the AMIA 1999 Symposium, 127-31.

Shannon, Claude E. 1948. “A Mathematical Theory of Communication.”

Bell System Technical Journal 27 (3): 379-423.

Sharma, Deepik. 2010. “Stemming Algorithms: A Comparative Study and

Their Analysis.” International Journal of Applied Information Systems

4 (3). New York: Foundation of Computer Science: 7-12.

Smith, Andrew E., and Michael S. Humphreys. 2006. “Evaluation of

Unsupervised Semantic Mapping of Natural Language with

Leximancer Concept Mapping.” Behavior Research Methods 38 (2):

262-79.

Swartz, Aaron. 2013. Aaron Swartz’s A Programmable Web: An

Unfinished Work. Morgan & Claypool.

Tesnière, Lucien. 1959. Éléments de Syntaxe Structurale. Paris:

Klincksieck.

Witschel, Hans Friedrich. 2005. “Terminology Extraction and Automatic

Indexing - Comparison and Qualitative Evaluation of Methods.” In

Proceedings of Terminology and Knowledge (TKE), 1-12.

Wu, Ho Chung, Robert Wing Pong Luk, Kam Fai Wong, and Kui Lam

Kwok. 2008. “Interpreting TF-IDF Weights as Making Relevance

Decisions.” ACM Transactions on Information Systems (TOIS) 26 (3):

1-37.

Yoshida, M., Y. Sakamoto, H. Takenaga, S. Ide, N. Oyama, T. Kobayashi,

and Y. Kamada. n.d. “Rotation Drive and Momentum Transport with

Electron Cyclotron Heating in Tokamak Plasmas.” Phys. Rev. Lett.

103 (6). American Physical Society.

Appendix A: Rules

The enumerated rules listed below were originally published in Bhat et al.

(2015).

1. Forming roots:

a) Use all roots in singular form except where the plural form is used more frequently.

b) Avoid using special characters (such as ' : _ - = / \) as a part of a root.

c) Avoid the use of modifiers as roots.

d) Use abbreviations only when they are widely accepted across many related disciplines and when they are unambiguous in their meaning. See Rule 5 for exceptions when acronyms are embedded in a super root. Use uppercase for all acronyms except for atomic symbols.

e) For similar expressions choose a shorter equivalent as a root.

2. Forming super roots: A super root is formed when the roots involved do not have a preferred discriminating power and semantics to serve as node names of a data-graph or as RDF elements except in special circumstances.

a) Super roots are concatenated by an underscore to indicate their compound semantics and their ability to be parsed into individual roots only under unusual conditions. If a super root is comprised of roots that are not specific when considered individually, then refer to tethered roots (see Rule 3).

b) When a root of a super root functions like a hierarchical classifier to another root, then also include the classified root in the super root so that automated parsers can recognize the hierarchy. To order roots within a super root, unless there already exists a well-accepted alternate convention, use Rule 6.

3. Forming tethered roots:

a) Create tethered roots when a root is a qualifier of another root and

the semantics of any root on its own may not be of interest in a

database or data repository search. Tethered roots are formed to

indicate that the roots involved need to be considered collectively,

rather than individually, in order to derive their semantics. For this

reason, roots in a tethered root are written contiguously to avoid

inadvertent separation by automated methods. Since tethered

roots are comprised of qualifier and qualified roots, following a

general convention of root-based construction of English

language words, we use their intrinsic qualifier-qualified

relationship to order their roots.

b) A root may appear in more than one tethered root.

Tethered roots may also provide a way to avoid the use of stop words

in a compounded root. That is, move the word from the right of a stop

word to the left, drop the stop word, and place the qualifier before the

qualified.

4. Forming terms from roots: Terms are formed by concatenating two or more roots, super roots, or tethered roots using a hyphen (-) so that automated methods may regenerate their roots when necessary. We suggest ordering the roots of a term by classifier-classified relationships (see Rule 6), which is also a general convention in English, as in police dog or technical paper, unless there is a different well-accepted convention.

5. Avoiding ambiguities and redundancies:

a) Avoid using ambiguous acronyms. Instead clarify their meaning

by qualifying them with a classifier ‘root’ to form a super root or

a tethered root or use the complete phrase.

b) Avoid the inclusion of redundant words in a term.

6. Ordering roots in a term (classifier-classified rule): Roots (super root,

tethered root and root) within a term are organized by a left to right,

semantic top-down, classifier-classified hierarchy. In general,

classifier and classified roots are expected to have one-to-many

relationships where, in a rules-based approach, for example, the root

alloy is a classifier for many materials. Rule 16 deals with instances

where a relationship is not obvious or when a relationship changes

over time due to the addition of new terms. In short, a hierarchy is not

absolute but rather, it varies with the number of relevant use-cases.

a) One way to identify classifier and classified roots in a term is to arrange the terms with an embedded hierarchical top-down, level-based classifier statement (for each ‘classifier’ term there exist several possibilities of ‘classified’ terms), with a hyphen between classifier and classified terms (e.g., MODELINGSOFTWARE-VASP, MODELINGSOFTWARE-ABINIT). On sorting these terms, classified roots appear as the fast-varying strings (VASP, ABINIT) and their classifier roots appear as the slow-varying term (MODELINGSOFTWARE). Automated methods may use this feature to develop hierarchical data models that can be presented as data-graphs or RDF or used for auto-complete to select terms for reliable search results.

b) When a classifier-classified relationship does not exist among the

roots, place them in an alphabetical order.

7. Creating roots and terms with similar, multiple, or complex meanings:

Following Rule 1, use a shorter root for words with similar meaning

whenever possible. A root embedded in a term can help automated

methods, such as co-word analysis, natural language processing, and

text-mining, to identify related semantic classes. To facilitate this

process, it is recommended: a) to limit the use of synonymous roots;

b) if necessary, clarify the semantics of a root by appending it with a

classifier-root.

8. Reusing terms to create compounded terms: Create terms by

combining roots so that terms have clear semantics. Avoid terms that

are broad and general in meaning. Create terms that can serve as

‘semantic expressions’ in use-cases. A rule of thumb is to attempt to

form terms with three roots and, if needed, combine between two and

five terms to form suitable semantic expressions.

9. Creating compounded terms that identify a group of objects in the

terminology: Compound terms serve as ‘use-cases’ defining semantic

expressions of terms and they are formed by concatenating two or

more terms using a colon (:) as a special delimiting character.

Compounded terms that are overly specific are unlikely to be reused.

It is advised to limit the number of terms in a compounded term to

between two and five terms. Compounded terms may point to

persistent identifiers (PIDs), such as DOIs (Digital Object Identifier)

for query purposes. Compounded terms may be used by database

providers or repository administrators to cluster, identify, and display

related items using messages like ‘related to items that you have

viewed’.

a) Use the classifier-classified hierarchy of Rule 6 to decide the order of

terms in a compounded term.

b) When creating compounded terms, give importance to ‘use case-on-demand’ hierarchies, which are case-based rather than fixed schema-based hierarchies. Order a term so that a term to the left has a one-to-many relationship with the term to its right.

10. Provide the reference of any paper that supports the use of the new term(s) you are creating. The reference may serve as a ‘definition’ of the term and may also demonstrate use of the term within a context.

11. Design for readability of compounded terms: Use uppercase for the

first letter of a term and use lowercase for all the rest unless a root is

a short form or a symbol.

12. Provide usage statistics for terms: For each term in a database or

repository, store its usage statistics for users to inspect, along with the

terms. These frequencies may allow a user to avoid terms that are used

infrequently.

13. Provide semantic context of terms and compounded terms: In the

database, also keep and display a bibliographic reference and/or DOI

to illustrate the use and semantics of the term. This reference may also

be used as the basis to build use-case-specific compounded terms or

segments of data-graphs.

14. Identify new terms introduced by users, and flag terms if no documentation is provided (see Rule 10).

15. Allow the creation of dialects: Terms that do not follow the rules may

also be created as local dialects when necessary. Dialects may

facilitate a gradual evolution of rule-based terminology and the rules

in a crowd-sourced environment.

16. Curate and validate terminology and compounded terms on a regular

basis: Dialects are important components of the proposed method for

terminology building. Therefore, accepting or removing dialects as

terminology must be facilitated by public resource providers who act

as caretakers. Redefining super roots, tethered roots and classifier-

classified relationships among roots are all important steps of the

evolution process of the proposed term building effort. Database

developers and repository administrators need to have an established

mechanism for regular updates to support a smooth evolution process.

Frequency of usage and the semantic context of terms are useful

factors to monitor in such an evolution process.

17. Apply new technologies that have been adopted widely: Explore

whether new data technologies may require the rules to be updated.
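The delimiter conventions in the rules above (a colon between terms, a hyphen between roots, an underscore inside super roots) and the sorting observation in the classifier-classified rule can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, tethered roots are treated as indivisible because they are written contiguously, and `group_by_classifier` assumes that the leftmost root of each term is its classifier.

```python
from collections import defaultdict


def parse_compounded_term(compounded):
    """Split a compounded term into terms (on ':'), then into roots,
    super roots, or tethered roots (on '-')."""
    return [term.split("-") for term in compounded.split(":")]


def split_super_root(unit):
    """Split a super root into its roots (on '_'). A plain or tethered
    root has no underscore and comes back as a one-element list."""
    return unit.split("_")


def group_by_classifier(terms):
    """Group terms by their leftmost root (assumed to be the classifier).

    After sorting, the classifier is the slow-varying prefix and the
    classified part is the fast-varying remainder.
    """
    hierarchy = defaultdict(list)
    for term in sorted(terms):
        classifier, _, classified = term.partition("-")
        hierarchy[classifier].append(classified)
    return dict(hierarchy)


print(parse_compounded_term(
    "ELECTRON-CYCLOTRON-CURRENT_DRIVE:NONINDUCTIVE-CURRENT_DRIVE"
))
# [['ELECTRON', 'CYCLOTRON', 'CURRENT_DRIVE'], ['NONINDUCTIVE', 'CURRENT_DRIVE']]
print(split_super_root("CURRENT_DRIVE"))
# ['CURRENT', 'DRIVE']
print(group_by_classifier(["MODELINGSOFTWARE-VASP", "MODELINGSOFTWARE-ABINIT"]))
# {'MODELINGSOFTWARE': ['ABINIT', 'VASP']}
```

The resulting dictionary is one possible starting point for a data-graph or an auto-complete index; a real implementation would also need to handle dialect terms that do not follow the delimiter rules.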

Appendix B: Noun Phrase Syntax and Semantics

One of the key components of any automated terminology

generation system is to find and represent important concepts. Typically,

the source of information that leads to these representations is a series of

natural language texts, such as a corpus of scientific articles. For our root-

and rule-based approach, representing these natural language descriptions

is dependent on an effective model of semantics, which we discuss in this

appendix. A complete model of natural language semantics (i.e. a perfect

representation of meaning) is most likely beyond the scope of contemporary

linguistics and computer science, so we will focus on a limited subset of

semantics — namely, the compositional semantics of noun phrases. For the

purposes of this paper, we consider a noun phrase to consist of a noun in

addition to all of its modifiers and any determiners (such as an, the, this,

and numbers). For example, the green vase is a noun phrase consisting of

an article (the), a modifier (the adjective green), and a noun (vase). Some

of the theory discussed in this section will apply to verb phrases and other

syntactic categories, but noun phrases are ideal in that they are relatively

concise and in that they are often clear representations of key concepts and

commonly appear as headwords or phrases in technical glossaries.

A very simple model of noun phrase semantics would be to simply

represent words as themselves or their lemma-forms. For example, the

words leaf and leaves could both be represented as LEAF. For more complex

noun phrases, such as green leaf, this method becomes more problematic.

We could create a representation such as GREENLEAF, a term used

specifically for green leaves, but this model does not unambiguously reveal

that the meaning of GREENLEAF is related to the meanings of GREEN and

LEAF. For this reason, we also model the syntax of noun phrases. Syntax can

be represented in several ways, but two of the most common methods are

phrase structure grammar and dependency grammar. These models show

the relationships between words in a phrase, revealing for example that the

word green in the phrase green leaf is an adjective modifying the noun leaf.

Syntax and compositional semantics are related concepts; the interpretation

of a phrase is derived partially from its syntactic structure (i.e. the

organization of the words in a phrase or sentence) (Montague 1988), and so

using syntax as a proxy for semantics is reasonable.

Though syntax may relate to semantics, it is not a perfect stand-in.

There is not a one-to-one relationship between syntax and semantics —

multiple syntactic forms can have the same semantic meaning, and the same

syntactic form can have multiple meanings. Because of this, deriving

meaning purely from the way that words are put together can lead to

ambiguities. Consider the noun phrases in examples 1 through 4; though

these phrases have different syntactic structures and are composed of

different words, they have very similar meanings.

1) the tree’s red leaves

2) the tree’s leaves that are red

3) the leaves of the tree that are red

4) the red leaves of the tree

If we represent all of these phrases as the unordered set of “important” words in each phrase, all of them could be represented as {tree, red, leaves}. However, unwanted phrases will also be represented this way; for example, the red tree’s leaves will also be represented as {tree, red, leaves}. Clearly some syntactic information is necessary in order to create a sufficiently discriminating system. Conversely, a simple syntactic model is too discriminating; the words in these four phrases are

ordered differently, are contained within different syntactic constituents,

and are different in grammatical form (tree occurs both in its base form and

in its possessive form tree's). Thus, we need to develop a model of syntax

and semantics that normalizes these differences while still separating these

phrases from unrelated ones. We will begin by discussing how common models of

syntax, such as dependency grammar, can be used as a basis for generating

structured terms.
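The collision described above is easy to reproduce. The following Python sketch reduces each phrase to its unordered set of “important” words; the stop-word list, the handling of the possessive ’s, and the function name `important_words` are assumptions made for this illustration only.

```python
import re

# Assumed, illustrative stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "of", "that", "are"}


def important_words(phrase):
    """Return the unordered set of non-stop words, with possessive 's removed."""
    tokens = re.findall(r"[a-z]+(?:['’]s)?", phrase.lower())
    return {re.sub(r"['’]s$", "", token) for token in tokens} - STOP_WORDS


phrases = [
    "the tree's red leaves",
    "the tree's leaves that are red",
    "the leaves of the tree that are red",
    "the red leaves of the tree",
    "the red tree's leaves",  # unwanted: here it is the tree that is red
]
for phrase in phrases:
    print(sorted(important_words(phrase)), "<-", phrase)
```

All five phrases map to the same set, including the last one, whose meaning differs. This is the sense in which a purely lexical representation is not discriminating enough.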

B.1 Dependency Grammar

Modern Dependency Grammar (DG) was first described in Tesnière

(1959), and has since become one of the primary syntactic models used in

computational linguistics. The key concept in DG is, of course,

dependency, which is a one-to-one correspondence between morphemes in

which every morpheme is headed by some other morpheme. For the

purposes of this paper, we define a morpheme as the smallest unit of

language that has meaning, including simple unbound words such as leaf as

well as affixes such as the -s in trees. The head of a phrase carries that

phrase’s syntactic category (e.g. the phrase the green vase behaves as a noun

and is considered a noun phrase because it is headed by vase, which is a

noun on its own). In DG, the head of a sentence is the verb, and all other

morphemes are either direct or indirect dependencies of the main verb. The

dependencies of a morpheme are those headed by it. For example, in the

phrase the green vase, vase is the head of green and the, so the dependencies

of vase are green and the. If green had its own dependencies, these would

be indirect dependencies of vase.

DG can be represented graphically as a tree structure with the verb

as the parent node and dependencies represented as daughter nodes.

However, since we are dealing exclusively with noun phrases, we will use

a noun as the parent and adjectives, determiners, relative clauses, and other

nouns as dependencies.¹

The four noun phrases from examples (1) through (4) are

represented as dependency trees in Figures 10 through 13. Leaves plays the

role of the head because it is the primary (most general) concept represented

by the phrase.

leaves
    the
    of
        tree
            the
    that
        are
            red

Figure 10: Dependency representation of “the leaves of the tree that are red”

These trees represent both syntactic dependency and linear word

order. Each node in the tree represents a word; the dependencies of that

word are all daughter nodes. The root, at the top of the tree, is the head and

is not a dependency of any other words within the context of these phrasal

trees (namely, it does not qualify any other words). The word order can be

recovered through an in-order traversal (begin with the left-most node and

continue to the right).
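A small Python sketch can make the traversal concrete. It is an illustration rather than the authors' implementation: the `Node` class, its `left` and `right` fields (dependents that precede or follow the word), and the `linearize` function are hypothetical names, and the approach assumes that no dependency crosses another (a projective tree).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    word: str
    left: list = field(default_factory=list)   # dependents preceding the word
    right: list = field(default_factory=list)  # dependents following the word


def linearize(node):
    """In-order traversal: left dependents, the word itself, right dependents."""
    words = []
    for child in node.left:
        words.extend(linearize(child))
    words.append(node.word)
    for child in node.right:
        words.extend(linearize(child))
    return words


# The tree of Figure 10, headed by "leaves".
phrase = Node(
    "leaves",
    left=[Node("the")],
    right=[
        Node("of", right=[Node("tree", left=[Node("the")])]),
        Node("that", right=[Node("are", right=[Node("red")])]),
    ],
)
print(" ".join(linearize(phrase)))
# the leaves of the tree that are red
```

Storing left and right dependents separately is one way to keep linear order alongside dependency; a parser output format that records token positions would serve the same purpose.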

¹ According to Hudson (2004), among others, phrases such as “the tree’s leaves” and “the green vase” are

actually determiner phrases and not noun phrases, and are headed by determiners such as “the” or “a”. Though there

is significant evidence for this, and most modern generative syntacticians prefer this analysis, we continue to use the

noun phrase analysis throughout this paper due to its more intuitive simplicity and for its continued prevalence in

computational linguistics. Furthermore, because we will end up ignoring determiners (see Appendix B.3), the

distinction between determiner phrase and noun phrase analyses does not have an effect on the model we describe

here.

leaves
    the
    red
    of
        tree
            the

Figure 11: Dependency representation of “the red leaves of the tree”

leaf
    tree
    red

Figure 12: Normalized dependency representation of “the tree’s red leaf”

As noted above, the syntactic and lexical information depicted by

these dependency trees is not sufficient to show that these four noun phrases

are similar enough to represent the same concept. Structurally speaking,

these trees are very different; only the root node (leaves) has the same

position in every tree, and the remaining nodes are positioned almost

everywhere in the tree. This suggests that we cannot use DG on its own to

show how closely these phrases are related. However, in the following

two sections we will show how we can adapt DG to construct a model of

semantics that can be used to normalize noun phrases.

leaf
    tree
    red

Figure 13: Normalized dependency representation of “the leaves of the tree that are red”

B.2 The Semantics of Dependency

Since we are trying to build a model of semantics, it is necessary for

us to delve deeper into syntactic dependency in order to uncover the

underlying semantics of noun phrases.

Polguère and Mel’čuk (2009) describe syntactic and semantic

dependency as separate but related concepts. They describe syntactic

dependency as the bridge between the semantic form of a sentence (its

meaning) and its morphological form (the linear string representation of

morphemes that is spoken or written). That is, though syntactic

dependencies have additional structure and complexity not apparent in

semantic dependencies, the two structures are related. We may, in fact, be

able to use the syntactic structure of a phrase to estimate its semantic

structure, with additional understanding of the relationship between these

two theoretical entities.

Semantics can be described (to a certain extent) using predicate logic

(Montague 1988). In terms of dependencies, a predicate’s dependents are

its arguments. These arguments may themselves be predicates with

additional dependents, allowing for recursive structures and complex

sentences (Polguère and Mel’čuk 2009). This is an imperfect model for our

purposes because it does not precisely correspond to syntactic dependency

(which is, computationally speaking, easier to derive) and because it divides

all morphemes into relationships (predicates) and entities (arguments). In

semantics, it is unclear whether words such as a should be treated as

predicates, arguments, or something else — in some cases, for example, they

are treated as quantifiers (cf. Montague 1988). However, Polguère and

Mel’čuk (2009) state that all words in a particular phrase must be connected

in a semantic dependency structure, which allows for a more consistent

representation.

Despite the disparity between syntactic and semantic

dependency described above, there are some commonalities that are

important to the development of the model described in this paper. One of

the functions of a predicate is to add specificity to its argument(s). In this

sense, we can finally observe a clear similarity between syntactic

dependencies and semantics: the dependencies of a morpheme almost

always add specificity to the meaning of that morpheme (green leaf is more

specific than just leaf).

Consider Figure 11. The phrase represented by this structure has the

meaning “leaves” on a very general level. On a more specific level, it is

clear that the leaves in question are possessed objects and that they are red.

The possession relationship is made more specific by the dependencies of

the ’s morpheme — the tree is the possessor of the leaf, and the tree is made

more specific through the article the, which shows that the tree in question

is a specific tree in the common ground between the speaker and the listener

(or between the writer and reader). We will refer to this as a semantic

specification relationship, because the parent node is made more specific

through its daughters.

Note that unlike semantic dependency as described in Polguère and

Mel’čuk (2009), the specificity relationship always travels downward in the

tree, and perfectly matches syntactic dependency. Semantic specificity is an

ideal semantic model for our system, despite the fact that we still cannot

directly explain the similarities between the four noun phrases in examples

1 through 4, since the trees in Figures 10 and 11 are still different. In order

to explain these similarities, we need to normalize our representations.

B.3 Normalizing Similar Structures

The primary differences between the four example phrases we have

been discussing so far are found in the presence or absence of function

morphemes — that is, grammatical units (such as the possessive morpheme

’s) that do not correspond to any real-world meaning, but instead serve

primarily to indicate grammatical relationships. Though these morphemes

do affect the semantics of the overall phrase, they are primarily used to

allow for different word-orders and slightly nuanced meanings. However,

we can recover most of the meaning of a noun phrase without any of the

function morphemes.

We create the four “collapsed” trees in Figure 14 through Figure 17

by removing all of the function morphemes, and leaving only content

morphemes (grammatical units that do correspond to real-world meaning,

i.e., some particular concept, action, or trait).

leaf

tree red

Figure 14: Collapsed representation of the tree’s red leaves

leaf

red tree

Figure 15: Collapsed representation of the tree’s leaves that are red

leaf

red tree

Figure 16: Collapsed representation of the leaves of the tree that are red

leaf

red tree

Figure 17: Collapsed representation of the red leaves of the tree

At this point, the similarities between these four structures are quite

clear: each one has leaf (the head of the noun phrase) as the root, and two

dependencies: the words red and tree, albeit in different positions.

However, in our semantic model, linear order does not impact the meaning

of the noun phrase as a whole (only vertical order does). Thus, all four of

these structures produce the same meaning, namely that of a leaf specified

by both tree and red. Note that there is a trade-off to removing function

words: we do not know what type of relationship there is between leaf and

tree, only that a relationship exists, or that it is a generic relationship such

as type-of (e.g. a tree-type-of-leaf). Relationships such as synonymy (words

that are synonyms), meronymy (words that represent a part of another

concept), and possession are not detected. However, we accept this trade-

off for our purposes: we are looking for important concepts, not specific

referents.
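
The collapsing step described in this section can be sketched in a few lines of Python. This is a minimal illustration rather than the system's actual implementation: the Node class, the FUNCTION_MORPHEMES list, and the two hand-built dependency trees are assumptions introduced here for the example, and a real pipeline would take its trees from a dependency parser.

```python
from dataclasses import dataclass, field
from typing import List

# Assumption: a small hand-written list of function morphemes, for illustration only.
FUNCTION_MORPHEMES = {"the", "'s", "of", "that", "are"}


@dataclass
class Node:
    morpheme: str
    children: List["Node"] = field(default_factory=list)


def collapse(node: Node) -> List[Node]:
    """Return the content-only replacement(s) for `node`.

    A content morpheme is kept, with its subtree collapsed recursively.
    A function morpheme is dropped, and its collapsed daughters are
    promoted to the nearest content ancestor.
    """
    collapsed_children: List[Node] = []
    for child in node.children:
        collapsed_children.extend(collapse(child))
    if node.morpheme in FUNCTION_MORPHEMES:
        return collapsed_children
    return [Node(node.morpheme, collapsed_children)]


def daughter_labels(node: Node) -> List[str]:
    """Daughter morphemes, ignoring linear order."""
    return sorted(child.morpheme for child in node.children)


# Hypothetical parse of "the tree's red leaves".
phrase_1 = Node("leaf", [Node("'s", [Node("tree", [Node("the")])]), Node("red")])
# Hypothetical parse of "the red leaves of the tree".
phrase_4 = Node(
    "leaf",
    [Node("the"), Node("red"), Node("of", [Node("tree", [Node("the")])])],
)

collapsed_1 = collapse(phrase_1)[0]
collapsed_4 = collapse(phrase_4)[0]
print(daughter_labels(collapsed_1))  # ['red', 'tree']
print(daughter_labels(collapsed_4))  # ['red', 'tree']
```

Both collapsed trees have leaf as the root with red and tree as daughters. The possessive and prepositional links are gone, which is the trade-off accepted above.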

B.4 Representing the Model

The model of semantics described above can be represented

graphically using tree structures such as those above, in Figure 11 through

Figure 14. To do this in such a way that semantically similar phrases are all

represented the same way, we need a consistent way to order daughter

nodes. The method that we choose is largely irrelevant, so long as the order

of daughter nodes is independent of the phrase’s original word order. A

trivial method is to order the nodes alphabetically; this is what we will do

for now, meaning that sentences 1 through 4 can all be represented using

Figure 18.

leaf

red tree

Figure 18: Universal representation for sentences 1 through 4
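
As a rough sketch of the alphabetical ordering described above, the following Python function puts a collapsed tree into a canonical form. The encoding of a tree as a (label, list of subtrees) pair and the function name canonical are assumptions made for this example, not part of the system described in the paper.

```python
def canonical(tree):
    """Return an order-independent form of a (label, [subtrees]) tree.

    Daughters are canonicalized recursively and then sorted, so two
    trees that differ only in the linear order of their daughters
    yield equal (and hashable) nested tuples.
    """
    label, children = tree
    return (label, tuple(sorted(canonical(child) for child in children)))


# Collapsed trees as in Figures 14 and 15: same daughters, different order.
figure_14 = ("leaf", [("tree", []), ("red", [])])
figure_15 = ("leaf", [("red", []), ("tree", [])])

print(canonical(figure_14) == canonical(figure_15))  # True
print(canonical(figure_14))  # ('leaf', (('red', ()), ('tree', ())))
```

Because the canonical form is a nested tuple, it can be used directly as a dictionary key, for example to count how often the same normalized noun phrase occurs in a corpus.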

However, this representation is not ideal. From a computational

standpoint, it is acceptable: it represents unambiguously all of the

information we need to know about a noun phrase, including the semantic

information that we want for generating terminologies. These structures can

be generated quickly and accurately using dependency parsers (Nivre

2003). However, they are not easily read by humans without linguistic

training. Reading a dependency tree requires an understanding of what a

dependency is, as well as how dependency trees are structured. For domain

taxonomies, data structures that are not human readable are undesirable.

Terminologies need to be read by humans as well as by machines, adding

an additional level of challenge to the problem of generating domain

terminologies.

We propose to solve this problem by building structured

compound nouns in such a way that they represent syntactic dependency

unambiguously while remaining human-readable. Linguistically speaking,

a compound noun is a noun composed of two or more other nouns, such as

dog house or airplane. A structured compound noun is a compound noun

formed through the application of regular, systematic rules. In English, the

way that compound nouns are formed is not completely predictable.

Though dog house and bird house refer to houses for dogs and birds,

respectively, a fire house is not a house for fire in the same sense. Other

languages, however, have more productive compounding, meaning the

same way of creating a compound will have the same meaning in all cases.

In Sanskrit and German, for example, compound nouns often (but not

always) have predictable meanings based off of their components and how

they are combined. In other words, there is a set of rules that determines the

meaning of a compound. We can adapt this idea to our semantic model in

order to create an easy-to-read representation that follows from patterns in

natural language. Because structured compound nouns are structured

representations of phrasal semantics, they can be used to represent terms in

a terminology. This produces a terminology of structured terms with

predictable meanings based on roots and a set of rules used to combine

them.
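
To make the idea of a rule-governed compound concrete, here is one possible linearization, written in Python. The rules used (modifiers sorted alphabetically, placed before the head, joined with hyphens, and bracketed when they carry modifiers of their own) are hypothetical choices made for this sketch; they are not the rules of the system described in this paper.

```python
def compound(tree):
    """Render a (head, [modifier subtrees]) tree as a single compound string.

    Hypothetical rules, for illustration only:
      1. modifiers are sorted alphabetically by their own head,
      2. modifiers precede the head and are joined with hyphens,
      3. a modifier that has modifiers of its own is wrapped in
         parentheses so that its scope stays unambiguous.
    """
    head, modifiers = tree
    rendered = []
    for modifier in sorted(modifiers, key=lambda m: m[0]):
        text = compound(modifier)
        if modifier[1]:  # the modifier is itself a compound
            text = "(" + text + ")"
        rendered.append(text)
    return "-".join(rendered + [head])


print(compound(("leaf", [("tree", []), ("red", [])])))  # red-tree-leaf
print(compound(("leaf", [("tree", [("oak", [])]), ("red", [])])))  # red-(oak-tree)-leaf
```

Under these assumed rules, every unbracketed modifier attaches to the final head, so the string can be read back into the same tree it was generated from.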

Our root- and rule-based approach does not use the same rules or

patterns as Sanskrit, German, or other languages with agglutinative noun

compounds, as the rules in these languages still have many of the issues

associated with natural language more generally, including ambiguities and

inconsistencies. However, the processes of compounding and composition

play a major role in root- and rule-based terminologies.

Bios

T. N. Bhat is a project leader at NIST and one of his recent goals is to

develop tools and techniques to enable archiving, searching and sharing

scientific information.

Jacob Collard is a PhD student in computational linguistics at Cornell

University, where he specializes in interpretable computational models of

natural language syntax and semantics and the application of formal

methods to natural language processing. Since 2014 he has also been

working with the National Institute of Standards and Technology on

information retrieval using formal linguistic methods together with

conventional natural language processing tools.

Eswaran Subrahmanian is a Research Professor at the Engineering

Research Accelerator and Engineering and Public Policy at Carnegie

Mellon University (CMU) and a guest researcher at the Software and

Systems Division at the National Institute of Standards and Technology. He is

a member of the Design Society and the Washington Academy of

Sciences, a Distinguished Scientist of the ACM and Fellow of AAAS.

John T Elliott is the group leader of Cell Systems Science Group at

NIST. He is currently developing quantitative microscopy techniques for

measuring cellular response in a variety of applications.

Ursula R Kattner is a project leader at the Thermodynamics and Kinetics

Group at NIST. Her current research interests include computational

thermodynamics, alloy phase diagram evaluations, metal-hydrogen

systems, solder alloy systems, and superalloy systems.

Carelyn E. Campbell is the group leader of Thermodynamics and

Kinetics Group at NIST. Some of the projects she is currently leading are:

Development of informatic tools and repositories for phase-based

materials property data, including thermodynamics, diffusion, molar

volume, and elastic properties.

Ram D. Sriram is currently the chief of the Software and Systems

Division, Information Technology Laboratory, at the National Institute of

Standards and Technology. Before joining the Software and Systems

Division at NIST, he was on the engineering faculty (1986-1994) at the

Massachusetts Institute of Technology (MIT) and was instrumental in

setting up the Intelligent Engineering Systems Laboratory.

Ira Monarch has investigated information design and process issues in

large-scale engineering programs, both military and industrial, for over

thirty years. At the Software Engineering Institute (SEI), he led and

participated in projects that developed and used text analytic tools

for uncovering patterns and identifying risks and failure conditions in

software design, architecture, development and maintenance.

Washington Academy of Sciences

1200 New York Avenue

Rm G119

Washington, DC 20005

Please fill in the blanks and send your application to the address above. We will

contact you as soon as your application has been reviewed by the Membership

Committee. Thank you for your interest in the Washington Academy of Sciences.

(Dr. Mrs. Mr. Ms)

Business Address

Home Address

Phone

Cell Phone

preferred mailing address Type of membership

Business Home Regular Student

Schools of Higher Education attended Degrees

Present Occupation or Professional Position

Please list memberships in scientific societies — include office held

Instructions to Authors

Deadlines for quarterly submissions are:

Spring: February 1    Fall: August 1

Summer: May 1    Winter: November 1

Draft Manuscripts using a word processing program (such as

MSWord), not PDF. We do not accept PDF manuscripts.

Papers should be 6,000 words or fewer. If there are 7 or more graphics,

reduce the number of words by 500 for each graphic.

Include an abstract of 150-200 words.

Include a two to three sentence bio of the authors.

Graphics must be in greytone, and be easily resizable by the editors to

fit the Journal’s page size. Reference the graphic in the text.

Use endnotes or footnotes. The bibliography may be in a style

considered standard for the discipline or professional field represented

by the paper.

Submit papers as email attachments to the editor or to

wasjournal@washacadsci.org.

Include the author’s name, affiliation, and contact information —

including postal address. Membership in an Academy-affiliated society

may also be noted. It is not required.

Manuscripts are peer reviewed and become the property of the

Washington Academy of Sciences.

There are no page charges.

Manuscripts can be accepted by any of the Board of Discipline Editors.

Washington Academy of Sciences

Affiliated Institutions

National Institute for Standards & Technology (NIST)

Meadowlark Botanical Gardens

The John W. Kluge Center of the Library of Congress

Potomac Overlook Regional Park

Koshland Science Museum

American Registry of Pathology

Living Oceans Foundation

National Rural Electric Cooperative Association (NRECA)

Membership List

AKSYUK, VLADIMIR A. 605 Gatestone Mews, Gaithersburg MD 20878 (F)

ANTMAN, STUART University of Maryland, 2309 Mathematics Building, College Park MD

20742-4015 (EF)

APPETITI, EMANUELA Botany Center, The Huntington, 1151 Oxford Road, San Marino CA

91108 (LM)

APPLE, DAINA DRAVNIEKS PO Box 905, Benicia Cal 94510-0905 (M)

ARSEM, COLLINS 3144 Gracefield Rd Apt 117, Silver Spring MD 20904-5878 (EM)

ARVESON, PAUL T. 6902 Breezewood Terrace, Rockville MD 20852-4324 (F)

BARBOUR, LARRY L. Pequest Valley Farm, 585 Townsbury Rd, Great Meadows NJ 07838 (M)

BARWICK, W. ALLEN 13620 Maidstone Lane, Potomac MD 20854-1008 (F)

BECKER, EDWIN D. 339 Springvale Road, Great Falls Va 22066 (EF)

BEHLING, NORIKO 6517 Deidre Terrace, McLean VA 22101 (M)

BERLEANT, DANIEL 12473 Rivercrest Dr., Little Rock AR 72212 (M)

BERRY, JESSE F. 2601 Oakenshield Drive, Rockville MD 20854 (M)

BIONDO, SAMUEL J. 10144 Nightingale St., Gaithersburg MD 20882 (EF)

BOISVERT, RONALD F. Mail Stop 8910, National Institute of Standards and Technology (NIST),

100 Bureau Drive Gaithersburg MD 20899-8910 (F)

BOSSE, ANGELIQUE P 11700 Stonewood Lane, Rockville MD 20852 (F)

BRADY, KATHIE 4539 Metropolitan Court, Frederick MD 21704 (M)

BRISKMAN, ROBERT D. 61 Valerian Court, North Bethesda MD 20852 (EF)

BROWN, ELISE A.B. 6811 Nesbitt Place, Mclean VA 22101-2133 (LF)

BUFORD, MARILYN P.O. Box 171, Pattison TX 77466 (EF)

BULLARD, JEFFREY WAYNE 11 Marquis Drive, Gaithersburg MD 20878 (F)

BYRD, GENE GILBERT Box 1326, Tuscaloosa AL 35403 (M)

CAVINATO, TIZIANA FCC, 7932 Opossumtown Pike, Frederick MD 21702 (M)

CIORNEIU, BORIS 20069 Great Falls Forest Dr., Great Falls VA 22066 (M)

CLINE, THOMAS LYTTON 13708 Sherwood Forest Drive, Silver Spring MD 20904 (F)

COBLE, MICHAEL NIST, 100 Bureau Drive, MS 8314, Gaithersburg MD 20899-8314 (F)

COFFEY, TIMOTHY P. 976 Spencer Rd., McLean VA 22102 (F)

COLE, JAMES H. 9709 Katie Leigh Ct, Great Falls VA 22066-3800 (F)

COUSIN, CAROLYN E. 1903 Roxburg Court, Adelphi MD 20783 (F)

CROSS, SUE 9729 Cheshire Ridge Circle, Manassas Va 20110 (M)

CUPERO, JERRI ANNE 2860 Graham Road, Falls Church VA 22042 (F)

CURRIE, S.J., CHARLES L. (Rev.) Jesuit Community, Georgetown University, Washington DC

20057 (EF)

DANNER, DAVID L. 1364, Suite 101, Beverly Road, McLean VA 22101 (F)

DAVIS, ROBERT E. 1793 Rochester Street, Crofton MD 21114 (F)

DEAN, DONNA 367 Mound Builder Loop, Hedgesville WV 25427-7211 (EF)

DEDRICK, ROBERT L. 21 Green Pond Rd, Saranac Lake NY 12983 (EF)

DHARKAR, POORVA 263 Congressional Lane Apt 412, Rockville MD 20852 (M)

DONALDSON, JOHANNA B. 3020 North Edison Street, Arlington VA 22207 (EF)

DOYLE, ELIZABETH K 6705B Overton Circle Apt. 16 Frederick MD 21703 (M)

DURRANI, SAJJAD 17513 Lafayette Dr, Olney MD 20832 (EF)

EDINGER, STANLEY EVAN Apt #1016, 5801 Nicholson Lane, North Bethesda MD 20852 (EM)

EGENREIDER, JAMES A. 1615 North Cleveland Street, Arlington VA 22201 (LF)

EPHRATH, ARYE R. 5467 Ashleigh Rd., Fairfax VA 22030 (M)

ERICKSON, TERRELL A. 4806 Cherokee St., College Park MD 20740-1865 (M)

ETTER, PAUL C. 8612 Wintergreen Court, Unit 304, Odenton MD 21113 (F)

FASANELLI, FLORENCE 4711 Davenport Street, Washington DC 20016 (EF)

FASOLKA, MICHAEL J. NIST Material Measurement Laboratory, MS8300, 100 Bureau Dr.,

Gaithersburg MD 20809 (F)

FAULKNER, JOSEPH A. 2 Bay Drive, Lewes DE 19958 (EF)

FILLIBEN, JAMES JOHN NIST, 100 Bureau Dr., Stop 8980, Gaithersburg MD 20899-8980 (F)

FRASER, GERALD 5811 Cromwell Drive, Bethesda MD 20816 (M)

FREEMAN, ERNEST R. 5357 Strathmore Avenue, Kensington MD 20895-1160 (LEF)

FREHILL, LISA 1239 Vermont Ave NW #204, Washington DC 20005-3643 (M)

FROST, HOLLY C. 5740 Crownleigh Court, Burke VA 22015 (F)

GAGE, DOUGLAS W. XPM Technologies, 1020 N. Quincy Street, Apt 116, Arlington VA 22201-

4637 (M)

GARFINKEL, SIMSON L. 1186 N Utah Street, Arlington VA 22201 (M)

GAUNAURD, GUILLERMO C 4807 Macon Road, Rockville MD 20852-2348 (EF)

GHARAVI, HAMID National Institute of Standards and Technology (NIST), MS 8920,

Gaithersburg MD 20899-8920 (F)

GIBBON, JOROME 311 Pennsylvania Avenue, Falls Church VA 22046 (PF)

GIFFORD, PROSSER 59 Penzance Rd, Woods Hole MA 02543-1043 (F)

GLUCKMAN, ALBERT G. 18123 Homeland Drive, Olney MD 20832-1792 (EF)

GRAY, JOHN E. PO Box 489, Dahlgren VA 22448-0489 (M)

GRAY, MARY (Professor) Department of Mathematics, Statistics, and Computer Science,

American University, 4400 Massachusetts Avenue NW, Washington DC 20016-8050 (F)

GUIDOTTI, TEE L 2347 Ashmead Pl., NW, Washington DC 20009-1413 (M)

HACK, HARVEY 176, Via Dante, Arnold MD 21012-1315 (F)

HAIG, SJ, FRANK R. (Rev.) Loyola University Maryland, 4501 North Charles St, Baltimore MD

21210-2699 (EF)

HARDIS, JONATHAN E. 356 Chestertown St., Gaithersburg MD 20878-5724 (F)

HAYNES, ELIZABETH D. 7418 Spring Village Dr., Apt. CS 422, Springfield VA 22150-4931

(EM)

HAZAN, PAUL 14528 Chesterfield Rd, Rockville MD 20853 (F)

HEANEY, JAMES B. 6 Olivewood Ct, Greenbelt MD 20770 (M)

HIETALA, RONALD 6351 Waterway Drive, Falls Church VA 22044-1322 (M)

HOFFELD, J. TERRELL 11307 Ashley Drive, Rockville MD 20852-2403 (F)

HOLLAND, PH.D., MARK A. 201 Oakdale Rd., Salisbury MD 21801 (M)

HONIG, JOHN G. 7701 Glenmore Spring Way, Bethesda MD 20817 (LF)

HORLICK, JEFFREY 8 Duvall Lane, Gaithersburg MD 20877-1838 (F)

HORN, JOANNE 1408 Grouse Court, 118 N. Market Street, Suite 201 Frederick, MD 21701,

Frederick MD 21703 (M)

HOWARD, SETHANNE Apt 311, 7570 Monarch Mills Way, Columbia MD 21046 (LF)

HOWARD-PEEBLES, PATRICIA 5701 Virginia Parkway 2312, McKinney TX 75071 (EF)

IKOSSI, KIKI 6275 Gentle LN, Alexandria VA 22310 (F)

IZADJOO, MEISAM 13137 Clarksburg Square Road, Clarksburg, MD 20871 (M)

IZADJOO, MINA 15713 Thistlebridge Drive, Rockville MD 20853 (F)

JOHNSON, EDGAR M. 1384 Mission San Carlos Drive, Amelia Island FL 32034 (LF)

JOHNSON, GEORGE P. 3614 34th Street, N.W., Washington DC 20008 (EF)

JOHNSON, JEAN M. 3614 34th Street, N.W., Washington DC 20008 (EF)

JONG, SHUNG-CHANG 8892 Whitechurch Ct, Bristow VA 20136 (LF)

KAHN, ROBERT E. 909 Lynton Place, Mclean VA 22102 (F)

KAPETANAKOS, C.A. 4431 MacArthur Blvd, Washington DC 20007 (EF)

KARAM, LISA 8105 Plum Creek Drive, Gaithersburg MD 20882-4446 (F)

KEISER, BERNHARD E. 2046 Carrhill Road, Vienna VA 22181-2917 (LF)

KLINGSBERG, CYRUS Apt. L184, 500 E. Marylyn Ave, State College PA 16801-6225 (EF)

KLOPFENSTEIN, REX C. 4224 Worcester Dr., Fairfax VA 22032-1140 (LF)

KOWTHA, VIJAYANAND 8009 Craddock Road, Greenbelt MD 20770 (F)

KRUEGER, GERALD P. Krueger Ergonomics Consultants, 4105 Komes Court, Alexandria VA

22306-1252 (EF)

LABOV, JAY B. Keck Center Room 638, 500 Fifth Street, NW, Washington DC 20001 (F)

LAWSON, ROGER H. 10613 Steamboat Landing, Columbia MD 21044 (EF)

LEIBOWITZ, LAWRENCE M. 2905 Saintsbury Place, #217, Fairfax VA 22031-1164 (LF)

LEMKIN, PETER 148 Keeneland Circle, North Potomac MD 20878 (EM)

LESHUK, RICHARD 9004 Paddock Lane, Potomac MD 20854 (M)

LEWIS, DAVID C. 27 Bolling Circle, Palmyra VA 22963 (F)

LIBELO, LOUIS F. 9413 Bulls Run Parkway, Bethesda MD 20817 (LF)

LIDDLE, J ALEXANDER NIST, MS 6203, 100 Bureau Drive, Gaithersburg MD 20899-6200 (F)

LOCASCIO, LAURIE E National Institute of Standards and Technology, MS 1000, Gaithersburg

MD 20899 (F)

LONDON, MARILYN 3520 Nimitz Rd, Kensington MD 20895 (F)

LONGSTRETH, III, WALLACE I 8709 Humming Bird Court, Laurel MD 20723-1254 (EM)

LOOMIS, TOM H. W. 11502 Allview Dr., Beltsville MD 20705 (EM)

LOZIER, DANIEL W 5230 Sherier Place NW, Washington DC 20016 (F)

LUTZ, ROBERT J. 6031 Willow Glen Dr, Wilmington NC 28412 (EF)

LYONS, JOHN W. 7430 Woodville Road, Mt. Airy MD 21771 (EF)

MADHAVAN, GURUPRASAD 440 L St NW, Unit 1111, Washington DC 20001 (F)

MALCOM, SHIRLEY M. 12901 Wexford Park, Clarksville MD 21029-1401 (F)

MANDERSCHEID, RONALD W. 10837 Admirals Way, Potomac MD 20854-1232 (LF)

MANI, MAHESH 210 Summit Hall Rd, Gaithersburg MD 20877 (M)

MATHER, JOHN 3400 Rosemary Lane, Hyattsville MD 20782 (F)

MCGRATTAN, KEVIN B. 11512 Brandy Hall Lane, Gaithersburg MD 20878 (F)

MCNEELY, CONNIE L. School of Public Policy, George Mason University, 3351 Fairfax Dr Stop

3B1, Arlington VA 22201 (M)

MENZER, ROBERT E. 90 Highpoint Dr, Gulf Breeze FL 32561-4014 (EF)

MESSINA, CARLA G. 9800 Marquette Drive, Bethesda MD 20817 (EF)

METAILIE, GEORGES C. 18 Rue Liancourt, 75014 Paris , FRANCE (F)

MIGLER, KALMAN B. NIST, 100 Bureau Drive, Stop 8542, Gaithersburg, MD 20899 (F)

MILLER, JAY H. 8924 Ridge Place, Bethesda MD 20817-3364 (M)

MILLER H, ROBERT D. The Catholic University of America, 10918 Dresden Drive, Beltsville

MD 20705 (M)

MORRIS, JOSEPH PO Box 3005, Oakton VA 22124-9005 (M)

MORRIS, P.E., ALAN 4550 N. Park Ave. #104, Chevy Chase MD 20815 (EF)

MOUNTAIN, RAYMOND D. 701 King Farm Blvd #327, Rockville MD 20850 (F)

MUELLER, TROY J. 42476 Londontown Terrace, South Riding Va 20152 (M)

MUMMA, MICHAEL J. 210 Glen Oban Drive, Arnold MD 21012 (F)

MURDOCH, WALLACE P. 65 Magaw Avenue, Carlisle PA 17015 (EF)

NEUBAUER, WERNER G. Apt 349, 7820 Walking Horse Circle, Germantown TN 38138 (EF)

NOE, ADRIANNE 9504 Colesville Road, Silver Spring MD 20901 (F)

O'HARE, JOHN J. 108 Rutland Blvd, West Palm Beach FL 33405-5057 (EF)

OHRINGER, LEE 5014 Rodman Road, Bethesda MD 20816 (EF)

OTT, WILLIAM R 19125 N. Pike Creek Place, Montgomery Village MD 20886 (EF)

PARR, ALBERT C 2656 SW Eastwood Avenue, Gresham OR 97080-9477 (F)

PAULONIS, JOHN J P.O. Box 703, Mohegan Lake NY 10547 (M)

PAZ, ELVIRA L. 172 Cook Hill Road, Wallingford CT 06492 (LEF)

PERSILY, ANDREW K NIST, Mailstop 8630, 100 Bureau Drive, Gaithersburg MD 20899 (F)

PICKHOLTZ, RAYMOND L. 3613 Glenbrook Road, Fairfax VA 22031-3210 (EF)

PLESNIAK, MICHAEL W. 1400 Laurel Dr., Accokeek MD 20607 (F)

POLAVARAPU, MURTY 10416 Hunter Ridge Dr., Oakton VA 22124 (LF)

POLINSKI, ROMUALD Prof, Doctor of Sciences (Economics), Ul. Generala Bora 39/87, 03-982

WARSZAWA 131 , Poland (M)

PRZYTYCKI, JOZEF H. (Prof.) 10005 Broad St, Bethesda MD 20814 (F)

PYKE, JR, THOMAS N. 4887 N. 35th Road, Arlington VA 22207 (EF)

RANSOM, BARBARA 3117 8th North, Arlington VA 22201 (M)

REGLI, WILLIAM Department of Computer Science, Institute for Systems Research, Clark School

of Engineering, 2173 A.V. Williams Building, 8223 Paint Branch Drive, University of

Maryland, College Park MD 20742 (F)

REISCHAUER, ROBERT 5509 Mohican Rd, Bethesda MD 20816 (EF)

RICKER, RICHARD 12809 Talley Ln, Darnestown MD 20878-6108 (F)

RIDGELL, MARY P.O. Box 133, 48073 Mattapany Road, St. Mary's City MD 20686-0133 (LM)

ROBERTS, SUSAN Ocean Studies Board, Keck 607, National Research Council, 500 Fifth Street,

NW, Washington DC 20001 (F)

ROGERS, KENNETH 355 Fellowship Circle, Gaithersburg MD 20877 (LM)

ROOD, SALLY A PO Box 12093, Arlington VA 22219 (F)

ROSENBLATT, JOAN R. 701 King Farm Blvd, Apt 630, Rockville MD 20850 (EF)

SANDERS, JAY 7850 Westmont Lane, McLean VA 22102 (F)

SAUBERMAN, P.E., HARRY R 8810 Sandy Ridge Ct., Fairfax VA 22031 (M)

SCHMEIDLER, NEAL F. 7218 Hadlow Drive, Springfield VA 22152 (F)

SELKIRK, WILLIAM 2423 Wynfield Ct, Frederick MD 21702 (M)

SENKEVITCH, EMILEE 1015 Columbine Drive, Apt 2B, Frederick MD 21701 (M)

SERPAN, CHARLES Z. 5510 Bradley Blvd, Bethesda MD 20814 (EM)

SEVERINSKY, ALEX J. 4707 Foxhall Cres NW, Washington DC 20007-1064 (EM)

SHAFRIN, ELAINE G. 8100 Connecticut Ave NW Apt 1014, Washington DC 20815-2817 (EF)

SHROPSHIRE, JR, W. Apt. 426, 300 Westminster Canterbury Dr., Winchester VA 22603 (LF)

SIMMS, JAMES ROBERT (Mr.) 9405 Elizabeth Ct., Fulton MD 20759 (M)

SLUZKI, CARLOS 5302 Sherier Pl NW, Washington DC 20016 (F)

SMITH, THOMAS E 3148 Gracefield Rd Apt 215, Silver Spring MD 20904-5863 (LF)

SNIECKUS, MARY 1700, Dublin Dr., Silver Spring MD 20902 (F)

SODERBERG, DAVID L. 403 West Side Dr. Apt. 102, Gaithersburg MD 20878 (EM)

SOLAND, RICHARD M. 2516 Arizona Av Apt 6, Santa Monica CA 90404-1426 (LF)

SPARGO, WILLIAM J. 9610 Cedar Lane, Bethesda MD 20814 (F)

STAVELEY, JUDY 880 Laval Drive, Sykesville MD 21784 (M)

STERN, KURT H. 103 Grant Avenue, Takoma Park MD 20912-4328 (EF)

STIEF, LOUIS J. 332 N St., SW., Washington DC 20024-2904 (EF)

STILES, MARK D. 11506 Taber Street, Silver Spring MD 20902 (F)

STOMBLER, ROBIN Auburn Health Strategies, 3519 South Four Mile Run Dr., Arlington VA

22206 (M)

SUBRAHMANIAN, ESWARAN 4740 Connecticut Avenue, Apt #815, Washington DC 20008

(LM)

TEICH, ALBERT H. PO Box 309, Garrett Park MD 20896 (EF)

THEOFANOS, MARY FRANCES 7241 Antares Drive, Gaithersburg MD 20879 (M)

THOMPSON, CHRISTIAN F. 278 Palm Island Way, Ponte Vedra FL 32081 (LF)

TIMASHEV, SVIATOSLAV (SLAVA) A. 3306 Potterton Dr., Falls Church VA 22044-1603 (F)

TORAIN HI, DAVID S 1313 Summerfield Drive, Herndon VA 20170 (M)

TOUWAIDE, ALAIN Botany Center, The Huntington, 1151 Oxford Road, San Marino CA 91108

(LF)

TROXLER, G.W. PO Box 1144, Chincoteague VA 23336-9144 (F)

UBELAKER, DOUGLAS H. Dept. of Anthropology, National Museum of Natural History,

Smithsonian Institution, Washington DC 20560-0112 (F)

UMPLEBY, STUART (Professor) Apt 1207, 4141 N Henderson Rd, Arlington VA 22203 (F)

VARADI, PETER F. Apartment 1606W, 4620 North Park Avenue, Chevy Chase MD 20815-7507

(EF)

VAVRICK, DANIEL J. 10314 Kupperton Court, Fredricksburg VA 22408 (F)

VOAS, JEFFREY 8210 Crestwood Heights Drive, Apartment 720, McLean VA 22102 (M)

VOORHEES, ELLEN 100 Bureau Dr., Stop 8940, Gaithersburg MD 20899-8940 (F)

WALDMANN, THOMAS A. 3910 Rickover Road, Silver Spring MD 20902 (F)

WANG, Y. CLAIRE 140 Charles Street, Apt 22D, New York NY 10014 (M)

WEBB, RALPH E. 21-P Ridge Road, Greenbelt MD 20770 (EF)

WEISS, ARMAND B. 6516 Truman Lane, Falls Church VA 22043 (LF)

WERGIN, WILLIAM P. 1 Arch Place #322, Gaithersburg MD 20878 (EF)

WHITE, CARTER 12160 Forest Hill Rd, Waynesboro PA 17268 (EF)

WIESE, WOLFGANG L. 8229 Stone Trail Drive, Bethesda MD 20817 (EF)

WILLIAMS, CARL 2272 Dunster Lane, Potomac MD 29854 (F)

WILLIAMS, E. EUGENE Dept. of Biological Sciences, Salisbury University, 1101 Camden Ave,

Salisbury MD 21801 (M)

WILLIAMS, JACK 6022 Hardwick Place, Falls Church VA 22041 (F)

Delegates to the Washington Academy of Sciences

Representing Affiliated Scientific Societies

Acoustical Society of America

American/International Association of Dental Research

American Assoc. of Physics Teachers, Chesapeake

Section

American Astronomical Society

American Fisheries Society

American Institute of Aeronautics and Astronautics

American Institute of Mining, Metallurgy & Exploration

American Meteorological Society

American Nuclear Society

American Phytopathological Society

American Society for Cybernetics

American Society for Microbiology

American Society of Civil Engineers

American Society of Mechanical Engineers

American Society of Plant Physiology

Anthropological Society of Washington

ASM International

Association for Women in Science

Association for Computing Machinery

Association for Science, Technology, and Innovation

Association of Information Technology Professionals

Biological Society of Washington

Botanical Society of Washington

Capital Area Food Protection Association

Chemical Society of Washington

District of Columbia Institute of Chemists

District of Columbia Psychology Association

Eastern Sociological Society

Electrochemical Society

Entomological Society of Washington

Geological Society of Washington

Historical Society of Washington DC

Human Factors and Ergonomics Society

(continued on next page)

Paul Arveson

J. Terrell Hoffeld

Frank R. Haig, S. J.

Sethanne Howard

Lee Benaka

David W. Brandt

E. Lee Bray

Vacant

Charles Martin

Vacant

Stuart Umpleby

Vacant

Daniel J. Vavrick

Mark Holland

Vacant

Toni Marechaux

Jodi Wesemann

Vacant

F. Douglas

Witherspoon

Vacant

Chris Puttock

Keith Lempel

Vacant

Ronald W.

Mandersheid

Vacant

Jurate Landwehr

Vacant

Gerald Krueger

Delegates to the Washington Academy of Sciences

Representing Affiliated Scientific Societies

(continued from previous page)

Institute of Electrical and Electronics Engineers, Washington

Section

Institute of Food Technologies, Washington DC Section

Institute of Industrial Engineers, National Capital Chapter

International Association for Dental Research, American

Section

International Society for the Systems Sciences

International Society of Automation, Baltimore Washington

Section

Instrument Society of America

Marine Technology Society

Maryland Native Plant Society

Mathematical Association of America, Maryland-District of

Columbia-Virginia Section

Medical Society of the District of Columbia

National Capital Area Skeptics

National Capital Astronomers

National Geographic Society

Optical Society of America, National Capital Section

Pest Science Society of America

Philosophical Society of Washington

Society for Experimental Biology and Medicine

Society of American Foresters, National Capital Society

Society of American Military Engineers, Washington DC

Post

Society of Manufacturing Engineers, Washington DC

Chapter

Society of Mining, Metallurgy, and Exploration, Inc.,

Washington DC Section

Soil and Water Conservation Society, National Capital

Chapter

Technology Transfer Society, Washington Area Chapter

Virginia Native Plant Society, Potowmack Chapter

Washington DC Chapter of the Institute for Operations

Research and the Management Sciences (WINFORMS)

Washington Evolutionary Systems Society

Washington History of Science Club

Washington Paint Technology Group

Washington Society of Engineers

Washington Society for the History of Medicine

Washington Statistical Society

World Future Society, National Capital Region Chapter

Richard Hill

Taylor Wallace

Neal F. Schmeidler

Christopher Fox

Vacant

Richard

Sommerfield

Hank Hegner

Jake Sobin

Vacant

John Hamman

Julian Craig

Vacant

Jay H. Miller

Vacant

Jim Heaney

Vacant

Larry S. Millstein

Vacant

Marilyn Buford

Vacant

E. Lee Bray

Erika Larsen

Richard Leshuk

Alan Ford

Meagan Pitluck-

Schmitt

Vacant

Albert G. Gluckman

Vacant

Alvin Reiner

Alain Touwaide

Michael P. Cohen

Jim Honig