One thing above all has shaped both my desire to major in computer science and my decision to continue into graduate school: my job as a teaching assistant. Over the past three years, my role has included not only assisting in the laboratory, but also holding office hours, supervising up to 12 other undergraduate teaching assistants, grading programs, and maintaining class web pages. This experience has been vital to me in two ways. First, it helped me recognize my talent for computer science: I was able to help many people, including some who had studied computer science longer than I had, and the confidence I gained convinced me to major in computer science and then continue on to graduate school. Second, being a TA made me realize how many improvements remain to be made. I knew enough, and had enough ideas, to improve the way people use computers, and that spurred me on toward graduate school.

As a TA, I have noticed that even people who understand the concepts are often unable to find the information they need. I would therefore like to improve information retrieval so that the vast stores of available data can be used to their fullest potential. One problem with current information systems is that users must know the specific terminology to get decent results from their queries. I would like to extend research in information retrieval, specifically in full-text search, to allow efficient storage of synonyms. The tradeoff between the slow and unbalanced results of query expansion and the bulky index produced by storing all synonyms needs to be explored further. In addition, I would like to make it possible for people to specify which parts of a document should count more heavily. For example, in HTML, text enclosed in a title or heading tag should lend more weight to the probability that the page is relevant to a topic. I would like this to work not only for one specific document structure, but also to let people declare which segments they expect to see in the documents they are indexing and how heavily each section should be weighed. I propose to begin by studying current information retrieval technologies, particularly Bayesian inference networks.

I have had several opportunities to do research. The first came the summer after my sophomore year, when I worked with Dr. Sun at Duke University through the CRA Distributed Mentor Project. There I installed LAPACK, a linear algebra package, on three different platforms and ran benchmarking tests on each machine after installation. I completed most of that research on my own, because Dr. Sun had to spend most of the summer in Tennessee. Although we could still communicate via e-mail, it was far less contact than I would have had in person. I found all of the files, worked out the problems with the installation, ran all of the tests, and wrote up the findings; she was available to answer questions when I had them, but I found most of the answers myself.

Last summer, while working at Microsoft Corporation, I was given the task of making it easier for people to get help on topics related to Office 97. The goal was to let people access the Knowledge Base (a database of the bugs and fixes for Microsoft products) through a natural language interface, so that they could use the same interface for looking up information on the World Wide Web that they used within Office 97. During the first few weeks of the project, I came up to speed on the prevalent ideas in information retrieval.
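The section-weighting idea described above could be sketched roughly as follows. This is only a hypothetical illustration in Python; the section names and weights are my own assumptions, not part of any system mentioned in this statement:

```python
# Hypothetical sketch: counting a query term in a document, where
# occurrences in user-declared sections count more heavily.
# Section names and weights here are illustrative assumptions.

SECTION_WEIGHTS = {"title": 3.0, "heading": 2.0, "body": 1.0}

def weighted_term_count(sections, term):
    """sections: list of (section_name, text) pairs for one document."""
    total = 0.0
    for name, text in sections:
        weight = SECTION_WEIGHTS.get(name, 1.0)  # unknown sections get weight 1
        total += weight * text.lower().split().count(term.lower())
    return total

doc = [
    ("title", "Information Retrieval Basics"),
    ("heading", "Indexing and retrieval"),
    ("body", "Full text retrieval builds an index over every word."),
]
print(weighted_term_count(doc, "retrieval"))  # 3.0 + 2.0 + 1.0 = 6.0
```

In a real system these weighted counts would feed into the probability estimates of an inference network rather than being used directly, but the sketch shows how a user-supplied weighting of document segments could shape a relevance score.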
I read a number of papers on Bayesian inference networks and then applied what I had learned to my project. The bulk of the project was based on internal Microsoft ideas, which I extended to make the system work smoothly; most of my own contributions addressed how to tabulate the probabilities of the different keywords in order to build the index. All of this work was done independently, including the method of assigning probabilities to keywords. My project should be on the World Wide Web shortly after Office 97 reaches the shelves, and my methods are being considered for use in other projects. I am currently involved in an independent study spanning this semester and next. My goal is to create a better system for getting help from both man pages and texinfo sources. Currently there is little full-text search involved in man page lookups: when a user types "man -k", the man program does not search all of the information stored in the man pages; it searches only a one-line synopsis of each program. This makes it difficult for a user to find the desired information, because a one-line synopsis cannot possibly cover everything a program can do; summarizing all of the common tasks performed with "awk" or "sed" in one line would be next to impossible. The lack of synonym matching makes these limitations even more stifling. To improve on this, I am surveying the current tools, such as Tkman and the GNU texinfo reader, and I will then integrate the ideas I find with my own. I am also teaching myself Java, since I abhor how wildly the different information systems vary across platforms. During my graduate career, I would like to continue to improve on current research in information retrieval, focusing in particular on the lack of synonym matching and on the unwieldy interfaces of most information retrieval systems.
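The man-page limitation described above can be illustrated with a small sketch: matching only one-line synopses misses pages whose bodies contain the answer, while a synonym table widens what a single query term can match. The entries and synonym sets below are illustrative assumptions, not the behavior of any actual man implementation:

```python
# Hypothetical sketch: full-text search with a simple synonym table,
# contrasted with "man -k"-style matching against one-line synopses only.
# The pages and synonym sets here are illustrative assumptions.

SYNONYMS = {"delete": {"delete", "remove", "erase"}}

PAGES = {
    "rm": {"synopsis": "remove files or directories",
           "body": "rm removes each specified file."},
    "sed": {"synopsis": "stream editor",
            "body": "sed can delete lines matching a pattern."},
}

def search(query, full_text=False):
    terms = SYNONYMS.get(query, {query})  # expand the query via the synonym table
    hits = []
    for name, page in PAGES.items():
        text = page["synopsis"]
        if full_text:
            text += " " + page["body"]  # also search the page body
        if any(t in text.lower().split() for t in terms):
            hits.append(name)
    return sorted(hits)

print(search("delete"))                  # synopsis-only: ['rm']
print(search("delete", full_text=True))  # full text: ['rm', 'sed']
```

Even in this toy form, the synopsis-only search misses "sed" entirely; indexing the full text and expanding synonyms recovers it, which is the gap the independent study aims to close.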
After graduate school, I plan to work in academia, combining teaching and research. I would like to enable people to get as much use out of their computers as I do, and this is the best way I know to do so.