One thing above all has shaped both my desire to major in computer science and my decision to continue into graduate school: my job as a teaching assistant. Over the past three years, my role has included not only assisting in the laboratory, but also holding office hours, supervising up to 12 other undergraduate teaching assistants, grading programs, and maintaining class web pages. This experience has been vital to me in two ways. First, it helped me recognize my talent for computer science: I was able to help many people, including some who had studied computer science longer than I had, and the confidence I gained convinced me to major in computer science and then continue on to graduate school. Second, being a TA made me realize how many improvements remain to be made. I knew enough, and had enough ideas, to improve the way people use computers, and that spurred me on toward graduate school.

As a TA, I have noticed that even people who understand the concepts are often unable to find the information they need. I would therefore like to improve information retrieval so that the vast stores of available data can be used to their fullest potential. One problem with current information systems is that users must know the specific terminology to get decent results from their queries. I would like to extend research in information retrieval, specifically in full-text search, to allow efficient storage of synonyms. The tradeoff between the slow and unbalanced results of query expansion and the bulky index produced by storing all synonyms needs to be explored further. In addition, I would like to make it possible for people to specify which parts of a document should count more heavily. For example, in HTML, text enclosed in a title or heading tag should lend more weight to the probability that the page is relevant to a topic. I would like this to work not only for one specific document structure, but also to let people declare which segments they expect to see in the documents they are indexing and how heavily each section should be weighed. I propose to begin by studying current information retrieval technologies, particularly Bayesian inference networks.

I have had several opportunities to do research. The first came the summer after my sophomore year, when I worked with Dr. Sun at Duke University through the CRA Distributed Mentor Project. There I installed LAPACK, a linear algebra package, on three different platforms and ran benchmarking tests on each machine after installation. I completed most of that research on my own, because Dr. Sun had to spend most of the summer in Tennessee. Although we could still communicate via e-mail, it was far less contact than I would have had in person. I found all of the files, worked out the problems with the installation, ran all of the tests, and wrote up the findings; she was available to answer questions when I had them, but I found most of the answers myself.

Last summer, while working at Microsoft Corporation, I was given the task of making it easier for people to get help on topics related to Office 97. The goal was to let people access the Knowledge Base (a database of the bugs and fixes for Microsoft products) through a natural language interface, so that they could use the same interface for looking up information on the World Wide Web that they used within Office 97. During the first few weeks of the project, I came up to speed on the prevalent ideas in information retrieval.
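The section-weighting idea described above could be sketched roughly as follows. This is only a hypothetical illustration in Python; the section names and weights are my own assumptions, not part of any system mentioned in this statement:

```python
# Hypothetical sketch: counting a query term in a document, where
# occurrences in user-declared sections count more heavily.
# Section names and weights here are illustrative assumptions.

SECTION_WEIGHTS = {"title": 3.0, "heading": 2.0, "body": 1.0}

def weighted_term_count(sections, term):
    """sections: list of (section_name, text) pairs for one document."""
    total = 0.0
    for name, text in sections:
        weight = SECTION_WEIGHTS.get(name, 1.0)  # unknown sections get weight 1
        total += weight * text.lower().split().count(term.lower())
    return total

doc = [
    ("title", "Information Retrieval Basics"),
    ("heading", "Indexing and retrieval"),
    ("body", "Full text retrieval builds an index over every word."),
]
print(weighted_term_count(doc, "retrieval"))  # 3.0 + 2.0 + 1.0 = 6.0
```

In a real system these weighted counts would feed into the probability estimates of an inference network rather than being used directly, but the sketch shows how a user-supplied weighting of document segments could shape a relevance score.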
I read a number of papers on Bayesian inference networks and then applied what I had learned to my project. The bulk of the project was based on internal Microsoft ideas, which I extended to make the system work smoothly; most of my own contributions addressed how to tabulate the probabilities of the different keywords in order to build the index. All of this work was done independently, including the method of assigning probabilities to keywords. My project should be on the World Wide Web shortly after Office 97 reaches the shelves, and my methods are being considered for use in other projects. I am currently involved in an independent study spanning this semester and next. My goal is to create a better system for getting help from both man pages and texinfo sources. Currently there is little full-text search involved in man page lookups: when a user types "man -k", the man program does not search all of the information stored in the man pages; it searches only a one-line synopsis of each program. This makes it difficult for a user to find the desired information, because a one-line synopsis cannot possibly cover everything a program can do; summarizing all of the common tasks performed with "awk" or "sed" in one line would be next to impossible. The lack of synonym matching makes these limitations even more stifling. To improve on this, I am surveying the current tools, such as Tkman and the GNU texinfo reader, and I will then integrate the ideas I find with my own. I am also teaching myself Java, since I abhor how wildly the different information systems vary across platforms. During my graduate career, I would like to continue to improve on current research in information retrieval, focusing in particular on the lack of synonym matching and on the unwieldy interfaces of most information retrieval systems.
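The man-page limitation described above can be illustrated with a small sketch: matching only one-line synopses misses pages whose bodies contain the answer, while a synonym table widens what a single query term can match. The entries and synonym sets below are illustrative assumptions, not the behavior of any actual man implementation:

```python
# Hypothetical sketch: full-text search with a simple synonym table,
# contrasted with "man -k"-style matching against one-line synopses only.
# The pages and synonym sets here are illustrative assumptions.

SYNONYMS = {"delete": {"delete", "remove", "erase"}}

PAGES = {
    "rm": {"synopsis": "remove files or directories",
           "body": "rm removes each specified file."},
    "sed": {"synopsis": "stream editor",
            "body": "sed can delete lines matching a pattern."},
}

def search(query, full_text=False):
    terms = SYNONYMS.get(query, {query})  # expand the query via the synonym table
    hits = []
    for name, page in PAGES.items():
        text = page["synopsis"]
        if full_text:
            text += " " + page["body"]  # also search the page body
        if any(t in text.lower().split() for t in terms):
            hits.append(name)
    return sorted(hits)

print(search("delete"))                  # synopsis-only: ['rm']
print(search("delete", full_text=True))  # full text: ['rm', 'sed']
```

Even in this toy form, the synopsis-only search misses "sed" entirely; indexing the full text and expanding synonyms recovers it, which is the gap the independent study aims to close.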
After graduate school, I plan to work in academia, combining teaching and research. I would like to enable people to get as much use out of their computers as I do, and this is the best way I know to do so.