
---+ Ducky Thesis Proposal Notes
%TOC%

---++ Problem Statement
@@@ A clear statement of the problem and the research question.

The difference in productivity between programmers is very large (cite @@@).

We want to investigate how the work practices of highly productive programmers differ from those of less-productive programmers.  To do so, we will:
   * Recruit test subjects from students in computer science classes where all students tackle the same assignments.
   * Have the students install logging software.
   * Log interactions that developers have with a Java integrated development environment called Eclipse.  
   * Have the students submit the logs with the submitted assignments.
   * Have the instructor deliver the logs, the submissions, and the grade for the coding portion of the assignment.
   * Assign a score to each submission based on both mechanically derived metrics (like how tangled@@@ the code is or how many unit tests it passed) and the grade.
   * Use data mining techniques to look for patterns in the data that correlate with the quality of the submissions.
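
The scoring and correlation steps above could be prototyped roughly as follows. This is only a sketch under stated assumptions: the weights, the choice of z-score normalization, the function names, and the example log feature (debugger launches per student) are all hypothetical placeholders, not decisions made in these notes, and the sample numbers are invented purely to exercise the code.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        return [0.0] * len(values)
    return [(v - m) / s for v in values]


def composite_scores(grades, tests_passed, tangling, weights=(0.5, 0.3, 0.2)):
    """Combine the grade and two mechanical metrics into one score per submission.

    The weights are an assumption for illustration. Tangling counts against
    the score, so its standardized value is subtracted.
    """
    w_grade, w_tests, w_tangle = weights
    zg, zt, zk = zscores(grades), zscores(tests_passed), zscores(tangling)
    return [w_grade * g + w_tests * t - w_tangle * k
            for g, t, k in zip(zg, zt, zk)]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


if __name__ == "__main__":
    # Invented sample data for three submissions.
    scores = composite_scores(grades=[70, 80, 90],
                              tests_passed=[5, 7, 9],
                              tangling=[3, 2, 1])
    # Hypothetical feature extracted from the Eclipse logs.
    debugger_launches = [12, 4, 9]
    print(pearson(debugger_launches, scores))
```

A real analysis would need many more log-derived features and a proper data mining toolkit; the point here is only that every submission gets one number, and each candidate work-practice feature is then compared against it.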





---++ Literature Review
@@@ A presentation of the relevant literature and the theoretical framework.

---++ Proposed data-gathering methods
@@@ A description of the research design and instruments and data gathering methods. 

---++ Proposed analysis methods
@@@ An outline of the plan for data analysis and the rationale for the level and method chosen, applicable statistical tests and computer programs.

---
---+ Unsorted junk
   * [[http://pages.cpsc.ucalgary.ca/%7Esillito][Jonathan's paper]]

---++ Publishable papers
   * time spent vs. grade vs. metrics -- whole boatload of papers possible from that!

---++ How to evaluate
   * Grades
   * [[http://findbugs.sourceforge.net/][FindBugs]]
   * Software metrics
      * [[http://portal.acm.org/citation.cfm?id=800091.802959][Third time charm: Stronger prediction of programmer performance by software complexity metrics]], 1979.  Evidence that software complexity metrics developed by Halstead and McCabe are related to the difficulty of finding bugs in the code.
      * [[http://metrics.sourceforge.net/][Eclipse metrics]]
      * Vocabulary: afferent coupling, cohesion
      * [[http://portal.acm.org/citation.cfm?id=170867&dl=ACM&coll=portal&CFID=11111111&CFTOKEN=2222222][impact of group and individ factors on prog productivity]]
      * [[http://www.cs.ucl.ac.uk/staff/A.Finkelstein/fose/finalfenton.pdf][Software Metrics: Roadmap]]
      * [[http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=544349][space agency metrics]]
      * [[http://portal.acm.org/citation.cfm?id=100385&dl=ACM&coll=GUIDE&CFID=11111111&CFTOKEN=2222222][taxonomy for programming style]] (probably not useful)
      * Table 2 lists the metrics evaluated in the study, including a short description and a reference to the definition of the metric. All of the metrics are proposed by Chidamber and Kemerer [CK94] or by Lorenz & Kidd [LK94] (?). However, we rule out some of the proposed metrics because they received serious critique in the literature (LCOM and RFC [CK94]), because the definition isn’t clear (MCX, CCO, CCP, CRE [LK94]; LCOM [CK94, EDL98]), because the lack of static typing in Smalltalk prohibits the computation of the metric (CBO [CK94]), because the metric is too similar to another metric included in the list (NIM, NCM and PIM in [LK94] resemble WMC-NOM in [CK94]), or simply because the metric is deemed inappropriate (NAC, SIX, MUI, FFU, FOC, CLM, PCM, PRC [LK94]).

   * [[http://video.google.com/videoplay?docid=3198706649408822425][Vik's talk]] -- uses Timed Markov Models
   * http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=295895
   * [[http://prog.vub.ac.be/research/FFSE/Publications/MensDemeyer2001-evolmetrics.pdf][evolution metrics]]
   * http://charm.cs.uiuc.edu/papers/ProductivityPPHECatHPCA04.pdf
   * [[http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1232465][pair programming]]
   * [[http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6887][critique of cyclomatic metric]]
   * [[http://domino.research.ibm.com/tchjr/journalindex.nsf/0/b1678f986bfdd00d85256bfa00685aee?OpenDocument][IBM thing]]
   * [[http://ieeexplore.ieee.org/iel4/5594/15021/00685594.pdf?arnumber=685594][measuring team productivity]]

---++ Follow-ons
   * early students vs. later students
   * students vs. professionals
   * solo vs. pair programming
   * Eclipse vs. other IDEs
   * Java vs. other languages

---++ Tools needed
   * Logging sw
   * visualization sw
      * something that replays the session
      * Mylog
   * data mining sw 
   * something that checks that the trace is complete -- replays the session and makes sure that replaying the trace creates the handin
   * sw for doing acceptance tests on traces
   * some tool/mechanism for organizing/collecting all the user data
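
The trace-completeness check in the list above could work roughly like this. This is a minimal sketch, assuming a hypothetical trace format in which each logged event is an =(offset, removed_length, inserted_text)= edit to a single file; the actual logging software may record something quite different, and a real check would have to handle multiple files.

```python
def replay(events, initial=""):
    """Apply logged edit events in order and return the resulting text.

    Each event is (offset, removed_length, inserted_text). This format is an
    assumption for illustration, not the format of any particular logger.
    """
    text = initial
    for offset, removed, inserted in events:
        if offset < 0 or removed < 0 or offset + removed > len(text):
            raise ValueError("event does not fit the text replayed so far")
        text = text[:offset] + inserted + text[offset + removed:]
    return text


def trace_is_complete(events, handin, initial=""):
    """Return True if replaying the trace reproduces the submitted file."""
    try:
        return replay(events, initial) == handin
    except ValueError:
        # An event that cannot be applied suggests earlier events are missing.
        return False


if __name__ == "__main__":
    trace = [(0, 0, "int x;"), (4, 1, "y")]  # type "int x;", then rename x to y
    print(trace_is_complete(trace, "int y;"))
    print(trace_is_complete(trace[1:], "int y;"))
```

A trace that fails this check points to logging gaps, for example editing done outside the IDE, so it could be flagged before it reaches the data mining step.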


---++ Need academic ref
   * [[http://www.joelonsoftware.com/articles/HighNotes.html][reference to Stanley Eisenstat, Yale CS 323]] time vs. outcome studies


---++ Interesting references for me to chase down
   * Cross, E. The behavioral styles of computer programmers. in Proc 8th Annual SIGCPR Conference. 1970. Maryland, WA, USA.

   * Mayer, D.B. and A.W. Stalnaker. Selection and Evaluation of Computer Personnel – the Research History of SIG/CPR. in Proc 1968 23rd ACM National Conference. 1968. Las Vegas, NV, USA.

   * Michael McCracken, Vicki Almstrum, Danny Diaz, Mark Guzdial, Dianne Hagan, Yifat Ben-David Kolikant, Cary Laxer, Lynda Thomas, Ian Utting, and Tadeusz Wilusz. A multinational, multi-institutional study of assessment of programming skills of first-year CS students. In Working group reports from ITiCSE on Innovation and technology in computer science education, Canterbury, UK, 2001. ACM Press.

   * B Adelson and E Soloway. The role of domain experience in software design. IEEE Transactions on Software Engineering, 11(November):1351–1360, 1985.
   * Jeffrey Bonar and Elliot Soloway. Uncovering principles of novice programming. In 10th ACM POPL, pages 10–13, 1983.

Also other references from [[http://www.cs.mdx.ac.uk/research/PhDArea/saeed/paper1.pdf][The Camel Has Two Humps]] and [[http://www.cs.mdx.ac.uk/research/PhDArea/saeed/S_Dehnadi_ppij-2006__2.pdf][Testing Programming Aptitude]]

 [[http://72.14.253.104/search?q=cache:5sZH4p3h9h8J:www.cis.strath.ac.uk/~linxiao/TechReport2006.doc+students+who+had+a+consistent+model+did+better+than+inconsistent+model+even+when+wrong&hl=en&gl=ca&ct=clnk&cd=9&client=firefox][follow-on to the camel]]
Topic revision: r13 - 2006-11-08 - TWikiGuest