---+ Feature Milestones
---++ HAL 1.0
_target: September, 2010_
---+++ Web UI Features
   * Page to add new external target algorithms
   * Page to add new parameter spaces for a given target algorithm (modified from existing spaces)
   * Page to add new problem instances/distributions (in the form of lists of files)
   * Page to specify new execution environments (e.g. cluster config details)
   * Pages to specify & launch included meta-algorithms
   * Ability to view algorithms/instances by problem (instance compatibility) during above specification
   * Page to view summary of all queued, running, and completed jobs
   * Page to browse, view details of, and delete runs/problems/instances/algorithms/environments
   * Dynamic run monitoring analysis pages, including:
      * Plots: Overlaid SCDs for (fixed #) multi-alg, multi inst meta-algs (RTDs for single-inst), SQT for meta-algs where possible, scatter plot for 2-target multi-instance meta-algs, incumbent SCD/RTD for design meta-algs.  *done but being reworked*
      * Descriptive statistics: (mean/sd, quantiles/iqrs) for assessing single-algorithm on an instance dist *done*
      * Statistical tests: Wilcoxon signed rank, Spearman correlation for comparing 2 algs on an instance dist *done*
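For reference, the Spearman correlation listed above can be sketched in plain Java as the Pearson correlation of average ranks. This is only an illustration of the statistic, not HAL's implementation (the backend list below mentions an R interface for statistical tests); the class name =SpearmanSketch= and the sample runtimes are hypothetical.

```java
import java.util.Arrays;

// Hypothetical illustration; not part of the HAL codebase.
final class SpearmanSketch {

    /** Returns 1-based ranks; tied values receive the average of their positions. */
    static double[] ranks(double[] x) {
        int n = x.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) order[i] = i;
        // Sort indices by the values they point at.
        Arrays.sort(order, (a, b) -> Double.compare(x[a], x[b]));

        double[] r = new double[n];
        int i = 0;
        while (i < n) {
            int j = i;
            // Extend j over a run of tied values.
            while (j + 1 < n && x[order[j + 1]] == x[order[i]]) j++;
            double avg = (i + j) / 2.0 + 1.0;
            for (int k = i; k <= j; k++) r[order[k]] = avg;
            i = j + 1;
        }
        return r;
    }

    /** Pearson correlation; NaN if either input has zero variance. */
    static double pearson(double[] a, double[] b) {
        int n = a.length;
        double meanA = 0, meanB = 0;
        for (int i = 0; i < n; i++) { meanA += a[i]; meanB += b[i]; }
        meanA /= n;
        meanB /= n;
        double sab = 0, saa = 0, sbb = 0;
        for (int i = 0; i < n; i++) {
            double da = a[i] - meanA, db = b[i] - meanB;
            sab += da * db;
            saa += da * da;
            sbb += db * db;
        }
        return sab / Math.sqrt(saa * sbb);
    }

    /** Spearman rho for paired per-instance scores of two algorithms. */
    static double spearman(double[] a, double[] b) {
        if (a.length != b.length || a.length < 2) {
            throw new IllegalArgumentException("need two equal-length samples, n >= 2");
        }
        return pearson(ranks(a), ranks(b));
    }

    public static void main(String[] args) {
        // Made-up runtimes of two algorithms on the same five instances.
        double[] algA = {1.2, 3.4, 0.9, 7.5, 2.2};
        double[] algB = {1.0, 4.1, 1.1, 6.8, 2.0};
        System.out.println("Spearman rho = " + spearman(algA, algB));
    }
}
```

The sketch reports only the coefficient; a significance test on top of it would still be delegated to a statistics package.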

---+++ Functionality for meta-algorithm developers
   * Ability to interact with the parameter space of an algorithm (examine domains, conditionalities, etc.) *done*
   * Ability to transform algorithm parameter spaces:  log transforms, discretization *done*
   * Ability to run arbitrary algorithms, including other meta-algorithms, in identical fashion *done*
   * Ability to monitor the trajectories of all output variables of an executed algorithm run, in real time *done*
   * Ability to query database of previous runs directly *done*
   * Ability to access instance features *done*
   * Pre-defined metrics for aggregating performance across runs *done*
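As a rough illustration of the parameter-space transforms mentioned in this list (log transforms and discretization), the following Java sketch maps a positive numeric domain onto a unit log scale and discretizes it into evenly log-spaced values. The class and method names are hypothetical and do not reflect HAL's actual ParameterSpace API.

```java
// Hypothetical illustration; not part of the HAL codebase.
final class ParamTransformSketch {

    private static void checkRange(double lo, double hi) {
        if (!(lo > 0 && hi > lo)) {
            throw new IllegalArgumentException("log transform needs 0 < lo < hi");
        }
    }

    /** Maps v in [lo, hi] to [0, 1] on a log scale. */
    static double toLogUnit(double v, double lo, double hi) {
        checkRange(lo, hi);
        return (Math.log(v) - Math.log(lo)) / (Math.log(hi) - Math.log(lo));
    }

    /** Inverse of toLogUnit: maps u in [0, 1] back to [lo, hi]. */
    static double fromLogUnit(double u, double lo, double hi) {
        checkRange(lo, hi);
        return Math.exp(Math.log(lo) + u * (Math.log(hi) - Math.log(lo)));
    }

    /** Discretizes [lo, hi] into k values evenly spaced in log space. */
    static double[] discretizeLog(double lo, double hi, int k) {
        if (k < 2) throw new IllegalArgumentException("need at least 2 grid points");
        double[] grid = new double[k];
        for (int i = 0; i < k; i++) {
            grid[i] = fromLogUnit(i / (double) (k - 1), lo, hi);
        }
        return grid;
    }

    public static void main(String[] args) {
        // A made-up continuous parameter with domain [0.001, 10].
        for (double v : discretizeLog(0.001, 10.0, 5)) {
            System.out.println(v);
        }
    }
}
```

A configurator searching in the transformed space would sample or step in [0, 1] and call =fromLogUnit= before launching a target run.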

---+++ Backend functionality exposed in above
   * Ability to execute algorithms locally *done*
   * Ability to execute algorithms on a remote host via SSH *needs update re: API changes*
   * Ability to execute algorithms on a SGE cluster *needs update re: object API changes*
   * Ability to actively monitor remotely running algorithms via RPC *needs update re: object API changes*
   * MySQL database storing records of all algorithms, instances, runs, etc. *done*
   * SQLite database fallback if MySQL unavailable *done*
   * R interface for performing statistical tests, etc. *done*

---+++ Meta-Algorithms Included
   * Configuration procedure: ParamILS (external) *in progress*
   * Configuration procedure: ROAR (internal) *done; will need minor updates to work with backend redesign*
   * Analysis procedure: Paired algorithm comparison *in progress*
   * Analysis procedure: Single-algorithm analysis *in progress*

---+++ Distribution Issues
   * Documentation
   * Detection/configuration of external dependencies (cf. UI/execution environment specification)
   * Double-click-to-run universal JAR distribution


---++ HAL 1.1
_target: December, 2010_

---+++ Web UI Features
   * Ability to export complete experiment packages (including algorithms, instances, run instructions)
   * Ability to load and execute an experiment package
   * Ability to "chain" experiments (e.g. design procs. followed by analysis proc comparing incumbents)
   
---+++ Functionality for meta-algorithm developers
   * Random Forest classification + regression models, incl. interface accepting AlgorithmRun objects for training and inference
   * support for feature extraction procedures
   
---+++ Backend functionality
   * Support for TORQUE clusters
   * Support for "bag-of-machines" execution manager

---+++ Meta-Algorithms Included
   * Configuration procedure: ActiveConfigurator (internal)
   * Multi-algorithm comparison
   * SATzilla-like portfolio builder
   * Parallelized AC
   * ParamILS (internal)

---++ HAL 1.x
_target: 2011_
   * libraries of:
      * search/optimization procedures
      * machine learning tools
   * multi-algorithm comparisons
   * scaling analyses
   * bootstrapped analyses
   * robustness analyses
   * parameter response analyses
   * Parallel portfolios in HAL
   * Iterated F-Race in HAL
   * support for optimization/Monte-Carlo experiments
   * support for instance generators
   * support for instance format converters
   * Support text-file inputs and outputs for external algorithms (currently only the command line and stdin/err are supported)
   * array jobs in SGE
   * Wider support for working directory requirements of individual algorithm runs, e.g. Concorde's creation of 20 files with fixed names.


---++ Unprioritized Features
_new feature requests should be initially added here; notify a HAL developer and come to a HAL meeting if you feel your feature must move up the stack quickly_
   * (FH) Support for complete configuration experiment, front to back: run configurator N times on a training set, report the N training and test set performances _CN: can hopefully be implemented as a chained experiment_
   * (FH) Developers of configurators should be able to swap in new versions of a configurator _CN: 
   * (FH) Configuration scenarios, specifying a complete configuration task including the test set; only missing part being the configurator
   * (FH) Saveable sets of configuration scenarios to perform (use case: I change the configurator and want to evaluate it)
   * (FH) Taking this a step further: support for optimizing a parameterized configurator (configurator is an algorithm, and the above set of experiments is the set of "instances") _CN: this is what is being implemented in the ongoing backend redesign_
   * (FH) Submitting runs from a machine that is itself a cluster submit host should not need to go through SSH
   * (CF) Memory usage / CPU time monitoring in HAL of target algorithm runs, in order to report warnings on potential problems (like excessive swapping for example).
   * (HH) Significance-gated analysis / sequential hypothesis testing (see email from HH).
   * (CF) Continued testing to support LAMA-ish difficulties in HAL:
      * Wallclock vs. CPU cutoff options
      * Warnings in the dashboard if target runs or experiments are behaving "strangely"
      * Email notifications sent to users when various events happen
   * (CF) Restricted data/execution/targetalgs for the demo server
   * (CF) Selection of performance metric _before_ selecting the configurator to use. What is the exact problem specification for configuration?
   * (CN) convenience methods in MetaAlgorithm hiding next(), hasNext(), report() from the 3rd-party developer; instead providing an interface like AlgorithmRun fetchRun(Algorithm a), with no InterruptedException; implies an AlgorithmRun class that can adaptively switch between a "queued" and a "running" implementation for before and after the true environment fetchRun(...) call is made/returns.
   * (HH) Service-oriented volunteer computing. See, e.g., "Service-Oriented Volunteer Computing for Massively Parallel Constraint Solving Using Portfolios", Zeynep Kiziltan and Jacopo Mauro, in CPAIOR-2010 proceedings.
   * (KLB) Handle network issues (e.g. loss of connection to datamanager, etc.) robustly.  Restart runs, etc., as required to ensure that the originally-requested job ultimately completes correctly with as little babysitting by the user as possible.
   * (FH) Normalization transform, in addition to existing log transform

---+ Active work items
---++ Frontend
---+++ Release-critical 
   * *CF* algorithm specification screen: implement (includes initial design space specification) _(CF): In Progress_
   * *CF* left side of landing page:  task selection/presentation according to pattern concept _(CF): In Progress_
   * *CF* experiment specification and monitor screens from a pattern template, and procedure-specific requirements, including experiment and incumbent naming
   * *CF* instance specification screen: implement _(CF): In Progress_
   * *CF* Execution environment specification (incl. R, Gnuplot, java locations) _(CF): In Progress_
   * RTDs/per-target-algorithm-run monitoring and navigation
   * design space specification by revision of existing spaces
   * Merge with backend refactor (when done)

---+++ Important
   * Data management interface:
      * deleting runs/expts/etc.
      * data export
   * Error logging/handling/browsing
   * Plotting ex-gnuplot
   * Documentation as a header on most of the experiment pages, paragraph explaining the intention etc.
   * Hiding "advanced" settings, such as configurator-specific settings or other tools, with appropriate defaults.

---++ Backend
---+++ Release-critical
   * *CN* Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces.  Both DB and Java object model; requires Algorithm refactor below. *done*
   * *CN* Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings). *done*
   * *CN* Refactor code to align class hierarchy with terminology of paper _(CN: done for all but meta-algorithm implementations, which are in progress)_
   * *CN* Refactor Algorithm/ParameterSpace/Parameter/Domain structure to allow above *done*
   * *CN* Database schema -- speed-related refactor *done* _(may want further tuning)_
   * *CN* Refactor SSH & RPC execution managers to work under refactor

---+++ Important
   * *CN* Connection pooling *done*
   * Caching analysis results _(CN: in progress as part of meta-alg changes above)_
   * *CN* Query optimization *done* _(may want more depending on real-world observations)_
   * Selective limitation of run-level archiving (dynamic based on runtime?)
   * add incumbentname semantic input to (design) procedures
   * instance features

---+++ Nice-to-have
   * *CN* DataManager API refinement _(in progress as part of DataManager refactor)_
   * *CF* N-way performance comparison
   * Stale connection issue; incl. robustness to general network issues
   * *CN* Read-only DataManager connection for use by individual MA procedures *done*
   * Allowing relationships (incl. possible run-reuse) between different-binary "builds" of algorithms, including due to bugfixes, additional exposed parameters, etc.  Also for different "versions" (without reuse) corresponding to added functionality.
   * Ability to quantify membership of configurations to different design spaces *done*


---++ Application: ActiveConfigurator
---+++ Release Critical
   * *VC* ROAR in Java *in testing*
   * *VC* Calling Matlab from Java *in testing*
   * *CN* parameter transformations (log, discretization, etc.) *done*
   * *VC* SMBO, calling Matlab for model building/evaluation  _(VC: implemented, in testing)_
   * Adapt Weka RF implementation for regression
   * Pure-Java SMBO implementation
   * Merge Java AC with refactored HAL codebase once refactor is completed
   * Adapt standalone Java AC to work as "internal" HAL meta-algorithm


---++ Support/QA/Misc.
---+++ Release Critical
   * unit testing: parameters (domains) *OK*
   * unit testing: parameter spaces *OK*
   * unit testing: algorithms
   * unit testing: execution managers (local, SSH, cluster)
   * unit testing: data managers (SQLite, MySQL)
   * unit testing: meta-algorithms
   * functional testing:  full pipeline
   * Licensing issues (GPL'd components...)

---+++ Important
   * *CN* Git, not CVS *done*
   * *CN* Order+configure new DB server _(CN: waiting for Dave B to make final changeover)_
   * user-facing documentation (help)
   * *CN* Better logging/error-reporting (to console/within HAL). *done* _(for most cases; exceptions are auto-logged)_
   * *CN* *JX* *VC* Basic Windows support *done, in testing*
   * Better handling of overhead runtime vs. target algorithm runtime

---+++ Nice-to-have
   * developer-facing documentation (javadocs) _(in progress in parallel with other work)_

 
---+ Bug Reports
   * (CN) JSC test reliability issue (compared to R)
   * (CN) end-of-experiment hanging bug (GGA, multinode cluster runs)
   * (LX) missing current-time point in solution quality trace, so don't see the final "flat line"
   * (CN) accuracy of mid-run overhead accounting for PILS/GGA
   * (CF) Configuration file callstrings with weird spaces, i.e. "... -param '$val$ blah' ..." where '$val$ blah' needs to be passed to the target as a single argument. _(CN) does this work with double-quotes instead of single-quotes?_
   * (JS) FixedConfigurationExperiment UI is outdated, unusable.
   * (JS) HAL is not usable on WestGrid. We need a TorqueClusterExecutionManager.
   * (JS) Algorithms with a requirement of a new directory for each run.
   * (JS) one of the ExecutionManagers produces unstarted AlgorithmRuns
   * (FH) If a HAL slave process fails to start, the associated expt. status stays on "queued" forever
   * (FH) Database table contention causes locking and high query latency. Likely to be fixed by database changes and use of InnoDB, but I'm reporting it anyway.
   * (CN) DataManager-decorated ExecutionManager still requires explicit commit to save results.  Also run results cannot be saved unless explicitly associated with an experiment id.
   * (CN) Parameter values (e.g. instance files) with spaces are split during command string construction; need to quote them as necessary.
   * (CN) Form input not validated _moved from feature requests_
   * (MC) After error: java.io.IOException: Cannot run program "gnuplot" (in directory "gnuplotData"): java.io.IOException: error=2, No such file or directory,
     experiment cannot be aborted.
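
The two reports above about arguments containing spaces (CF, CN) describe a common failure mode when a command is assembled as one string and later split on whitespace. One possible approach, sketched here under the assumption that the target runs in a POSIX shell environment, is to keep arguments as a list for local execution and to single-quote each argument only when a shell string is unavoidable (for example over SSH). All names below are hypothetical and are not taken from HAL.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not part of the HAL codebase.
final class CommandQuotingSketch {

    /** Builds an argument list; each value stays one argument even if it contains spaces. */
    static List<String> buildArgs(String binary, Map<String, String> params) {
        List<String> args = new ArrayList<>();
        args.add(binary);
        for (Map.Entry<String, String> e : params.entrySet()) {
            args.add(e.getKey());
            args.add(e.getValue());
        }
        return args;
    }

    /** POSIX single-quoting: wrap in '...' and rewrite embedded ' as '\''. */
    static String shellQuote(String s) {
        return "'" + s.replace("'", "'\\''") + "'";
    }

    /** Joins arguments into one shell command string, quoting each argument. */
    static String toShellString(List<String> args) {
        StringBuilder sb = new StringBuilder();
        for (String a : args) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(shellQuote(a));
        }
        return sb.toString();
    }

    public static void main(String[] argv) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("-inst", "my instance.cnf"); // made-up file name with a space
        params.put("-seed", "42");
        List<String> args = buildArgs("./solver", params);
        // For local execution the list can be handed to new ProcessBuilder(args)
        // directly, which involves no shell and therefore no word splitting.
        System.out.println(args);
        System.out.println(toShellString(args));
    }
}
```

Quoting rules differ on Windows, so this sketch would not carry over unchanged to the Windows support listed under Support/QA/Misc.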
Topic revision: r45 - 2011-01-05 - mavc