
Feature Milestones

HAL 1.0

target: September, 2010

Web UI Features

  • Page to add new external target algorithms
  • Page to add new parameter spaces for a given target algorithm (modified from existing spaces)
  • Page to add new problem instances/distributions (in the form of lists of files)
  • Page to specify new execution environments (e.g. cluster config details)
  • Pages to specify & launch included meta-algorithms
  • Ability to view algorithms/instances by problem (instance compatibility) during above specification
  • Page to view summary of all queued, running, and completed jobs
  • Page to browse, view details of, and delete runs/problems/instances/algorithms/environments
  • Dynamic run monitoring analysis pages, including:
    • Plots: Overlaid SCDs for (fixed #) multi-alg, multi inst meta-algs (RTDs for single-inst), SQT for meta-algs where possible, scatter plot for 2-target multi-instance meta-algs, incumbent SCD/RTD for design meta-algs. done but being reworked
    • Descriptive statistics: (mean/sd, quantiles/iqrs) for assessing single-algorithm on an instance dist done
    • Statistical tests: Wilcoxon signed rank, Spearman correlation for comparing 2 algs on an instance dist done
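
The descriptive statistics listed above (mean/sd, quantiles/IQRs) can be illustrated with a minimal standalone sketch. This is not HAL code: the class RunSummary and its methods are hypothetical names, and the quantile rule (linear interpolation between order statistics) is an assumption, since this page does not say which definition HAL or its R interface uses.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of HAL: summary statistics over a set of runtimes. */
final class RunSummary {
    /** Arithmetic mean; assumes a non-empty array. */
    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) sum += x;
        return sum / xs.length;
    }

    /** Sample standard deviation (n - 1 denominator); assumes at least two values. */
    static double sd(double[] xs) {
        double m = mean(xs);
        double ss = 0.0;
        for (double x : xs) ss += (x - m) * (x - m);
        return Math.sqrt(ss / (xs.length - 1));
    }

    /** Quantile for p in [0, 1], by linear interpolation between order statistics. */
    static double quantile(double[] xs, double p) {
        double[] s = xs.clone();          // leave the caller's array untouched
        Arrays.sort(s);
        double h = (s.length - 1) * p;    // fractional index into the sorted data
        int lo = (int) Math.floor(h);
        int hi = (int) Math.ceil(h);
        return s[lo] + (h - lo) * (s[hi] - s[lo]);
    }

    /** Interquartile range: Q3 - Q1. */
    static double iqr(double[] xs) {
        return quantile(xs, 0.75) - quantile(xs, 0.25);
    }

    public static void main(String[] args) {
        // Made-up runtimes of one algorithm on eight instances, for illustration only.
        double[] runtimes = {2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0};
        System.out.printf("mean=%.3f sd=%.3f median=%.3f iqr=%.3f%n",
                mean(runtimes), sd(runtimes), quantile(runtimes, 0.5), iqr(runtimes));
    }
}
```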

Functionality for meta-algorithm developers

  • Ability to interact with the parameter space of an algorithm (examine domains, conditionalities, etc.) done
  • Ability to transform algorithm parameter spaces: log transforms, discretization done
  • Ability to run arbitrary algorithms, including other meta-algorithms, in identical fashion done
  • Ability to monitor the trajectories of all output variables of an executed algorithm run, in real time done
  • Ability to query database of previous runs directly done
  • Ability to access instance features done
  • Pre-defined metrics for aggregating performance across runs done
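
The parameter space transforms mentioned above (log transforms, discretization) can be sketched as follows. This is an illustration under stated assumptions, not HAL's implementation: the class ParamTransforms and its method names are hypothetical, and it handles only a single real-valued domain [lo, hi].

```java
/** Hypothetical helper, not part of HAL: transforms for one real-valued parameter domain. */
final class ParamTransforms {
    /** Log transform of a strictly positive value. */
    static double toLog(double v) {
        if (v <= 0.0) throw new IllegalArgumentException("log transform needs v > 0");
        return Math.log(v);
    }

    /** Inverse of toLog. */
    static double fromLog(double t) {
        return Math.exp(t);
    }

    /**
     * Discretize [lo, hi] into k values, both endpoints included.
     * With logScale set, the values are evenly spaced in log space.
     */
    static double[] discretize(double lo, double hi, int k, boolean logScale) {
        if (k < 2) throw new IllegalArgumentException("need at least two values");
        if (logScale && lo <= 0.0) throw new IllegalArgumentException("log scale needs lo > 0");
        double a = logScale ? Math.log(lo) : lo;
        double b = logScale ? Math.log(hi) : hi;
        double[] out = new double[k];
        for (int i = 0; i < k; i++) {
            double t = a + (b - a) * i / (k - 1);   // evenly spaced in the working scale
            out[i] = logScale ? Math.exp(t) : t;    // map back to the original scale
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(discretize(0.0, 10.0, 5, false)));
        System.out.println(java.util.Arrays.toString(discretize(1.0, 1000.0, 4, true)));
    }
}
```

Discretizing in log space is the usual choice for parameters whose useful values span several orders of magnitude.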

Backend functionality exposed in above

  • Ability to execute algorithms locally done
  • Ability to execute algorithms on a remote host via SSH needs update re: API changes
  • Ability to execute algorithms on a SGE cluster needs update re: object API changes
  • Ability to actively monitor remotely running algorithms via RPC needs update re: object API changes
  • MySQL database storing records of all algorithms, instances, runs, etc. done
  • SQLite database fallback if MySQL unavailable done
  • R interface for performing statistical tests, etc. done

Meta-Algorithms Included

  • Configuration procedure: ParamILS (external) in progress
  • Configuration procedure: ROAR (internal) done; will need minor updates to work with backend redesign
  • Analysis procedure: Paired algorithm comparison in progress
  • Analysis procedure: Single-algorithm analysis in progress

Distribution Issues

  • Documentation
  • Detection/configuration of external dependencies (cf. UI/execution environment specification)
  • Double-click-to-run universal JAR distribution

HAL 1.1

target: December, 2010

Web UI Features

  • Ability to export complete experiment packages (including algorithms, instances, run instructions)
  • Ability to load and execute an experiment package
  • Ability to "chain" experiments (e.g. design procs. followed by analysis proc comparing incumbents)

Functionality for meta-algorithm developers

  • Random Forest classification + regression models, incl. interface accepting AlgorithmRun objects for training and inference
  • support for feature extraction procedures

Backend functionality

  • Support for TORQUE clusters
  • Support for "bag-of-machines" execution manager

Meta-Algorithms Included

  • Configuration procedure: ActiveConfigurator (internal)
  • Multi-algorithm comparison
  • SATzilla-like portfolio builder
  • Parallelized AC
  • ParamILS (internal)

HAL 1.x

target: 2011
  • libraries of:
    • search/optimization procedures
    • machine learning tools
  • multi-algorithm comparisons
  • scaling analyses
  • bootstrapped analyses
  • robustness analyses
  • parameter response analyses
  • Parallel portfolios in HAL
  • Iterated F-Race in HAL
  • support for optimization/Monte-Carlo experiments
  • support for instance generators
  • support for instance format converters
  • Support text-file inputs and outputs for external algorithms (currently only the command line and stdin/stderr are supported)
  • array jobs in SGE
  • Wider support for working directory requirements of individual algorithm runs, e.g. Concorde's creation of 20 files with fixed names.

Unprioritized Features

New feature requests should initially be added here; notify a HAL developer and come to a HAL meeting if you feel your feature must move up the stack quickly.
  • (FH) Support for complete configuration experiment, front to back: run configurator N times on a training set, report the N training and test set performances CN: can hopefully be implemented as a chained experiment
  • (FH) Developers of configurators should be able to swap in new versions of a configurator
  • (FH) Configuration scenarios, specifying a complete configuration task including the test set; only missing part being the configurator
  • (FH) Saveable sets of configuration scenarios to perform (use case: I change the configurator and want to evaluate it)
  • (FH) Taking this a step further: support for optimizing a parameterized configurator (configurator is an algorithm, and the above set of experiments is the set of "instances") CN: this is what is being implemented in the ongoing backend redesign
  • (FH) Submitting runs from a machine that is itself a cluster submit host should not need to go through SSH
  • (CF) Memory usage / CPU time monitoring in HAL of target algorithm runs, in order to report warnings on potential problems (like excessive swapping for example).
  • (HH) Significance-gated analysis / sequential hypothesis testing (see email from HH).
  • (CF) Continued testing to support LAMA-ish difficulties in HAL:
    • Wallclock vs. CPU cutoff options
    • Warnings in the dashboard if target runs or experiments are behaving "strangely"
    • Email notifications sent to users when various events happen
  • (CF) Restricted data/execution/targetalgs for the demo server
  • (CF) Selection of performance metric before selecting the configurator to use. What is the exact problem specification for configuration?
  • (CN) convenience methods in MetaAlgorithm hiding next(), hasNext(), report() from the 3rd-party developer; instead providing an interface like AlgorithmRun fetchRun(Algorithm a), with no InterruptedException; implies an AlgorithmRun class that can adaptively switch between a "queued" and a "running" implementation for before and after the true environment fetchRun(...) call is made/returns.
  • (HH) Service-oriented volunteer computing. See, e.g., "Service-Oriented Volunteer Computing for Massively Parallel Constraint Solving Using Portfolios", Zeynep Kiziltan and Jacopo Mauro, in CPAIOR-2010 proceedings.
  • (KLB) Handle network issues (e.g. loss of connection to datamanager, etc.) robustly. Restart runs, etc., as required to ensure that the originally-requested job ultimately completes correctly with as little babysitting by the user as possible.
  • (FH) Normalization transform, in addition to existing log transform
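
The (CN) request above, for an AlgorithmRun that switches between a "queued" and a "running" implementation, can be sketched as a proxy that swaps its delegate. Everything here is hypothetical: the AlgorithmRun interface below is a one-method stand-in (HAL's real interface is not shown on this page), and QueuedRun, RunningRun, SwitchingRun, and started(...) are invented names for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical one-method stand-in for HAL's AlgorithmRun. */
interface AlgorithmRun {
    String status();
}

/** Placeholder behaviour before the execution environment has started the run. */
final class QueuedRun implements AlgorithmRun {
    public String status() { return "queued"; }
}

/** Placeholder behaviour once the run is live. */
final class RunningRun implements AlgorithmRun {
    public String status() { return "running"; }
}

/**
 * Proxy that can be handed to the caller immediately. It forwards to a queued
 * placeholder until the environment reports the real run, then forwards to that.
 */
final class SwitchingRun implements AlgorithmRun {
    private final AtomicReference<AlgorithmRun> delegate =
            new AtomicReference<>(new QueuedRun());

    /** Called (hypothetically) by the execution environment when the run starts. */
    void started(AlgorithmRun real) {
        delegate.set(real);
    }

    public String status() {
        return delegate.get().status();
    }

    public static void main(String[] args) {
        SwitchingRun run = new SwitchingRun();
        System.out.println(run.status());   // forwarded to the queued placeholder
        run.started(new RunningRun());
        System.out.println(run.status());   // forwarded to the live run
    }
}
```

Because callers only ever hold the proxy, a fetchRun-style method could return without blocking, which is one way to avoid exposing InterruptedException to third-party developers.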

Active work items

Frontend

Release-critical

  • CF algorithm specification screen: implement (includes initial design space specification) (CF): In Progress
  • CF left side of landing page: task selection/presentation according to pattern concept (CF): In Progress
  • CF experiment specification and monitor screens from a pattern template, and procedure-specific requirements, including experiment and incumbent naming
  • CF instance specification screen: implement (CF): In Progress
  • CF Execution environment specification (incl. R, Gnuplot, java locations) (CF): In Progress
  • RTDs/per-target-algorithm-run monitoring and navigation
  • design space specification by revision of existing spaces
  • Merge with backend refactor (when done)

Important

  • Data management interface:
    • deleting runs/expts/etc.
    • data export
  • Error logging/handling/browsing
  • Plotting ex-gnuplot
  • Documentation as a header on most of the experiment pages, paragraph explaining the intention etc.
  • Hiding "advanced" settings, such as configurator-specific settings or other tools, with appropriate defaults.

Backend

Release-critical

  • CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model; requires Algorithm refactor below. done
  • CN Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings). done
  • CN Refactor code to align class hierarchy with terminology of paper (CN: done for all but meta-algorithm implementations, which are in progress)
  • CN Refactor Algorithm/ParameterSpace/Parameter/Domain structure to allow above done
  • CN Database schema -- speed-related refactor done (may want further tuning)
  • CN Refactor SSH & RPC execution managers to work under refactor

Important

  • CN Connection pooling done
  • Caching analysis results (CN: in progress as part of meta-alg changes above)
  • CN Query optimization done (may want more depending on real-world observations)
  • Selective limitation of run-level archiving (dynamic based on runtime?)
  • add incumbentname semantic input to (design) procedures
  • instance features

Nice-to-have

  • CN DataManager API refinement (in progress as part of DataManager refactor)
  • CF N-way performance comparison
  • Stale connection issue; incl. robustness to general network issues
  • CN Read-only DataManager connection for use by individual MA procedures done
  • Allowing relationships (incl. possible run-reuse) between different-binary "builds" of algorithms, including due to bugfixes, additional exposed parameters, etc. Also for different "versions" (without reuse) corresponding to added functionality.
  • Ability to quantify membership of configurations to different design spaces done

Application: ActiveConfigurator

Release Critical

  • VC ROAR in Java in testing
  • VC Calling Matlab from Java in testing
  • CN parameter transformations (log, discretization, etc.) done
  • VC SMBO, calling Matlab for model building/evaluation (VC: implemented, in testing)
  • Adapt Weka RF implementation for regression
  • Pure-Java SMBO implementation
  • Merge Java AC with refactored HAL codebase once refactor is completed
  • Adapt standalone Java AC to work as "internal" HAL meta-algorithm

Support/QA/Misc.

Release Critical

  • unit testing: parameters (domains) OK
  • unit testing: parameter spaces OK
  • unit testing: algorithms
  • unit testing: execution managers (local, SSH, cluster)
  • unit testing: data managers (SQLite, MySQL)
  • unit testing: meta-algorithms
  • functional testing: full pipeline
  • Licensing issues (GPL'd components...)

Important

  • CN Git, not CVS done
  • CN Order+configure new DB server (CN: waiting for Dave B to make final changeover)
  • user-facing documentation (help)
  • CN Better logging/error-reporting (to console/within HAL). done (for most cases; exceptions are auto-logged)
  • CN JX VC Basic Windows support done, in testing
  • Better handling of overhead runtime vs. target algorithm runtime

Nice-to-have

  • developer-facing documentation (javadocs) (in progress in parallel with other work)

Bug Reports

  • (CN) JSC test reliability issue (compared to R)
  • (CN) end-of-experiment hanging bug (GGA, multinode cluster runs)
  • (LX) missing current-time point in solution quality trace, so don't see the final "flat line"
  • (CN) accuracy of mid-run overhead accounting for PILS/GGA
  • (CF) Configuration file callstrings with weird spaces, i.e. "... -param '$val$ blah' ..." where '$val$ blah' needs to be passed to the target as a single argument. (CN) does this work with double-quotes instead of single-quotes?
  • (JS) FixedConfigurationExperiment UI is outdated, unusable.
  • (JS) HAL is not usable on WestGrid. We need a TorqueClusterExecutionManager.
  • (JS) Algorithms with a requirement of a new directory for each run.
  • (JS) one of the ExecutionManagers produces unstarted AlgorithmRuns
  • (FH) If a HAL slave process fails to start, the associated expt. status stays on "queued" forever
  • (FH) Database table contention causes locking and high query latency. Likely to be fixed by database changes and use of InnoDB, but I'm reporting it anyway.
  • (CN) DataManager-decorated ExecutionManager still requires explicit commit to save results. Also run results cannot be saved unless explicitly associated with an experiment id.
  • (CN) Parameter values (e.g. instance files) with spaces are split during command string construction; need to quote them as necessary.
  • (CN) Form input is not validated (moved from feature requests)
  • (MC) After error: java.io.IOException: Cannot run program "gnuplot" (in directory "gnuplotData"): java.io.IOException: error=2, No such file or directory, experiment cannot be aborted.
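
The two callstring bugs above (quoted segments such as '$val$ blah', and parameter values containing spaces) share one possible remedy: tokenize the callstring template first, substitute placeholders afterwards, and pass the resulting argument list to the process launcher rather than a single command string. The sketch below assumes a $name$ placeholder syntax as in the (CF) report; the class CallString and its methods are hypothetical and are not HAL's command construction code. Only java.lang.ProcessBuilder is a real JDK API, and no process is actually started.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of HAL: builds an argument list from a callstring template. */
final class CallString {
    /** Split on whitespace, keeping '...' and "..." segments as one token (quotes removed). */
    static List<String> tokenize(String s) {
        List<String> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        char quote = 0;            // 0 means "not inside quotes"
        boolean inToken = false;
        for (char c : s.toCharArray()) {
            if (quote != 0) {
                if (c == quote) quote = 0; else cur.append(c);
            } else if (c == '\'' || c == '"') {
                quote = c;
                inToken = true;    // an empty quoted string is still a token
            } else if (Character.isWhitespace(c)) {
                if (inToken) {
                    out.add(cur.toString());
                    cur.setLength(0);
                    inToken = false;
                }
            } else {
                cur.append(c);
                inToken = true;
            }
        }
        if (quote != 0) throw new IllegalArgumentException("unterminated quote");
        if (inToken) out.add(cur.toString());
        return out;
    }

    /** Tokenize first, then substitute $name$ placeholders, so values with spaces stay whole. */
    static List<String> build(String template, Map<String, String> values) {
        List<String> args = new ArrayList<>();
        for (String tok : tokenize(template)) {
            for (Map.Entry<String, String> e : values.entrySet()) {
                tok = tok.replace("$" + e.getKey() + "$", e.getValue());
            }
            args.add(tok);
        }
        return args;
    }

    public static void main(String[] args) {
        List<String> cmd = build("solver -param '$val$ blah' -i $inst$",
                Map.of("val", "3", "inst", "my file.cnf"));
        // ProcessBuilder takes one list element per argument; nothing is launched here.
        ProcessBuilder pb = new ProcessBuilder(cmd);
        System.out.println(pb.command());
    }
}
```

With this ordering, single and double quotes in the template behave the same way, and a substituted value is never re-split on its own spaces.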
Topic revision: r45 - 2011-01-05 - mavc