Short-term

Target: CRC/initial release

Frontend

Release-critical

functionality promised in paper
  • CF algorithm specification screen: implement (includes initial design space specification) (CF): In Progress
  • CF left side of landing page: task selection/presentation according to pattern concept (CF): In Progress
  • CF experiment specification and monitor screens from a pattern template, and procedure-specific requirements, including experiment and incumbent naming
  • CF instance specification screen: implement (CF): In Progress
  • CF Execution environment specification (incl. R, Gnuplot, java locations) (CF): In Progress
  • RTDs/per-target-algorithm-run monitoring and navigation
  • design space specification by revision of existing spaces

Important

works as-is but end-user experience significantly impacted
  • Data management interface:
    • deleting runs/expts/etc.
    • data export
  • Error logging/handling/browsing
  • Plotting ex-gnuplot
  • Documentation as a header on most of the experiment pages, paragraph explaining the intention etc.
  • Hiding "advanced" settings, such as configurator-specific settings or other tools, with appropriate defaults.

Backend

Release-critical

for functionality mentioned in paper for which post-release changes would be problematic
  • CN Named instance set table done
  • CN Named configuration table done
  • CN Execution environment table. (CF): Reopened to account for java/ruby/gnuplot location specification etc.; mostly done but not checked in
  • CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model; requires Algorithm refactor below. (CN: done for Java objects; not started for DB)
  • CN Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings). (CN: in progress)
  • CN rename objects to match paper terminology done
  • CN Refactor Algorithms/Meta-algorithms in code to align class hierarchy with terminology of paper (CN: in progress)
  • CN Refactor Algorithm/ParameterSpace/Parameter/Domain structure to allow above done
  • CN Database schema -- speed-related refactor

Important

mostly to (substantially) improve UI responsiveness
  • Connection pooling
  • Caching analysis results
  • Query optimization
  • Selective limitation of run-level archiving (dynamic based on runtime?)
  • add incumbentname semantic input to (design) procedures
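
The "Caching analysis results" item above could be approached in several ways. A minimal sketch, assuming analyses can be identified by a key (for example an experiment id), is simple memoization. The class name AnalysisCache and its methods are hypothetical and are not part of HAL.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical helper, not part of HAL: memoizes an expensive analysis by key.
final class AnalysisCache<K, V> {
    private final ConcurrentMap<K, V> results = new ConcurrentHashMap<>();
    private final Function<K, V> analysis;

    AnalysisCache(Function<K, V> analysis) {
        this.analysis = analysis;
    }

    // Runs the analysis on the first request for a key; later requests reuse the stored result.
    V get(K key) {
        return results.computeIfAbsent(key, analysis);
    }

    // Drops a stored result, e.g. after new runs have been added to an experiment.
    void invalidate(K key) {
        results.remove(key);
    }
}
```

The UI would call get(...) on each page load and invalidate(...) whenever the underlying runs change, so repeated views of an unchanged experiment avoid recomputation.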

Nice-to-have

noticeable mostly to developer-users
  • DataManager API refinement
  • CF N-way performance comparison first-cut for Frank.
  • Stale connection issue; incl. robustness to general network issues
  • Read-only DataManager connection for use by individual MA procedures
  • Allowing relationships (incl. possible run-reuse) between different-binary "builds" of algorithms, including due to bugfixes, additional exposed parameters, etc. Also for different "versions" (without reuse) corresponding to added functionality.
  • Ability to quantify membership of configurations to different design spaces

Support/QA/Misc.

Release-critical

  • more unit tests; also functional/integration tests

Important

  • user-facing documentation (help)
  • Better logging/error-reporting (to console/within HAL), e.g. log4j
  • Better handling of overhead runtime vs. target algorithm runtime

Nice-to-have

  • developer-facing documentation (javadocs)

Medium-term

Planned for future HAL 1.x revisions

  • Packaging/bundling complete experiments or other HAL primitives for easy reproduction or installation by other users.
  • Windows support
  • libraries of:
    • search/optimization procedures
    • machine learning tools
  • multi-algorithm comparisons
  • scaling analyses
  • bootstrapped analyses
  • robustness analyses
  • parameter response analyses
  • SATzilla in HAL
  • ParamILS in HAL
  • Parallel portfolios in HAL
  • ActiveConfigurator in HAL
  • Iterated F-Race in HAL
  • chained-procedure experiments
  • support for optimization/Monte-Carlo experiments
  • support instance generators
  • Git, not CVS
  • Support text-file inputs and outputs for external algorithms
  • Instance features
  • Explicit representation of problems (e.g. particular instance formats)
  • Experiments calling experiments, not just external target algs
  • array jobs in SGE
  • Hashing everything, including instances, instance sets and configurations.
  • Wider support for working directory requirements of individual algorithm runs, e.g. Concorde's creation of 20 files with fixed names.
  • Validation of form input.
  • Scriptable submission of experiments. (CF): Accelerated for Frank, finished 18/05/2010.
  • Ability to browse algorithms, instances, instance sets, configurations, etc. This includes the ability to see things related to the item being browsed, e.g. performance of different algorithms/configurations on a given instance, performance of algorithms across an instance set, performance of a given configuration.
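
For the "Hashing everything" item above, one possible approach for configurations is sketched here, assuming a configuration can be viewed as a map from parameter names to string values. The class ConfigurationHasher and the choice of SHA-256 are assumptions for illustration, not HAL's design.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of HAL: order-independent hash of a configuration.
final class ConfigurationHasher {
    private ConfigurationHasher() {}

    static String hash(Map<String, String> configuration) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            // Sort by parameter name so map insertion order does not affect the hash.
            for (Map.Entry<String, String> e : new TreeMap<>(configuration).entrySet()) {
                update(digest, e.getKey());
                update(digest, e.getValue());
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException ex) {
            // SHA-256 must be available on every Java platform.
            throw new IllegalStateException(ex);
        }
    }

    // Length-prefix each field so that ("ab", "c") and ("a", "bc") hash differently.
    private static void update(MessageDigest digest, String field) {
        byte[] bytes = field.getBytes(StandardCharsets.UTF_8);
        digest.update(ByteBuffer.allocate(4).putInt(bytes.length).array());
        digest.update(bytes);
    }
}
```

Instances and instance sets could be hashed along the same lines, for example over file contents and over the sorted hashes of member instances respectively.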

Long-term/Unprioritized

Feature requests should be initially added here
  • (FH) Support for complete configuration experiment, front to back: run configurator N times on a training set, report the N training and test set performances
  • (FH) Developers of configurators should be able to swap in new versions of a configurator
  • (FH) Configuration scenarios, specifying a complete configuration task including the test set; only missing part being the configurator
  • (FH) Saveable sets of configuration scenarios to perform (use case: I change the configurator and want to evaluate it)
  • (FH) Taking this a step further: support for optimizing a parameterized configurator (configurator is an algorithm, and the above set of experiments is the set of "instances")
  • (FH) Submitting runs from a machine that is itself a cluster submit host should not need to go through SSH
  • (JS) public static AlgorithmRun subclasses in most ExecutionManagers should probably be private
  • (CF) Memory usage / CPU time monitoring in HAL of target algorithm runs, in order to report warnings on potential problems (like excessive swapping for example).
  • (HH) Significance-gated analysis / sequential hypothesis testing (see email from HH).
  • (CF) Continued testing to support LAMA-ish difficulties in HAL:
    • Wallclock vs. CPU cutoff options
    • Warnings in the dashboard if target runs or experiments are behaving "strangely"
    • Email notifications sent to users when various events happen
  • (CF) Restricted data/execution/targetalgs for the demo server
  • (CN) Support of performance metrics
  • (CF) Selection of performance metric before selecting the configurator to use. What is the exact problem specification for configuration?
  • (CN) convenience methods in MetaAlgorithm hiding next(), hasNext(), report() from the 3rd-party developer; instead providing an interface like AlgorithmRun fetchRun(Algorithm a), with no InterruptedException; implies an AlgorithmRun class that can adaptively switch between a "queued" and a "running" implementation for before and after the true environment fetchRun(...) call is made/returns.
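
The last request above, a run object that switches between a "queued" and a "running" implementation, could be sketched as a delegating wrapper. The types RunView, QueuedRun, StartedRun and AdaptiveRun are hypothetical stand-ins; HAL's actual AlgorithmRun interface is not shown on this page and may differ substantially.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the parts of a run a caller might query.
interface RunView {
    String status();
    double runtime();
}

// Placeholder state used before the execution environment has returned a real run.
final class QueuedRun implements RunView {
    public String status() { return "queued"; }
    public double runtime() { return 0.0; }
}

// Trivial "real" run used for illustration only.
final class StartedRun implements RunView {
    private final double runtime;
    StartedRun(double runtime) { this.runtime = runtime; }
    public String status() { return "running"; }
    public double runtime() { return runtime; }
}

// Returned to the caller immediately; forwards to whichever state is current.
final class AdaptiveRun implements RunView {
    private final AtomicReference<RunView> delegate =
            new AtomicReference<RunView>(new QueuedRun());

    // Called once the true environment fetchRun(...) call has returned.
    void attach(RunView real) {
        delegate.set(real);
    }

    public String status() { return delegate.get().status(); }
    public double runtime() { return delegate.get().runtime(); }
}
```

A convenience fetchRun(...) could return an AdaptiveRun at once and call attach(...) from a background thread, so the third-party developer never handles InterruptedException directly.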

Bug Reports

  • (CN) JSC test reliability issue (compared to R)
  • (CN) end-of-experiment hanging bug (GGA, multinode cluster runs)
  • (JS) InnoDB SQL errors (CN): fixed 11/05/10
  • (LX) missing current-time point in solution quality trace, so don't see the final "flat line"
  • (CN) accuracy of mid-run overhead accounting for PILS/GGA
  • (CF) Configuration file callstrings with embedded spaces, e.g. "... -param '$val$ blah' ..." where '$val$ blah' needs to be passed to the target as a single argument. (CN) Does this work with double quotes instead of single quotes?
  • (JS) FixedConfigurationExperiment UI is outdated, unusable.
  • (JS) HAL is not usable on WestGrid. We need a TorqueClusterExecutionManager.
  • (JS) Algorithms with a requirement of a new directory for each run.
  • (JS) one of the ExecutionManagers produces unstarted AlgorithmRuns
  • (CF) When HAL kills a target algorithm run, it does not also kill all child processes spawned by that run. This can leave zombies and all kinds of other very bad things after a period of time. (CN): fixed 18/05/10
  • (FH) If a HAL slave process fails to start, the associated expt. status stays on "queued" forever
  • (FH) Database table contention causes locking and high query latency. Likely to be fixed by database changes and use of InnoDB, but I'm reporting it anyway.
  • (CN) DataManager-decorated ExecutionManager still requires explicit commit to save results. Also run results cannot be saved unless explicitly associated with an experiment id.
  • (CN) Parameter values (e.g. instance files) with spaces are split during command string construction; need to enquote them as necessary.
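
The two callstring bugs above (quoted template fragments and parameter values containing spaces) share a likely remedy: split the template into arguments first, honouring quotes, and substitute values afterwards so a value is never re-split. The following is a minimal sketch under that assumption. The class CallstringBuilder and the $name$ placeholder handling are illustrative and do not reflect HAL's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of HAL.
final class CallstringBuilder {
    private CallstringBuilder() {}

    // Splits on whitespace, treating '...' and "..." as parts of a single argument.
    static List<String> tokenize(String callstring) {
        List<String> args = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char quote = 0;          // 0 outside quotes, otherwise the open quote character
        boolean inToken = false; // true once the current argument has started
        for (char c : callstring.toCharArray()) {
            if (quote != 0) {
                if (c == quote) quote = 0; else current.append(c);
            } else if (c == '\'' || c == '"') {
                quote = c;
                inToken = true;
            } else if (Character.isWhitespace(c)) {
                if (inToken) {
                    args.add(current.toString());
                    current.setLength(0);
                    inToken = false;
                }
            } else {
                current.append(c);
                inToken = true;
            }
        }
        if (quote != 0) throw new IllegalArgumentException("unterminated quote");
        if (inToken) args.add(current.toString());
        return args;
    }

    // Substitutes $name$ placeholders after tokenizing, so values with spaces stay intact.
    static List<String> build(String template, Map<String, String> values) {
        List<String> args = new ArrayList<>();
        for (String token : tokenize(template)) {
            for (Map.Entry<String, String> e : values.entrySet()) {
                token = token.replace("$" + e.getKey() + "$", e.getValue());
            }
            args.add(token);
        }
        return args;
    }
}
```

The resulting list can be passed to java.lang.ProcessBuilder's List<String> constructor, which hands each element to the target as one argument without going through a shell.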
Topic revision: r35 - 2010-06-14 - ChrisNell
 