---+ Feature Milestones

---++ HAL 1.0 _target: September, 2010_

---+++ Web UI Features
   * Page to add new external target algorithms
   * Page to add new parameter spaces for a given target algorithm (modified from existing spaces)
   * Page to add new problem instances/distributions (in the form of lists of files)
   * Page to specify new execution environments (e.g. cluster config details)
   * Pages to specify & launch included meta-algorithms
   * Ability to view algorithms/instances by problem (instance compatibility) during above specification
   * Page to view summary of all queued, running, and completed jobs
   * Page to browse, view details of, and delete runs/problems/instances/algorithms/environments
   * Dynamic run monitoring analysis pages, including:
      * Plots: overlaid SCDs for (fixed #) multi-alg, multi-inst meta-algs (RTDs for single-inst), SQT for meta-algs where possible, scatter plot for 2-target multi-instance meta-algs, incumbent SCD/RTD for design meta-algs. *done but being reworked*
      * Descriptive statistics: (mean/sd, quantiles/iqrs) for assessing a single algorithm on an instance dist *done*
      * Statistical tests: Wilcoxon signed rank, Spearman correlation for comparing 2 algs on an instance dist *done*

---+++ Functionality for meta-algorithm developers
   * Ability to interact with the parameter space of an algorithm (examine domains, conditionalities, etc.) *done*
   * Ability to transform algorithm parameter spaces: log transforms, discretization *done*
   * Ability to run arbitrary algorithms, including other meta-algorithms, in identical fashion *done*
   * Ability to monitor the trajectories of all output variables of an executed algorithm run, in real time *done*
   * Ability to query database of previous runs directly *done*
   * Ability to access instance features *done*
   * Pre-defined metrics for aggregating performance across runs *done*

---+++ Backend functionality exposed in above
   * Ability to execute algorithms locally *done*
   * Ability to execute algorithms on a remote host via SSH *needs update re: API changes*
   * Ability to execute algorithms on an SGE cluster *needs update re: object API changes*
   * Ability to actively monitor remotely running algorithms via RPC *needs update re: object API changes*
   * MySQL database storing records of all algorithms, instances, runs, etc. *done*
   * SQLite database fallback if MySQL unavailable *done*
   * R interface for performing statistical tests, etc. *done*

---+++ Meta-Algorithms Included
   * Configuration procedure: ParamILS (external) *in progress*
   * Configuration procedure: ROAR (internal) *done; will need minor updates to work with backend redesign*
   * Analysis procedure: Paired algorithm comparison *in progress*
   * Analysis procedure: Single-algorithm analysis *in progress*

---+++ Distribution Issues
   * Documentation
   * Detection/configuration of external dependencies (cf. UI/execution environment specification)
   * Double-click-to-run universal JAR distribution

---++ HAL 1.1 _target: December, 2010_

---+++ Web UI Features
   * Ability to export complete experiment packages (including algorithms, instances, run instructions)
   * Ability to load and execute an experiment package
   * Ability to "chain" experiments (e.g. design procs. followed by analysis proc comparing incumbents)

---+++ Functionality for meta-algorithm developers
   * Random Forest classification + regression models, incl. interface accepting AlgorithmRun objects for training and inference
   * Support for feature extraction procedures

---+++ Backend functionality
   * Support for TORQUE clusters
   * Support for "bag-of-machines" execution manager

---+++ Meta-Algorithms Included
   * Configuration procedure: ActiveConfigurator (internal)
   * Multi-algorithm comparison
   * SATzilla-like portfolio builder
   * Parallelized AC
   * ParamILS (internal)

---++ HAL 1.x _target: 2011_
   * Libraries of:
      * search/optimization procedures
      * machine learning tools
   * Multi-algorithm comparisons
   * Scaling analyses
   * Bootstrapped analyses
   * Robustness analyses
   * Parameter response analyses
   * Parallel portfolios in HAL
   * Iterated F-Race in HAL
   * Support for optimization/Monte-Carlo experiments
   * Support for instance generators
   * Support for instance format converters
   * Support text-file inputs and outputs for external algorithms (currently only cmd line and stdin/err)
   * Array jobs in SGE
   * Wider support for working directory requirements of individual algorithm runs, e.g. Concorde's creation of 20 files with fixed names.

---++ Unprioritized Features

_New feature requests should be initially added here; notify a HAL developer and come to a HAL meeting if you feel your feature must move up the stack quickly._

   * (FH) Support for complete configuration experiment, front to back: run configurator N times on a training set, report the N training and test set performances _CN: can hopefully be implemented as a chained experiment_
   * (FH) Developers of configurators should be able to swap in new versions of a configurator _CN:
   * (FH) Configuration scenarios, specifying a complete configuration task including the test set; only missing part being the configurator
   * (FH) Saveable sets of configuration scenarios to perform (use case: I change the configurator and want to evaluate it)
   * (FH) Taking this a step further: support for optimizing a parameterized configurator (configurator is an algorithm, and the above set of experiments is the set of "instances") _CN: this is what is being implemented in the ongoing backend redesign_
   * (FH) Submitting runs from a machine that is itself a cluster submit host should not need to go through SSH
   * (CF) Memory usage / CPU time monitoring in HAL of target algorithm runs, in order to report warnings on potential problems (like excessive swapping, for example).
   * (HH) Significance-gated analysis / sequential hypothesis testing (see email from HH).
   * (CF) Continued testing to support LAMA-ish difficulties in HAL:
      * Wallclock vs. CPU cutoff options
      * Warnings in the dashboard if target runs or experiments are behaving "strangely"
      * Email notifications sent to users when various events happen
   * (CF) Restricted data/execution/targetalgs for the demo server
   * (CF) Selection of performance metric _before_ selecting the configurator to use. What is the exact problem specification for configuration?
   * (CN) Convenience methods in MetaAlgorithm hiding next(), hasNext(), report() from the 3rd-party developer; instead providing an interface like AlgorithmRun fetchRun(Algorithm a), with no InterruptedException; implies an AlgorithmRun class that can adaptively switch between a "queued" and a "running" implementation for before and after the true environment fetchRun(...) call is made/returns.
   * (HH) Service-oriented volunteer computing. See, e.g., "Service-Oriented Volunteer Computing for Massively Parallel Constraint Solving Using Portfolios", Zeynep Kiziltan and Jacopo Mauro, in CPAIOR-2010 proceedings.
   * (KLB) Handle network issues (e.g. loss of connection to datamanager, etc.) robustly. Restart runs, etc., as required to ensure that the originally-requested job ultimately completes correctly with as little babysitting by the user as possible.
   * (FH) Normalization transform, in addition to existing log transform

---+ Active work items

---++ Frontend

---+++ Release-critical
   * *CF* algorithm specification screen: implement (includes initial design space specification) _(CF): In Progress_
   * *CF* left side of landing page: task selection/presentation according to pattern concept _(CF): In Progress_
   * *CF* experiment specification and monitor screens from a pattern template, and procedure-specific requirements, including experiment and incumbent naming
   * *CF* instance specification screen: implement _(CF): In Progress_
   * *CF* Execution environment specification (incl. R, Gnuplot, java locations) _(CF): In Progress_
   * RTDs/per-target-algorithm-run monitoring and navigation
   * design space specification by revision of existing spaces
   * Merge with backend refactor (when done)

---+++ Important
   * Data management interface:
      * deleting runs/expts/etc.
      * data export
   * Error logging/handling/browsing
   * Plotting ex-gnuplot
   * Documentation as a header on most of the experiment pages, paragraph explaining the intention etc.
   * Hiding "advanced" settings, such as configurator-specific settings or other tools, with appropriate defaults.

---++ Backend

---+++ Release-critical
   * *CN* Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model; requires Algorithm refactor below. *done*
   * *CN* Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings). *done*
   * *CN* Refactor code to align class hierarchy with terminology of paper _(CN: done for all but meta-algorithm implementations, which are in progress)_
   * *CN* Refactor Algorithm/ParameterSpace/Parameter/Domain structure to allow above *done*
   * *CN* Database schema -- speed-related refactor *done* _(may want further tuning)_
   * *CN* Refactor SSH & RPC execution managers to work under refactor

---+++ Important
   * *CN* Connection pooling *done*
   * Caching analysis results _(CN: in progress as part of meta-alg changes above)_
   * *CN* Query optimization *done* _(may want more depending on real-world observations)_
   * Selective limitation of run-level archiving (dynamic based on runtime?)
   * add incumbentname semantic input to (design) procedures
   * instance features

---+++ Nice-to-have
   * *CN* DataManager API refinement _(in progress as part of DataManager refactor)_
   * *CF* N-way performance comparison
   * Stale connection issue; incl. robustness to general network issues
   * *CN* Read-only DataManager connection for use by individual MA procedures *done*
   * Allowing relationships (incl. possible run-reuse) between different-binary "builds" of algorithms, including due to bugfixes, additional exposed parameters, etc. Also for different "versions" (without reuse) corresponding to added functionality.
   * Ability to quantify membership of configurations to different design spaces *done*

---++ Application: ActiveConfigurator

---+++ Release Critical
   * *VC* ROAR in Java *in testing*
   * *VC* Calling Matlab from Java *in testing*
   * *CN* parameter transformations (log, discretization, etc.) *done*
   * *VC* SMBO, calling Matlab for model building/evaluation _(VC: implemented, in testing)_
   * Adapt Weka RF implementation for regression
   * Pure-Java SMBO implementation
   * Merge Java AC with refactored HAL codebase once refactor is completed
   * Adapt standalone Java AC to work as "internal" HAL meta-algorithm

---++ Support/QA/Misc.

---+++ Release Critical
   * unit testing: parameters (domains) *OK*
   * unit testing: parameter spaces *OK*
   * unit testing: algorithms
   * unit testing: execution managers (local, SSH, cluster)
   * unit testing: data managers (SQLite, MySQL)
   * unit testing: meta-algorithms
   * functional testing: full pipeline
   * Licensing issues (GPL'd components...)

---+++ Important
   * *CN* Git, not CVS *done*
   * *CN* Order+configure new DB server _(CN: waiting for Dave B to make final changeover)_
   * user-facing documentation (help)
   * *CN* Better logging/error-reporting (to console/within HAL). eg: *done* _(for most cases; exceptions are auto-logged)_
   * *CN* *JX* *VC* Basic Windows support *done, in testing*
   * Better handling of overhead runtime vs. target algorithm runtime

---+++ Nice-to-have
   * developer-facing documentation (javadocs) _(in progress in parallel with other work)_

---+ Bug Reports
   * (CN) JSC test reliability issue (compared to R)
   * (CN) end-of-experiment hanging bug (GGA, multinode cluster runs)
   * (LX) missing current-time point in solution quality trace, so the final "flat line" is not shown
   * (CN) accuracy of mid-run overhead accounting for PILS/GGA
   * (CF) Configuration file callstrings with weird spaces, i.e. "... -param '$val$ blah' ..." where '$val$ blah' needs to be passed to the target as a single argument. _(CN) does this work with double-quotes instead of single-quotes?_
   * (JS) FixedConfigurationExperiment UI is outdated, unusable.
   * (JS) HAL is not usable on WestGrid. We need a TorqueClusterExecutionManager.
   * (JS) Algorithms with a requirement of a new directory for each run.
   * (JS) one of the ExecutionManagers produces unstarted AlgorithmRuns
   * (FH) If a HAL slave process fails to start, the associated expt. status stays on "queued" forever
   * (FH) Database table contention causes locking and high query latency. Likely to be fixed by database changes and use of InnoDB, but I'm reporting it anyway.
   * (CN) DataManager-decorated ExecutionManager still requires explicit commit to save results. Also, run results cannot be saved unless explicitly associated with an experiment id.
   * (CN) Parameter values (e.g. instance files) with spaces are split during command string construction; need to enquote them as necessary.
   * (CN) Form input is not validated _moved from feature requests_
   * (MC) After error: java.io.IOException: Cannot run program "gnuplot" (in directory "gnuplotData"): java.io.IOException: error=2, No such file or directory, experiment cannot be aborted.
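The two bug reports above about spaces in callstrings and parameter values share a likely cause: the command is assembled as one string and split on whitespace afterwards. The following is only a sketch of one possible way around that, not HAL's actual code. It assumes the callstring template is already available as a list of per-argument tokens; the class =CallStringSketch=, the method =expand=, and the example paths are hypothetical names introduced for illustration. Each token is expanded on its own, so a substituted value containing spaces stays inside a single argument.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of HAL): expands $name$ placeholders one argument at a time. */
final class CallStringSketch {

    /**
     * Substitutes $name$ placeholders inside each template token separately.
     * Because the result is a list with one entry per argument, a value such as
     * "/data/my instances/a.cnf" is never re-split on its spaces.
     */
    static List<String> expand(List<String> templateTokens, Map<String, String> values) {
        List<String> argv = new ArrayList<>();
        for (String token : templateTokens) {
            String arg = token;
            for (Map.Entry<String, String> e : values.entrySet()) {
                // String.replace performs literal (non-regex) replacement
                arg = arg.replace("$" + e.getKey() + "$", e.getValue());
            }
            argv.add(arg);
        }
        return argv;
    }

    public static void main(String[] args) {
        // Assumed example template: "$val$ blah" must reach the target as ONE argument
        List<String> template = List.of("solver", "-param", "$val$ blah", "-inst", "$instance$");
        Map<String, String> values = Map.of("val", "0.5", "instance", "/data/my instances/a.cnf");

        List<String> argv = expand(template, values);
        for (String a : argv) {
            System.out.println("[" + a + "]");
        }
        // The list could then be handed to new ProcessBuilder(argv), which passes
        // each element to the child process as a separate argument without a shell.
    }
}
```

Passing an argument list to =java.lang.ProcessBuilder= avoids shell quoting rules entirely, which is why this sketch does not attempt to add quotes. Where a real shell string is unavoidable (for example when submitting through SSH or a cluster scheduler), the values would still need shell-specific escaping.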
Topic revision: r45 - 2011-01-05 - mavc
Copyright © 2008-2025 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.