| Name | Description |
| --- | --- |
| (generateData_<Name>.m files) | |
| 3D | Generates points scattered randomly around a linear 2D subspace of 3D space |
| 3DQuad | Generates noisy samples from a quadratic surface |
| 4grid | Generates noisy labelled samples from 4 classes that are separable into 4 quadrants |
| 5grid | Generates labelled samples from 5 classes that are separable into 5 regions of a grid |
| binary | Generates samples from two classes with a horizontal boundary |
| circular | Points in one class surround the other class |
| clustering | Generates samples from a randomly chosen mixture of bivariate Gaussians |
| clustersXonly | Generates a 2D sample of 5 tightly clustered groups of points in the plane, with no target labels |
| consGroups | Two distinct groups of constant-mean points centered at -0.5 and 0.5 respectively, separated by a discontinuity at 0 |
| constant | Points are a noisy sample from a straight line centered at the origin |
| curved | Points in each class are separable with a curved boundary |
| Gauss | Points follow a noisy univariate Gaussian distribution |
| gridMulti | Generates a noisy sample, with target labels, from 5 classes that live in 5 separate regions of the graph but overlap substantially |
| groups | Points in each class are generated from orthogonal lines, forming an X with much overlap due to added noise |
| irrelevFeatures | Generates a 10-D model matrix in which most features are highly correlated but uninformative |
| linear | Points are a noisy sample from a line with constant slope |
| linGroups | Points are generated from -abs(x) with added noise over [-1,1] |
| multiLabel | Generates 2D points with multiple labels, represented as decimal numbers that encode binary label memberships. E.g., if 5 classes exist and point x_1 belongs to classes 1 and 3, its label is the binary number 00101, which is stored as the decimal number 5. Note: the multilabel MLP function converts the decimal representation back to the binary representation for input to the network. |
| multiXonly | Generates a labelled dataset with two clusters centered at (3,3) and (-3,-3) respectively |
| ordinal_1D | Generates 2D points whose class labels have an order imposed along a line |
| ordinal_2D | Generates a 2D set of points whose labels have an order imposed according to the radial distance of the points from a center in the plane |
| outliersXonly | Generates samples from a bivariate standard normal distribution, with a few outliers sampled from a bivariate normal with mean (5,5) |
| ovalXonly | Generates a 2D sample from an oval region centered at the origin |
| quad | Points are a noisy sample from a quadratic function of the inputs |
| rare | Generates data from one majority class plus relatively few points from a second class |
| robustness | Generates points that are nearly linearly separable except for a few far outliers |
| sigmoid | Points follow a noisy sigmoid shape |
| slanted | Points in each class are generated on either side of a linear boundary, with some overlap due to added noise |
| spreadOut | 90% of points are scattered around a line with constant slope, and the remaining 10% are scattered widely away from that line |
| swissRoll | Generates a 3D dataset that resembles a plane curling inwards, intended for manifold learning |
| vert | Points in each class are generated as two groups, stacked on top of each other with some overlap due to added noise |
| volatile | Generates a noisy sample from a stepwise function of sigmoid, Gaussian, and quadratic components |
| (<Name>.dat files) | |
| animals | Contains classification features for various animals, intended for clustering |
| cities | Contains quality of life ratings for major U.S. metropolitan areas |
| data_exponential | Contains 2D points of two classes that are nearly linearly separable in a standard linear basis |
| outliersData | Contains 2D data exhibiting a strongly linear relationship but with numerous outliers |
| regressOnOne | Contains 2D data with constant variance across the range of x exhibiting a linear relationship between the dependent and independent variables |
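
The decimal label encoding described for the multiLabel dataset can be sketched as follows. This is a minimal illustration in Python, not the package's own MATLAB code: the function names `encode_labels` and `decode_label` are hypothetical, and the order of the decoded membership vector (class 1 first) is an assumption. The only detail taken from the table is that class 1 maps to the least significant bit, so membership in classes 1 and 3 out of 5 gives binary 00101, stored as decimal 5.

```python
def encode_labels(classes):
    """Pack 1-based class memberships into a single decimal label.

    Hypothetical helper: class k sets bit k-1, so class 1 is the
    least significant bit, matching the 00101 -> 5 example.
    """
    label = 0
    for k in classes:
        label |= 1 << (k - 1)
    return label


def decode_label(label, n_classes):
    """Unpack a decimal label into a 0/1 membership list.

    Assumption: the list is ordered class 1 first, class n_classes last.
    """
    return [(label >> k) & 1 for k in range(n_classes)]


label = encode_labels([1, 3])
print(label)                   # 5
print(format(label, "05b"))    # 00101
print(decode_label(label, 5))  # [1, 0, 1, 0, 0]
```

Storing one integer per point keeps the target column the same shape as in the single-label datasets; the binary membership vector is only expanded when a model needs one output per class.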