Row | Name | Description |
---|---|---|

2 | (generateData_<Name>.m files) | |

3 | 3D | Generates points scattered randomly around a linear 2D subspace of 3D space |

4 | 3DQuad | Generates noisy samples from a quadratic surface |

5 | 4grid | Generates noisy labelled samples from 4 classes that are separable into 4 quadrants |

6 | 5grid | Generates labelled samples from 5 classes that are separable into 5 regions of a grid |

7 | binary | Generates samples from two classes with a horizontal boundary |

8 | circular | Points in one class surround the other class |

9 | clustering | Generates samples from a randomly-chosen mixture of bivariate Gaussians |

10 | clustersXonly | Generates a 2D sample of 5 tightly clustered groups of points in the plane, with no target labels |

11 | consGroups | Two distinct groups of constant-mean points centered at -0.5 and 0.5 respectively, separated by a discontinuity at 0 |

12 | constant | Points are a noisy sample from a straight line centered at the origin |

13 | curved | Points in each class are separable with a curved boundary |

14 | Gauss | Points follow a noisy univariate Gaussian distribution |

15 | gridMulti | Generates a noisy sample, with target labels, from 5 classes that live in 5 separate regions of the graph but overlap substantially |

16 | groups | Points in each class are generated from orthogonal lines, forming an X with much overlap due to added noise |

17 | irrelevFeatures | Generates a 10-D model matrix in which most features are highly correlated but uninformative |

18 | linear | Points are a noisy sample from a line with constant slope |

19 | linGroups | Points generated from -abs(x) under noise over [-1,1] |

20 | multiLabel | Generates 2D points with multiple labels, where each point's binary label memberships are stored as a single decimal number. E.g., if 5 classes exist and point x_1 belongs to classes 1 and 3, its label is the binary number 00101, which is stored as the decimal number 5. Note: the multilabel MLP function converts the decimal representation back to binary for input to the network. |

21 | multiXonly | Generates a labelled dataset with two clusters centered at (3,3) and (-3,-3) respectively |

22 | ordinal_1D | Generates 2D data where class labels have an order imposed along a line |

23 | ordinal_2D | Generates a 2D set of points with labels that have an order imposed according to a radial distance of the points from a center in the plane |

24 | outliersXonly | Generates samples from a bivariate standard normal distribution, with a few outlier samples drawn from a bivariate normal with mean (5,5) |

25 | ovalXonly | Generates a 2D sample from an oval region centered at the origin |

26 | quad | Points are a noisy sample from a quadratic function of inputs |

27 | rare | Generates data from one majority class, with relatively few points from a second class |

28 | robustness | Generates points that are nearly linearly separable except for a few far outliers |

29 | sigmoid | Points follow a noisy sigmoid shape |

30 | slanted | Points in each class are generated on either side of a linear boundary, with some overlap due to added noise |

31 | spreadOut | 90% of the points are scattered around a line with constant slope, and the remaining 10% are scattered widely away from that line |

32 | swissRoll | Generates a 3D dataset that resembles a plane curling inwards, intended for manifold learning |

33 | vert | Points in each class are generated as two groups, stacked on top of each other, with some overlap due to added noise |

34 | volatile | Generates a noisy sample from a stepwise function of sigmoid, Gaussian, and quadratic components |

35 | (<Name>.dat files) | |

36 | animals | Contains classification features for various animals, intended for clustering |

37 | cities | Contains quality of life ratings for major U.S. metropolitan areas |

38 | data_exponential | Contains 2D points of two classes that are nearly linearly separable in a standard linear basis |

39 | outliersData | Contains 2D data exhibiting a strongly linear relationship but with numerous outliers |

40 | regressOnOne | Contains 2D data exhibiting a linear relationship between the dependent and independent variables, with constant variance across the range of x |
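
The decimal label encoding described for multiLabel in the table above can be illustrated with a short sketch. It is written in Python purely for illustration; the generators themselves are MATLAB files, and the function names `encode_labels` and `decode_labels` are hypothetical, not part of the package. It assumes class 1 maps to the least significant bit, which matches the table's example (classes 1 and 3 give 00101).

```python
def encode_labels(classes):
    """Pack 1-based class indices into one integer (class k sets bit k-1).

    Assumption: class 1 is the least significant bit, as in the table's example.
    """
    code = 0
    for k in classes:
        code |= 1 << (k - 1)
    return code


def decode_labels(code, n_classes):
    """Unpack an integer label into a 0/1 membership list, class 1 first."""
    return [(code >> k) & 1 for k in range(n_classes)]


label = encode_labels([1, 3])          # point belongs to classes 1 and 3
membership = decode_labels(label, 5)   # 5 classes in total
print(label, format(label, "05b"), membership)
# prints: 5 00101 [1, 0, 1, 0, 0]
```

Storing one integer per point keeps the target a single column; the membership vector is only expanded when a model needs one indicator per class.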