svm-subset: A subset selection tool for libsvm

SYNOPSIS

svm-subset [ -s method ] dataset number [ output1 ] [ output2 ]

DESCRIPTION

Training on a large data set is time-consuming, so it is often useful to work on a smaller subset first. The Python script subset.py randomly selects a specified number of samples. For classification data, stratified selection is provided so that the subset keeps the same class distribution as the full data set.
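
The idea of stratified selection can be sketched in Python as follows. This is only an illustration and not the code of subset.py: the function name stratified_indices, its seed parameter, and the largest-remainder rounding used to hand out leftover slots are all assumptions made for the example.

```python
import random
from collections import defaultdict

def stratified_indices(labels, k, seed=None):
    """Pick k indices so that each class keeps roughly its share of the data.

    Hypothetical helper for illustration; not taken from subset.py.
    """
    n = len(labels)
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and the number of samples")

    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    # Proportional quota per class, rounded down.
    quotas = {y: k * len(idx) // n for y, idx in by_class.items()}

    # Give the slots lost to rounding to the classes with the largest
    # remainders (assumption: largest-remainder rounding).
    leftover = k - sum(quotas.values())
    by_remainder = sorted(by_class,
                          key=lambda y: (k * len(by_class[y])) % n,
                          reverse=True)
    for y in by_remainder[:leftover]:
        quotas[y] += 1

    # Draw the quota for each class uniformly at random, without replacement.
    chosen = []
    for y, idx in by_class.items():
        chosen.extend(rng.sample(idx, quotas[y]))
    return sorted(chosen)
```

With 60 samples of one class and 40 of another, asking for 10 samples under this sketch draws 6 from the first class and 4 from the second.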

OPTIONS

-s method

0 -- stratified selection (classification only) (default)
1 -- random selection

output1

The subset. If output1 is omitted, the subset is printed on the screen.

output2

The rest of the data.

FILES

See svm-train(1) for the format of the dataset.

EXAMPLES

svm-subset heart_scale 100 file1 file2

100 samples are selected from heart_scale and stored in file1. Because -s is not given, the default stratified selection is used. All remaining instances are stored in file2.
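
The split into a subset and the rest can be pictured with the short Python sketch below. It uses plain random selection (as with -s 1) rather than the default stratified method, and the function name split_lines and its seed parameter are hypothetical; the actual script may work differently.

```python
import random

def split_lines(lines, k, seed=None):
    """Split lines into (subset, rest), with k randomly chosen lines in subset.

    Hypothetical helper for illustration; not taken from subset.py.
    """
    rng = random.Random(seed)
    # Choose k distinct line positions uniformly at random.
    picked = set(rng.sample(range(len(lines)), k))
    # Keep the original order of the data in both outputs.
    subset = [line for i, line in enumerate(lines) if i in picked]
    rest = [line for i, line in enumerate(lines) if i not in picked]
    return subset, rest

# Usage corresponding to the example above (file names as in the example):
#   with open("heart_scale") as f:
#       subset, rest = split_lines(f.readlines(), 100)
#   with open("file1", "w") as f:
#       f.writelines(subset)
#   with open("file2", "w") as f:
#       f.writelines(rest)
```

Every input line ends up in exactly one of the two outputs, so file1 and file2 together contain the whole data set.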

BUGS

Please report bugs to the Debian BTS.

AUTHOR

Chih-Chung Chang, Chih-Jen Lin <[email protected]>, Chen-Tse Tsai <[email protected]> (packaging)

SEE ALSO

svm-train(1), svm-predict(1)

svm-subset (1)