![]() |
|
Functions | |
| gsl_rng * | apop_rng_alloc (int seed) |
| apop_data * | apop_jackknife_cov (apop_data *in, apop_model *model) |
| apop_data * | apop_bootstrap_cov (apop_data *data, apop_model *model, gsl_rng *rng, int iterations, char keep_boots, char ignore_nans) |
Copyright (c) 2006–2007 by Ben Klemens. Licensed under the modified GNU GPL v2; see COPYING and COPYING2.
| apop_data* apop_bootstrap_cov | ( | apop_data * | data, |
| apop_model * | model, | ||
| gsl_rng * | rng, | ||
| int | iterations, | ||
| char | keep_boots, | ||
| char | ignore_nans | ||
| ) |
Give me a data set and a model, and I'll give you the bootstrapped covariance matrix of the parameter estimates.
| data | The data set. An apop_data set where each row is a single data point. (No default) |
| model | An apop_model, whose estimate method will be used here. (No default) |
| iterations | How many bootstrap draws should I make? (default: 1,000) |
| rng | An RNG that you have initialized, probably with apop_rng_alloc. (Default: an RNG from apop_rng_get_thread) |
| keep_boots | If 'y', then add a page to the output apop_data set with the statistics calculated for each bootstrap iteration. They are packed via apop_data_pack, so use apop_data_unpack if needed. (Default: 'n') |
| ignore_nans | If 'y' and any of the elements in the estimation return NaN, then I will throw out that draw and try again. If 'n', then I will write that set of statistics to the list, NaN and all. I keep count of throw-aways; if there are more than iterations elements thrown out, then I throw an error and return with estimates using data I have so far. That is, I assume that NaNs are rare edge cases; if they are as common as good data, you might want to rethink how you are using the bootstrap mechanism. (Default: 'n') |
apop_data set whose matrix element is the estimated covariance matrix of the parameters. | out->error=='n' | NULL input data. |
| out->error=='N' | too many Nans.
|
| apop_data* apop_jackknife_cov | ( | apop_data * | in, |
| apop_model * | model | ||
| ) |
Give me a data set and a model, and I'll give you the jackknifed covariance matrix of the model parameters.
The basic algorithm for the jackknife (with many details glossed over): create a sequence of data sets, each with exactly one observation removed, and then produce a new set of parameter estimates using that slightly shortened data set. Then, find the covariance matrix of the derived parameters.
Jackknife or bootstrap? As a broad rule of thumb, the jackknife works best on models that are closer to linear. The worse a linear approximation does (at the given data), the worse the jackknife approximates the variance.
Sample usage:
| in | The data set. An apop_data set where each row is a single data point. |
| model | An apop_model, that will be used internally by apop_estimate. |
| out->error=='n' | NULL input data. |
apop_data set whose matrix element is the estimated covariance matrix of the parameters.