Welcome to mftrees’s documentation!

Training a Model

The first step in training a model is to generate training data from a source imagery mosaic, extra augment layers, and a target map. This is done with the mft.features program, which writes a .npz file containing the generated training features, along with metadata parameters that are passed through to subsequent steps in the modelling process.

The relevant parameters and the command synopsis follow.

mft.features

MOSAIC_FILE: An image (likely VRT) to chip and compute training features from

mft.features [OPTIONS] MOSAIC_FILE

Options

-t, --target-map <target_map>: A lower-resolution georeferenced target image that controls the chipping behavior and supplies the training data values

--bins <bins>: Number of frequency bins to use for spectra generation

--pixel-size <pixel_size>: Rescaled pixel size

-o, --out <out>

-a, --augment-file <augment_file>

Arguments

MOSAIC_FILE: Required argument
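As a hedged illustration of the synopsis above, the following Python sketch assembles an mft.features command line from the documented options only. The helper name features_command is hypothetical, and the file names (mosaic.vrt, target.tif, augment.tif, features.npz) and the --bins value are placeholders rather than defaults taken from mftrees. The sketch prints the command instead of running it; running it assumes mftrees is installed and the input files exist.

```python
import shlex


def features_command(mosaic, target_map, out, augment=None,
                     bins=None, pixel_size=None):
    """Build an argv list for mft.features using only the documented options."""
    argv = ["mft.features", "--target-map", target_map, "--out", out]
    if augment is not None:
        argv += ["--augment-file", augment]
    if bins is not None:
        argv += ["--bins", str(bins)]
    if pixel_size is not None:
        argv += ["--pixel-size", str(pixel_size)]
    argv.append(mosaic)  # MOSAIC_FILE is the single required positional argument
    return argv


# Placeholder paths and values, chosen for illustration only.
cmd = features_command("mosaic.vrt", "target.tif", "features.npz",
                       augment="augment.tif", bins=16)
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Building the arguments as a list, rather than as one string, avoids shell quoting problems when paths contain spaces.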

The next step is to compute a manifold embedding and train an xgboost regressor. These steps are accomplished using the mft.train program. This program outputs a model as a .joblib package that can then be applied to new data to make predictions.

The relevant parameters and the command synopsis follow.

mft.train

TRAINING_FILE: NumPy serialized file where ‘arr_0’ is the input feature matrix

mft.train [OPTIONS] TRAINING_FILE

Options

--embed, --no-embed: Transform features via sampled spectral embedding prior to fit

--n-components <n_components>: Number of features to use for Nystroem extension

--n-boosting-stages <n_boosting_stages>: Max number of Gradient Boosting Stages

-c, --n-clusters <n_clusters>: Number of k-means clusters

-d <d>: Number of output dimensions

-of <of>: .npz feature output filename

-s, --seed <seed>: Random seed for test/train partition

-lr, --learning-rate <learning_rate>: Learning rate for xgboost

--gpu

--hist

--approx

--tree-depth <tree_depth>: Max tree depth in ensemble

--augments-only: Use only augment values for fitting clustered data

--max-projection-samples <max_projection_samples>: Max number of approximated features to use for Spectral Embedding

Arguments

TRAINING_FILE: Required argument
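Since TRAINING_FILE is expected to hold the feature matrix under the key ‘arr_0’, it can be useful to check the file before training. The sketch below relies on the fact that an .npz file is a zip archive whose members are named after their keys with a .npy suffix, so the keys can be listed with the standard library alone (numpy.load would be the more usual route). It then assembles an mft.train command from a few of the documented options. The helper names npz_keys and train_command are hypothetical, and the file name and hyperparameter values are placeholders, not recommended settings.

```python
import shlex
import zipfile


def npz_keys(path):
    """List the array names stored in an .npz file without loading them.

    An .npz file is a zip archive whose members are named '<key>.npy'.
    """
    with zipfile.ZipFile(path) as archive:
        return [name[:-len(".npy")] for name in archive.namelist()
                if name.endswith(".npy")]


def train_command(training_file, embed=True, seed=None,
                  learning_rate=None, tree_depth=None):
    """Build an argv list for mft.train from a few of the documented options."""
    argv = ["mft.train", "--embed" if embed else "--no-embed"]
    if seed is not None:
        argv += ["--seed", str(seed)]
    if learning_rate is not None:
        argv += ["--learning-rate", str(learning_rate)]
    if tree_depth is not None:
        argv += ["--tree-depth", str(tree_depth)]
    argv.append(training_file)  # TRAINING_FILE is the required positional argument
    return argv


# Placeholder file name and hyperparameter values, for illustration only.
cmd = train_command("features.npz", seed=0, learning_rate=0.1, tree_depth=6)
print(shlex.join(cmd))
# Before training, the input could be checked with:
#   assert "arr_0" in npz_keys("features.npz")
```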

mft.histmatch

Histogram match a georeferenced raster to a reference

mft.histmatch [OPTIONS] IMG_PATH

Options

-o, --out_path <out_path>: classification output geotiff

-r, --ref_path <ref_path>: Reference mosaic used for baselayer matching

Arguments

IMG_PATH: Required argument
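The documentation does not describe how mft.histmatch performs the matching. As a rough illustration of the general idea only, the sketch below implements generic quantile-based histogram matching on plain Python lists: each source value is replaced by the reference value that sits at the same empirical quantile. This is an assumption about the technique in general, not the mft.histmatch implementation, which operates on georeferenced rasters; the function name match_histogram and the sample values are hypothetical.

```python
from bisect import bisect_right


def match_histogram(source, reference):
    """Map each source value to the reference value at the same empirical quantile."""
    src_sorted = sorted(source)
    ref_sorted = sorted(reference)
    n_src, n_ref = len(src_sorted), len(ref_sorted)
    matched = []
    for value in source:
        # Number of source samples <= value, i.e. the empirical CDF times n_src.
        rank = bisect_right(src_sorted, value)
        # Reference index at the same quantile: ceil(rank * n_ref / n_src) - 1,
        # computed in integers to avoid floating-point rounding.
        idx = (rank * n_ref + n_src - 1) // n_src - 1
        matched.append(ref_sorted[idx])
    return matched


# Toy values standing in for pixel intensities.
source = [10, 20, 20, 30, 40]
reference = [100, 110, 120, 130, 140]
print(match_histogram(source, reference))  # [100, 120, 120, 130, 140]
```

After matching, the output keeps the rank order of the source pixels but takes its values from the reference distribution, which is what makes two mosaics radiometrically comparable.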

mft.predict

MODEL_FILE: joblib-serialized carbon estimation model

mft.predict [OPTIONS] MODEL_FILE

Options

--mosaic-file <mosaic_file>: Preprocessed image mosaic file as a GeoTIFF

-a, --augment-file <augment_file>: Preprocessed augmentation data file as a GeoTIFF

-o, --out <out>: classification output geotiff

--blm, --no-blm: Base Layer Match mosaic to reference

--reference <reference>: Reference mosaic used for baselayer matching

Arguments

MODEL_FILE: Required argument
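To round off the options above, here is a hedged Python sketch that assembles an mft.predict command line. The helper name predict_command and all file names are hypothetical placeholders. It also assumes that --reference is only meaningful together with --blm, which the option descriptions suggest but do not state outright. The command is printed rather than executed.

```python
import shlex


def predict_command(model_file, mosaic, out, augment=None, reference=None):
    """Build an argv list for mft.predict using only the documented options."""
    argv = ["mft.predict", "--mosaic-file", mosaic, "--out", out]
    if augment is not None:
        argv += ["--augment-file", augment]
    if reference is not None:
        # Assumption: a reference mosaic is supplied only when base layer
        # matching is enabled.
        argv += ["--blm", "--reference", reference]
    else:
        argv.append("--no-blm")
    argv.append(model_file)  # MODEL_FILE is the required positional argument
    return argv


# Placeholder paths, chosen for illustration only.
cmd = predict_command("model.joblib", "mosaic.tif", "prediction.tif",
                      augment="augment.tif", reference="reference.tif")
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```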
