# Exporting Statistics

After you’ve gated populations, you can export a variety of statistics.

Howto

- Click **export statistics** in the left sidebar.
- Select the statistics to calculate.
- Select the populations to calculate in the **Populations** selector.
- *For mean, median, geometric mean, CV, StdDev, MAD or quantile statistics,* select the channels to calculate in the **Channels** selector.
- Select the FCS files to calculate from in the **FCS Files** selector.
- Select the compensation to use for gating.
- Select the output file format (TSV or CSV with or without header, or JSON).
- Select the output file layout (see descriptions below).
- Click **download**.

For TSV and CSV exports, three layouts are available:

Layout | Description |
---|---|
Tall-Skinny | One row per combination of FCS file, population, statistic and channel. All statistics are in a single column titled `value`. This format is ideal for use with applications such as TIBCO Spotfire® that filter rows to isolate the data of interest. |
Medium | One row per combination of FCS file, population and channel. Each statistic is in a separate column. |
Short-Wide | One row per FCS file. Each combination of population, statistic and channel is in a separate column. This format is provided for users accustomed to FlowJo® output. This format is not readily machine-parsable and cannot include population IDs (only names). |

Exports include file annotations for convenience.

Exports optionally include the IDs of FCS files and populations in addition to their names. If you are consuming exported data in analysis scripts, IDs provide an immutable reference, unlike names, which can be changed by users.
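As a rough illustration of consuming a Tall-Skinny export in a script, the sketch below parses a small inline TSV and indexes each value by ID rather than by name. Only the `value` column name comes from the layout description above; the other column names (`fcsFileId`, `populationId`, `statistic`, `channel`) and the sample rows are assumptions made for this example, so check the header of an actual export before relying on them.

```python
import csv
import io

# Hypothetical Tall-Skinny export. Only the `value` column name is documented;
# all other column names and all rows here are assumptions for illustration.
SAMPLE_TSV = (
    "fcsFileId\tpopulationId\tstatistic\tchannel\tvalue\n"
    "file1\tpop1\tmedian\tFITC-A\t1520.5\n"
    "file1\tpop1\tmean\tFITC-A\t1710.2\n"
    "file2\tpop1\tmedian\tFITC-A\t1498.0\n"
)


def load_tall_skinny(text):
    """Index values by (file ID, population ID, statistic, channel)."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    table = {}
    for row in reader:
        key = (
            row["fcsFileId"],
            row["populationId"],
            row["statistic"],
            row["channel"],
        )
        table[key] = float(row["value"])
    return table


stats = load_tall_skinny(SAMPLE_TSV)

# Filter rows to isolate one statistic, keyed by FCS file ID.
medians = {key[0]: value for key, value in stats.items() if key[2] == "median"}
print(medians)
```

Keying on IDs means the script keeps working if someone later renames a file or population.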

The `uniquePopulationName` property has the names of parent populations prepended until the name is unique. If all of your population names are unique, then this value will be the same as the `population` name property.

## Statistic Types

Tip

NaN (not-a-number) or N/A values will occur in the following scenarios:

- When calculating channel statistics (mean, median, etc.) for a gate that contains no events
- When calculating the geometric mean for a gate that contains 0 or negative values

### Median

The median is a special case of a quantile and represents the center point of a set of observations. For an even number of observations, linear interpolation is used (i.e. the mean of the two middle values is reported).

Compared to the arithmetic mean, this value is less sensitive to outliers and is thus ideal for avoiding confounding effects of experimental noise.
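The following sketch, using Python's standard `statistics` module and made-up intensity values, shows both points: the median of an even-sized set is the mean of the two middle values, and a single outlier moves the mean far more than the median.

```python
import statistics

# Made-up intensity values for illustration only.
clean = [100.0, 110.0, 120.0, 130.0]
with_outlier = clean + [10000.0]

# Even number of observations: mean of the two middle values (110 and 120).
even_median = statistics.median(clean)

# One outlier barely shifts the median but dominates the mean.
outlier_median = statistics.median(with_outlier)
outlier_mean = statistics.mean(with_outlier)

print(even_median, outlier_median, outlier_mean)
```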

### (Arithmetic) Mean

### Quantile (Percentile)

*Definitions at MathWorld,
Wikipedia*

The threshold value below which the specified fraction of data points falls. For example, if the 90th percentile (0.9 quantile) is 23,104, then 90% of data points are below 23,104.

There are at least nine definitions of “quantile” in common use. CellEngine uses the median-based estimate definition (definition 8 in R and Hyndman and Fan 1996). This definition is continuous (meaning that it interpolates between values), independent of the underlying distribution of the data, and median-unbiased. Because of these and several other qualities, it is the definition recommended by Hyndman and Fan.
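For readers who want to reproduce this definition in their own scripts, here is a minimal pure-Python sketch of the Hyndman and Fan definition 8 estimator. The function name `quantile_type8` is hypothetical, and this is not CellEngine's actual implementation. It uses the 1-based position h = (n + 1/3)p + 1/3, clamped to the range [1, n], and interpolates linearly between the two neighboring order statistics.

```python
import math


def quantile_type8(values, p):
    """Median-unbiased quantile estimate (Hyndman and Fan definition 8).

    `p` is a fraction between 0 and 1. Returns NaN for empty input.
    """
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        return math.nan
    # 1-based fractional position of the quantile among the sorted values.
    h = (n + 1.0 / 3.0) * p + 1.0 / 3.0
    h = min(max(h, 1), n)  # clamp so both neighbors exist
    lower = math.floor(h)
    upper = min(lower + 1, n)
    fraction = h - lower
    # Linear interpolation between the two neighboring order statistics.
    return xs[lower - 1] + fraction * (xs[upper - 1] - xs[lower - 1])


print(quantile_type8([1, 2, 3, 4], 0.5))
print(quantile_type8(range(1, 10), 0.25))
```

With `p = 0.5` this reduces to the ordinary median. To the best of our knowledge, R's `quantile(x, p, type = 8)` and NumPy's `method="median_unbiased"` implement the same definition and can be used as a cross-check.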

### Geometric Mean

The geometric mean of n values is the nth root of their product or, equivalently, the exponential of the mean of their logarithms. It is undefined for populations that have any values less than zero because the logarithm of a negative number is a complex number. The geometric mean is also reported as undefined for populations that have any values equal to zero: a geometric mean of zero can be misleading because a single zero (which may be an outlier) in a dataset forces the entire geometric mean to zero.
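A minimal sketch of this behavior, assuming the log-based formulation and returning NaN whenever the result is undefined (empty input, zeros, or negative values), might look like the following. The function name is hypothetical and this is not CellEngine's own code.

```python
import math


def geometric_mean(values):
    """Return the geometric mean, or NaN when it is undefined."""
    values = list(values)
    # Undefined for an empty population or any zero or negative value.
    if not values or any(v <= 0 for v in values):
        return math.nan
    # exp(mean(log(x))) equals the nth root of the product of the values.
    return math.exp(sum(math.log(v) for v in values) / len(values))


print(geometric_mean([1.0, 10.0, 100.0]))
print(geometric_mean([1.0, 0.0, 5.0]))
```

Working in log space also avoids the overflow that multiplying many large intensity values together would cause.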

Tip

The use of geometric means in flow cytometry is largely a holdover from old,
analog cytometers that stored data in logarithmic form. The arithmetic mean
of log-transformed values is equal to the log of the geometric mean, so it
was more convenient to calculate that than convert from log back to linear.
Because modern instruments store high-resolution list-mode data, and because
the geometric mean cannot be calculated when the dataset contains negative
values (as is common with compensated and background-subtracted data), **the
median is generally a more suitable statistic**. In fact, the geometric mean
and the median are equal for log-normal distributions, and most biological
data is presumed to be log-normal, so in that regard they can be considered
interchangeable. See
page 235 of Shapiro’s *Practical Flow Cytometry*
for more information.

### Event Count

The event count is the number of events in a population.

(The word *event* is used instead of *cell* because flow cytometry may be used to
analyze a wide variety of particles, such as virions, bacteria, fungi and beads.)

### Percent of ___

The percent is the number of events in a population divided by the number of events in the specified ancestor population, expressed as a percentage.
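As a small worked example, the calculation can be sketched as below. The function name is hypothetical, and returning NaN for an empty ancestor population is an assumption of this sketch rather than documented behavior.

```python
import math


def percent_of(event_count, ancestor_event_count):
    """Events in a population as a percentage of an ancestor population."""
    if ancestor_event_count == 0:
        # Assumption: treat division by an empty ancestor as undefined.
        return math.nan
    return 100.0 * event_count / ancestor_event_count


# 2,500 events in the gate out of 10,000 events in the ancestor.
print(percent_of(2500, 10000))
```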

### Standard Deviation (StdDev)

*Definitions at MathWorld,
Wikipedia*

CellEngine reports the *population* standard deviation (as opposed to the
*sample* standard deviation).

### Coefficient of Variation (CV)

The coefficient of variation is the standard deviation divided by the mean, resulting in a relative variation metric (standard deviation is an absolute variation metric).
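The sketch below uses Python's standard `statistics` module and made-up values to contrast the population and sample standard deviations, and to compute the CV as a plain ratio as defined above. Whether an export scales the CV to a percentage is not stated here, so treat the unscaled ratio as an assumption.

```python
import statistics

# Made-up values chosen so the results are round numbers.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

population_sd = statistics.pstdev(values)  # divides by n
sample_sd = statistics.stdev(values)       # divides by n - 1, so it is larger

# CV as defined above: standard deviation divided by the mean.
cv = population_sd / statistics.mean(values)

print(population_sd, sample_sd, cv)
```

When comparing against another tool, check which of the two standard deviations it reports. The difference is noticeable for small populations and negligible for large ones.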

### Median Absolute Deviation (MAD)

The median of the absolute deviations from the median. This is a robust alternative to the standard deviation. Multiply this value by 1.4826 to obtain BD FACSDiva’s “robust standard deviation” (rSD) (see BD’s Tech Note).
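A minimal sketch of the MAD and the derived rSD, using made-up values and hypothetical function names, is shown below. It takes the median, then the median of the absolute distances from that median.

```python
import math
import statistics


def median_absolute_deviation(values):
    """Median of the absolute deviations from the median (NaN if empty)."""
    values = list(values)
    if not values:
        return math.nan
    center = statistics.median(values)
    return statistics.median(abs(v - center) for v in values)


def robust_sd(values):
    """MAD scaled by 1.4826, as described above for the rSD."""
    return 1.4826 * median_absolute_deviation(values)


data = [1, 1, 2, 2, 4, 6, 9]  # median is 2; absolute deviations have median 1
print(median_absolute_deviation(data), robust_sd(data))
```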

### See also

- Exporting dose-response values.