How to Analyze qPCR Data: A Complete Guide

qPCR analysis is a five-step pipeline: get clean Ct values, choose a quantification method that matches your experimental design, normalize to validated reference genes, compute fold changes per biological replicate, and run statistics on the ΔCt scale instead of on fold changes. None of these steps are hard in isolation. Most data-quality problems come from doing them in the wrong order or skipping the validation steps.

The mistakes I see most often are always the same four. People run t-tests on fold-change values that are log-normal. They normalize to a single reference gene because that's what the protocol said, without ever checking whether GAPDH is actually stable in their treatment condition. They ignore primer efficiency mismatch between target and reference. And they treat technical replicates as separate samples, inflating n and getting p < 0.05 on essentially three biological observations. Below is the workflow that avoids all of that.

Start with clean Ct values

Before any math, you need clean Ct values — export them from your instrument and inspect them, don't just feed the file into a spreadsheet.

Instrument exports vary in format but carry the same core fields: well, sample, target, Ct (or Cq), and sometimes Tm and efficiency. A QuantStudio gives .eds or .xls; a CFX96 gives .pcrd or CSV; a LightCycler 480 gives .ixo; a Rotor-Gene Q gives .rex. The column you want is labeled Ct, Cq, or Cp depending on the vendor (Roche calls it Cp, and Cq is the MIQE-standard name); they all mean the same thing.

Check the baseline and threshold first. Most instruments auto-set these and they're usually right, but scan the amplification plots: weird sigmoidal shapes, baseline drift in early cycles, or an exponential phase starting before cycle 5 mean you should reset the threshold manually before exporting. A bad threshold trashes efficiency calculations.

Next, technical replicate agreement. Triplicates should agree within 0.5 Ct; flag any set that spreads by more than 1 Ct. Usually one of the three is the outlier: drop it and check that the other two agree. If the remaining two still diverge, the sample is suspect, not a single well.
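The replicate check can be sketched in a few lines of Python. This is a minimal illustration, not a vendor algorithm: the function name `check_replicates`, the status labels, and the choice to drop the well farthest from the median are all assumptions made for the example; only the 0.5 Ct and 1 Ct thresholds come from the text.

```python
from statistics import mean, median

def check_replicates(cts, ok_spread=0.5, flag_spread=1.0):
    """Classify one set of technical replicates by Ct spread (max - min).

    Returns (status, ct), where ct is the value to carry forward,
    or None if the sample should be re-run. Labels are hypothetical.
    """
    spread = max(cts) - min(cts)
    if spread <= ok_spread:
        return "ok", mean(cts)
    if spread <= flag_spread:
        # Between the two thresholds: keep the mean but look at the curves.
        return "review", mean(cts)
    # Spread > flag threshold: drop the well farthest from the median
    # and see whether the remaining wells agree.
    mid = median(cts)
    outlier = max(cts, key=lambda ct: abs(ct - mid))
    kept = list(cts)
    kept.remove(outlier)
    if max(kept) - min(kept) <= ok_spread:
        return "outlier_dropped", mean(kept)
    return "suspect", None

print(check_replicates([24.1, 24.2, 24.3]))
print(check_replicates([24.1, 24.2, 25.6]))
print(check_replicates([24.0, 25.0, 26.2]))
```

Whatever rule you use, record which wells were dropped, since outlier handling has to be reported later.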

Sanity-check the no-template controls. An NTC should not amplify at all, or should amplify only after Ct 35 and well past your latest experimental Ct. An NTC at Ct 30 means primer-dimer (a melt peak below the product Tm) or carryover contamination (a melt peak at the product Tm), and either way the plate is not trustworthy. The NRT (RT-minus) control should match the NTC; an earlier NRT means genomic DNA contamination.

For SYBR Green, inspect every melt curve. A single clean peak at the expected Tm is good; shoulders or extra peaks mean non-specific product and that Ct is meaningless. TaqMan skips this check, which is one reason to use it. See SYBR Green vs TaqMan, melt curve analysis, and the qPCR troubleshooting guide for what to do when these checks fail.

Pick a quantification method

Four methods cover almost every qPCR experiment. Pick by what you measured and how matched your primer efficiencies are.

ΔΔCt (Livak & Schmittgen 2001) is the default for relative gene expression when target and reference primer efficiencies match within ~5%, both in the 90–110% range, and you have a clear control group. Most experiments fit. See the step-by-step DDCt walkthrough.

Pfaffl (Pfaffl 2001) is the right answer when target and reference efficiencies diverge. ΔΔCt assumes both amplification factors equal exactly 2 (100% efficiency); if your target amplifies at 95% efficiency and your reference at 105%, ΔΔCt reports a fold change off by a multiplicative factor that grows with ΔCt. Pfaffl substitutes the measured amplification factors: ratio = (E_target)^ΔCt_target / (E_reference)^ΔCt_reference, where E = 1 + efficiency (1.95 for 95%) and each ΔCt is Ct_control − Ct_treated for that gene. Measure each primer pair's efficiency once per assay with a standard curve. See Pfaffl when DDCt assumptions fail.
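As a rough sketch of the Pfaffl calculation, the function below takes amplification factors (E = 1 + efficiency, so 2.0 at 100%) and defines each gene's ΔCt as control minus treated, following Pfaffl's original convention. The function name, argument names, and the example Ct values are illustrative assumptions; the 95% and 105% efficiencies are the ones from the paragraph above.

```python
def pfaffl_ratio(e_target, e_ref,
                 ct_target_ctrl, ct_target_trt,
                 ct_ref_ctrl, ct_ref_trt):
    """Efficiency-corrected expression ratio (treated relative to control).

    e_target, e_ref: amplification factors, e.g. 1.95 for 95% efficiency.
    Each gene's delta Ct is control minus treated.
    """
    d_target = ct_target_ctrl - ct_target_trt
    d_ref = ct_ref_ctrl - ct_ref_trt
    return (e_target ** d_target) / (e_ref ** d_ref)

# Hypothetical Ct values for one treated sample against the control mean.
cts = dict(ct_target_ctrl=28.5, ct_target_trt=24.0,
           ct_ref_ctrl=22.0, ct_ref_trt=22.2)

ideal = pfaffl_ratio(2.0, 2.0, **cts)        # reduces to 2^-ddCt
corrected = pfaffl_ratio(1.95, 2.05, **cts)  # measured efficiencies
print(f"assuming E = 2: {ideal:.1f}  efficiency-corrected: {corrected:.1f}")
```

With both factors set to 2 the function reduces to the ΔΔCt result, which makes it easy to see how much of a reported fold change depends on the equal-efficiency assumption.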

Standard curve quantification gives absolute copy numbers per reaction. Use it for viral load, ChIP-qPCR, gene copy number, miRNA targets, or any cross-plate comparison. The curve should span 5–6 orders of magnitude of dilution, have R² > 0.98, and give an efficiency in the 90–110% window. See standard curve best practices and the broader absolute vs relative quantification trade-off.
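Efficiency and R² both come from a straight-line fit of Ct against log10 of input quantity, with efficiency = 10^(−1/slope) − 1. The sketch below does the fit with plain least squares; the function name and the synthetic dilution series are assumptions for illustration, not instrument output.

```python
import math

def standard_curve(quantities, cts):
    """Fit Ct = slope * log10(quantity) + intercept; report efficiency and R^2."""
    x = [math.log10(q) for q in quantities]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(cts) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in cts)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    amplification = 10 ** (-1 / slope)   # 2.0 means perfect doubling
    return {
        "slope": slope,
        "intercept": intercept,
        "r2": sxy ** 2 / (sxx * syy),
        "amplification": amplification,
        "efficiency": amplification - 1,  # 1.0 means 100%
    }

# Synthetic 10-fold series spanning six logs with perfect doubling.
quantities = [10 ** k for k in range(1, 7)]
cts = [40 - math.log2(q) for q in quantities]
fit = standard_curve(quantities, cts)
print(f"slope {fit['slope']:.3f}, efficiency {fit['efficiency']:.1%}, R2 {fit['r2']:.4f}")
```

A slope near −3.32 corresponds to 100% efficiency; the 90–110% window is roughly a slope between −3.6 and −3.1.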

geNorm (Vandesompele et al. 2002) is a reference-gene selection tool, not strictly a quantification method. Use it before ΔΔCt or Pfaffl when you don't know which housekeepers are stable in your conditions. More on that in the next section.

Practical decision rule: start with ΔΔCt; if your standard curves show > 5% efficiency divergence, switch to Pfaffl; if you need copy numbers, run a standard curve; always validate reference genes with geNorm at least once per cell type.

Choose and validate reference genes

The single biggest source of bad qPCR data in the literature is unvalidated reference genes. Vandesompele et al. (2002) showed that the so-called "housekeeping" genes — GAPDH, ACTB, 18S — are not stable across tissues, treatments, or developmental stages. Normalize to a gene that itself changes with your treatment and the fold changes you report are artifacts of the reference, not the target.

Never normalize to a single reference gene. Pick 2–3 candidates from a diverse pool (GAPDH, ACTB, HPRT1, B2M, 18S, RPL13A, YWHAZ, and TBP are common starting points) and validate stability with geNorm or NormFinder (Andersen et al. 2004). geNorm assigns each gene an M value, its average pairwise variation against the other candidates, where lower means more stable; genes with M < 1.5 are reasonably stable. NormFinder computes stability values that explicitly model intra- and inter-group variation, which makes it the better choice when you have multiple experimental groups.
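To show what the M value measures, here is a simplified sketch of the pairwise-variation step: for every pair of candidates, take the per-sample log2 expression ratio, compute its standard deviation across samples, and average those standard deviations for each gene. It assumes an amplification factor of 2 for every assay (so the log2 ratio is just a Ct difference) and omits geNorm's stepwise removal of the least stable gene. The function name and Ct values are invented for the example.

```python
from itertools import combinations
from statistics import mean, stdev

def genorm_m(ct_by_gene):
    """Return {gene: M} from {gene: [Ct per sample]} (same sample order per gene)."""
    genes = list(ct_by_gene)
    pair_sd = {}
    for a, b in combinations(genes, 2):
        # log2(expression_a / expression_b) per sample, assuming E = 2
        log_ratio = [cb - ca for ca, cb in zip(ct_by_gene[a], ct_by_gene[b])]
        pair_sd[frozenset((a, b))] = stdev(log_ratio)
    return {
        g: mean(pair_sd[frozenset((g, other))] for other in genes if other != g)
        for g in genes
    }

# Hypothetical Cts across four samples: A and B track each other, C does not.
candidates = {
    "A": [20.0, 21.0, 20.5, 21.5],
    "B": [22.0, 23.0, 22.5, 23.5],
    "C": [18.0, 18.0, 21.0, 17.0],
}
m_values = genorm_m(candidates)
for gene, m in sorted(m_values.items(), key=lambda kv: kv[1]):
    print(f"{gene}: M = {m:.2f}")
```

Genes that move together across samples get low M values; a gene that responds to the treatment stands out with the highest M.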

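Normalize to the geometric mean of your validated references' relative quantities, not their arithmetic mean. Because Ct is already a log₂-scale measurement, that geometric mean is simply the arithmetic mean of the reference Ct values. Averaging linear-scale quantities arithmetically lets the most abundant (lowest-Ct) reference dominate the normalization factor.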
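It helps to write out how the two averages relate. Assuming every reference amplifies at 100% efficiency, so that relative quantity is proportional to 2^(−Ct) (an idealization), the geometric mean of n reference quantities is:

```latex
\left( \prod_{i=1}^{n} 2^{-\mathrm{Ct}_i} \right)^{1/n}
  = 2^{-\frac{1}{n} \sum_{i=1}^{n} \mathrm{Ct}_i}
```

In words: taking the geometric mean on the quantity scale is the same as averaging the reference Ct values arithmetically and then exponentiating.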

Validate once per cell type or tissue, not once per experiment. Going from HEK293 to primary hepatocytes? Redo it; your stable panel will change. See picking the best reference genes for tissue-specific candidates and multi-reference normalization for the geometric-mean math.

Compute fold change correctly

The math is mechanical once you have validated reference genes and a method. The trap is in the order of operations: do it per biological replicate, not by pooling.

  1. For each (sample, target) pair, average the technical replicate wells → one Ct per pair.
  2. For each biological replicate, compute ΔCt = Ct_target − Ct_reference (with several references, Ct_reference is the arithmetic mean of their Ct values, which equals the geometric mean of their quantities).
  3. For each biological replicate, controls included, compute ΔΔCt = ΔCt_replicate − mean(ΔCt_control).
  4. Fold change = 2^−ΔΔCt.

Worked example: three control mice and three treated mice, measuring IFNB1 relative to HPRT1.

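| Mouse | Ct IFNB1 | Ct HPRT1 | ΔCt | ΔΔCt | Fold |
|---|---|---|---|---|---|
| Ctrl-1 | 28.5 | 22.0 | 6.5 | 0 | 1.00 |
| Ctrl-2 | 28.7 | 22.1 | 6.6 | 0.1 | 0.93 |
| Ctrl-3 | 28.3 | 21.9 | 6.4 | −0.1 | 1.07 |
| Trt-1 | 24.0 | 22.2 | 1.8 | −4.7 | 26.0 |
| Trt-2 | 24.4 | 22.0 | 2.4 | −4.1 | 17.1 |
| Trt-3 | 23.8 | 22.1 | 1.7 | −4.8 | 27.9 |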

Control ΔCt mean is 6.5. Per-mouse ΔΔCt = ΔCt − 6.5; per-mouse fold change = 2^−ΔΔCt. The number to report is the geometric mean fold change (~23.2, which is 2 raised to minus the mean treated ΔΔCt of −4.53) with SD computed on the ΔΔCt scale, not on fold change directly. Pooling all treatment Ct values and computing one fold change throws away between-mouse variability and leaves you no way to compute a confidence interval.
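The same per-replicate calculation can be sketched in Python using the Ct values from the table. Variable names are illustrative, and the Cts are assumed to be already averaged over technical replicates.

```python
from statistics import mean, stdev

# (Ct IFNB1, Ct HPRT1) per mouse, technical replicates already averaged
control = {"Ctrl-1": (28.5, 22.0), "Ctrl-2": (28.7, 22.1), "Ctrl-3": (28.3, 21.9)}
treated = {"Trt-1": (24.0, 22.2), "Trt-2": (24.4, 22.0), "Trt-3": (23.8, 22.1)}

# Step 2: delta Ct per biological replicate
dct_control = {m: t - r for m, (t, r) in control.items()}
dct_treated = {m: t - r for m, (t, r) in treated.items()}

# Step 3: delta-delta Ct against the mean control delta Ct
baseline = mean(dct_control.values())
ddct = {m: d - baseline for m, d in dct_treated.items()}

# Step 4: fold change per mouse
fold = {m: 2 ** -d for m, d in ddct.items()}

# Summary statistics are computed on the ddCt scale, then back-transformed
ddct_values = list(ddct.values())
mean_ddct = mean(ddct_values)
sd_ddct = stdev(ddct_values)
geo_mean_fold = 2 ** -mean_ddct
fold_low = 2 ** -(mean_ddct + sd_ddct)    # lower end of mean +/- 1 SD
fold_high = 2 ** -(mean_ddct - sd_ddct)   # upper end of mean +/- 1 SD

for mouse in fold:
    print(f"{mouse}: ddCt = {ddct[mouse]:+.2f}, fold = {fold[mouse]:.1f}")
print(f"geometric mean fold = {geo_mean_fold:.1f} ({fold_low:.1f} to {fold_high:.1f})")
```

Because the SD is applied on the log scale and then back-transformed, the interval around the geometric mean is asymmetric, which is the expected shape for fold-change data.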

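If your standard curves showed efficiency divergence, replace steps 2–4 with Pfaffl: ratio = (E_target)^ΔCt_target / (E_reference)^ΔCt_reference, computed per biological replicate, where each ΔCt is mean(Ct_control) − Ct_replicate for that gene and E is the measured amplification factor (2 at 100% efficiency).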

Run statistics on the right scale

This is where most published qPCR analyses fall apart. Fold change is approximately log-normal: many small values close to 1 and a long right tail for upregulated genes. t-tests and ANOVA assume the input is roughly normal. Run a t-test on raw fold change and your p-values are unreliable, sometimes spuriously significant because a value in the long tail looks like a treatment effect.

The fix is mechanical: run all statistics on ΔCt (equivalently, log₂ fold change), then back-transform the means to fold change for plotting. ΔCt is normally distributed for well-behaved assays, so t-tests, ANOVA, and mixed models all work as advertised. For two groups: a two-sample t-test on ΔCt. For three or more: one-way ANOVA followed by Tukey HSD or Bonferroni-corrected pairwise comparisons. If samples are split across multiple plates, use a linear mixed model with plate as a random effect — otherwise you're treating between-plate variation as biological signal.
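For the two-group case, a minimal sketch of a t-test on the ΔCt scale follows. It uses Welch's unequal-variance form, which is a choice made here rather than something the text prescribes, and it gets the p-value by numerically integrating the Student t density so that the block needs only the standard library. In practice a statistics package would do this for you, for example `scipy.stats.ttest_ind(a, b, equal_var=False)`. The ΔCt values are hypothetical, and the function names are made up for the example.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson integration of the Student t density."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: c * (1 + x * x / df) ** (-(df + 1) / 2)
    h = abs(t) / steps                      # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(abs(t))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 1 - 2 * total * h / 3)

# One delta Ct per biological replicate (n = 3 per group), hypothetical values
dct_control = [6.5, 6.6, 6.4]
dct_treated = [1.8, 2.4, 1.7]

t_stat, df = welch_t(dct_control, dct_treated)
p_value = two_sided_p(t_stat, df)
log2_fc = mean(dct_control) - mean(dct_treated)   # equals -ddCt
print(f"t = {t_stat:.2f}, df = {df:.2f}, p = {p_value:.4f}, fold = {2 ** log2_fc:.1f}")
```

The test runs on ΔCt; the fold change is only computed at the end, by back-transforming the difference in group means.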

Your n is biological replicates, not technical replicates. Technical replicates are pre-averaged in step 1 above; they're a precision measurement for each biological sample, not independent observations. Three mice with three technical replicates each is n = 3, not n = 9. Treating n = 9 inflates statistical power, generates false positives, and is one of the most common errors caught in peer review.

For plotting: mean ± SD (or 95% CI) on a log₂ y-axis with fold-change labels on a secondary axis. Overlay individual data points on the bar or box so a single outlier driving the effect is visible. Avoid SEM — for n = 3, SD is more honest than the SEM bar, which looks artificially small. See analyzing biological replicates for the worked statistics.

Report with MIQE-grade detail

The MIQE guidelines (Bustin et al. 2009) define what a complete qPCR report looks like, and most journals now expect adherence. The full checklist is 80+ items but the essential subset fits in a paragraph.

For methods, include: instrument model (QuantStudio 5, CFX96, LightCycler 480 II, etc.); RNA extraction kit and quality metrics (RIN if applicable); cDNA synthesis kit and input mass; master mix; primer sequences with concentrations (usually 200–400 nM each); annealing temperature and cycling protocol; standard curve efficiency and R² per assay; reference genes used and the validation evidence (citation or your own geNorm/NormFinder results); and quantification method (ΔΔCt, Pfaffl, or standard curve).

For results, include: n biological and n technical replicates separately; how outlier wells were handled; fold changes with confidence intervals or SD; exact p-values (not just stars); and the statistical test used. For multiplex assays, also report channel assignment and the cross-talk check between channels — multiplex artifacts where one channel bleeds into another are easy to miss and silently ruin the analysis. See multiplex qPCR and qPCR primer design for the design-side detail that belongs in this section.

Doing this pipeline by hand in Excel takes longer than the experiment itself, and gives you ample room to make every mistake in the intro. Upload your raw run files to VoilaPCR and it'll handle QC, method selection, reference-gene validation, fold change, and statistics in about ten seconds.