The Azimuth Project

Power law


A dependent variable is said to follow a power law if it has the form $C v^\beta$. Unfortunately, the term is sometimes used as an abbreviation of “power law distribution”, i.e., a probability distribution which follows a power law. The term “power law” is also sometimes used when what is really meant is just a heavy tailed distribution.


The basic form of a power law relating some variable $v$ to an observed quantity $A(v)$ is

$$A(v) = C v^\beta$$

where, in practice, when modelling real data the equality becomes an approximation. One particularly common variety of power law is where $A$ denotes frequency of occurrence. This is one model which has the property that “bigger things are rarer” whilst retaining significant mass in the tail. Another property is scale invariance: for example, the ratio of the $A$ values for a pair of $v$ values depends only on the ratio of the $v$ values and not on their exact values.
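The scale invariance property follows directly from the basic form. A short derivation, using only the symbols defined above plus an arbitrary rescaling factor $\lambda > 0$ and a pair of values $v_1, v_2$ introduced here purely for illustration:

```latex
\begin{align*}
\frac{A(v_1)}{A(v_2)} &= \frac{C v_1^\beta}{C v_2^\beta}
                       = \left(\frac{v_1}{v_2}\right)^{\beta}, \\
A(\lambda v) &= C (\lambda v)^\beta = \lambda^\beta A(v), \\
\log A(v) &= \log C + \beta \log v \qquad (C > 0,\ v > 0).
\end{align*}
```

The last line is why a power law appears as a straight line of slope $\beta$ in a double logarithmic plot.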

Examples of kinds of curves corresponding to power laws are shown below (with a linear function for comparison).

[Figure: plot of typical power law curves]

A power law is in contrast to other models such as a Gaussian distribution (which has a large proportion of the population close to the mean) or a uniform distribution (where any value in the distribution is equally “typical”).
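To make the contrast with a Gaussian concrete, the following minimal sketch compares tail probabilities $P(X > x)$ for the two models. It assumes the density convention $p(x) \propto x^{-\alpha}$ for $x \ge x_{\min}$ (so $\beta = -\alpha$ in the notation above); the function names and all parameter values are illustrative choices, not taken from the text.

```python
import math

def gaussian_tail(x, mean=0.0, std=1.0):
    """P(X > x) for a Gaussian, via the complementary error function."""
    return 0.5 * math.erfc((x - mean) / (std * math.sqrt(2.0)))

def power_law_tail(x, alpha=2.5, x_min=1.0):
    """P(X > x) for a continuous power law p(x) ~ x^(-alpha), x >= x_min."""
    if x < x_min:
        return 1.0
    return (x / x_min) ** (-(alpha - 1.0))

# Illustrative parameters only (assumptions, not from the text).
for x in (2.0, 5.0, 10.0, 50.0):
    print(f"x={x:5.1f}  gaussian={gaussian_tail(x):.3e}  "
          f"power_law={power_law_tail(x):.3e}")
```

The Gaussian tail falls off faster than exponentially, whereas the power law tail shrinks only polynomially, which is the sense in which it retains significant mass far from typical values.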

Generating processes

One reason it is interesting to ask whether a function is a power law is that various sophisticated processes from physics and mathematics are known to generate power laws. Thus finding a power law in a new field may motivate work on the development of new causal models using elements from these known models. However, there are two important caveats:

  1. Many quite different processes generate power laws.

  2. From a prioritization perspective, it is important to be sure that the empirical fit obtained is strong statistical evidence that the relationship really is a power law (see the section Finding/validating power laws).

Arguably, it is more promising to start with a generating process that can be validated directly and then determine if this process gives rise to a power law.

Similarly, if the only “use” of the fitted power law is to argue that it is important to know that the distribution is heavy tailed, this can be inferred directly without the fitted function as an intermediary. Likewise, if the function is to be used for black-box prediction, it is debatable whether non-parametric density estimation would not be a better approach.

Finding/validating power laws

Many quantities arising from complex processes, such as ecological ones, can appear to empirically fit power laws reasonably well. However, it is important to note that, particularly when dealing with discrete phenomena with an upper-bound cut-off on the samples obtained, many other analytic characterisations generate sample data which produce the same kind of curve. Thus rigorously validating a “proposed” power law is difficult.

In particular, it appears that some of the fitting procedures used in the literature are very problematic. After converting the data into the double logarithmic $\log A$-$\log v$ representation, these are:

  1. Data binning to reduce the data volume: Although slightly better than binning before log-transformation, considering data set sizes and modern computing power there is no need to do this, and it adds significantly to the error in the results. In addition, doing this “artificially inflates” the $R^2$ measure of the fit.

  2. Using some variety of least squares to fit a line through the remaining points: This is very dubious since almost none of the modelling assumptions behind least squares fitting apply in this set-up. Even when the data is generated by a power law, poor parameter estimates result.

  3. Using $R^2$ as a goodness of fit criterion for the power law: Other measures of the goodness of fit of the fitted power law against the original data are much more discriminatory than the $R^2$ value of the least squares line fit.
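For concreteness, the problematic procedure listed above (bin, take logs, fit a least squares line) can be sketched as follows on synthetic data. This is only an illustration under stated assumptions: the data are drawn from a continuous power law density $p(x) \propto x^{-\alpha}$ with $\alpha = 2.5$ and $x_{\min} = 1$, the bins have equal width 1, and the helper names `sample_power_law` and `binned_loglog_slope` are hypothetical. How far the fitted slope lands from the true exponent depends on these choices.

```python
import math
import random

def sample_power_law(n, alpha, x_min, rng):
    """Inverse-transform samples from p(x) ~ x^(-alpha), x >= x_min."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(n)]

def binned_loglog_slope(xs, bin_width=1.0):
    """Histogram with equal-width bins, then ordinary least squares on
    (log bin centre, log count), skipping empty bins."""
    lo = min(xs)
    counts = {}
    for x in xs:
        k = int((x - lo) / bin_width)
        counts[k] = counts.get(k, 0) + 1
    pts = [(math.log(lo + (k + 0.5) * bin_width), math.log(c))
           for k, c in counts.items()]
    n = len(pts)
    mx = sum(p[0] for p in pts) / n
    my = sum(p[1] for p in pts) / n
    sxx = sum((p[0] - mx) ** 2 for p in pts)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pts)
    return sxy / sxx

rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
data = sample_power_law(10_000, alpha=2.5, x_min=1.0, rng=rng)
slope = binned_loglog_slope(data)
print("true log-log slope: -2.5, least squares slope:", round(slope, 3))
```

Comparing the printed slope with the known value of -2.5 for different seeds, sample sizes and bin widths gives a feel for how sensitive this procedure is to choices that have nothing to do with the underlying process.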

In contrast to this approach, better statistical fitting techniques exist, based upon applying a maximum likelihood estimator to the full dataset. These are perfectly computationally feasible and have been known for 60 years. There are also techniques for more convincing comparison against other simple candidate distributions (e.g., log-normal) to determine which is the better fit.
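As a minimal sketch of the maximum likelihood approach, assume a continuous power law density $p(x) \propto x^{-\alpha}$ for $x \ge x_{\min}$ with a known lower cut-off $x_{\min}$ (so $\beta = -\alpha$ in the notation above). The estimator is then $\hat\alpha = 1 + n / \sum_i \ln(x_i / x_{\min})$, with approximate standard error $(\hat\alpha - 1)/\sqrt{n}$. The function names, the synthetic data and the parameter values below are illustrative assumptions; choosing $x_{\min}$ from the data is a separate problem not covered here.

```python
import math
import random

def sample_power_law(n, alpha, x_min, rng):
    """Inverse-transform samples from p(x) ~ x^(-alpha), x >= x_min."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(n)]

def mle_alpha(xs, x_min):
    """Maximum likelihood estimate of alpha for a continuous power law
    on x >= x_min, plus its approximate standard error."""
    tail = [x for x in xs if x >= x_min]
    n = len(tail)
    log_sum = sum(math.log(x / x_min) for x in tail)
    alpha_hat = 1.0 + n / log_sum
    std_err = (alpha_hat - 1.0) / math.sqrt(n)
    return alpha_hat, std_err

rng = random.Random(1)  # fixed seed, chosen arbitrarily for repeatability
data = sample_power_law(10_000, alpha=2.5, x_min=1.0, rng=rng)
alpha_hat, std_err = mle_alpha(data, x_min=1.0)
print(f"alpha_hat = {alpha_hat:.3f} +/- {std_err:.3f} (true value 2.5)")
```

Note that this uses every data point directly: no binning, no histogram and no line fit are involved.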

The referenced work of Clauset, Shalizi and Newman describes both these problems and their solutions. It also analyses 24 instances of claimed power laws in the scientific literature and finds truly compelling evidence for a power law in only one case.
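The idea of comparing a power law against another candidate distribution can be illustrated with a crude log-likelihood comparison. The sketch below is not the procedure of the referenced paper: it simply fits a power law (with $x_{\min}$ set to the sample minimum) and an untruncated log-normal by maximum likelihood and reports the difference in log-likelihood. A fuller treatment would also need to choose the cut-off carefully and assess whether the difference is statistically significant. All names and parameter values are illustrative assumptions.

```python
import math
import random

def power_law_loglik(xs):
    """Log-likelihood of the MLE power law fit, with x_min = min(xs)."""
    x_min = min(xs)
    n = len(xs)
    alpha = 1.0 + n / sum(math.log(x / x_min) for x in xs)
    return sum(math.log((alpha - 1.0) / x_min) - alpha * math.log(x / x_min)
               for x in xs)

def lognormal_loglik(xs):
    """Log-likelihood of the MLE (untruncated) log-normal fit."""
    logs = [math.log(x) for x in xs]
    n = len(logs)
    mu = sum(logs) / n
    var = sum((l - mu) ** 2 for l in logs) / n
    return sum(-l - 0.5 * math.log(2.0 * math.pi * var)
               - (l - mu) ** 2 / (2.0 * var) for l in logs)

def loglik_difference(xs):
    """Positive favours the power law, negative favours the log-normal."""
    return power_law_loglik(xs) - lognormal_loglik(xs)

rng = random.Random(2)  # fixed seed, chosen arbitrarily for repeatability
pl_data = [(1.0 - rng.random()) ** (-1.0 / 1.5) for _ in range(5_000)]  # alpha = 2.5
ln_data = [rng.lognormvariate(0.0, 1.0) for _ in range(5_000)]

diff_pl = loglik_difference(pl_data)
diff_ln = loglik_difference(ln_data)
print("power law data: ", round(diff_pl, 1))
print("log-normal data:", round(diff_ln, 1))
```

On these synthetic samples the sign of the difference should match the distribution that actually generated the data; on real data the two fits are often much harder to tell apart.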


Some typical examples of properties which appear to be heavy tailed and have been claimed to be power laws are:

| Variable $v$ | Quantity $A(v)$ |
|---|---|
| city population | frequency |
| monetary wealth | frequency |
| stellar mass | frequency |
| animal mass | metabolic rate |
| flying animal body mass | optimal cruising speed |


  • Power law, Wikipedia.

  • Clauset, A, Shalizi, C R and Newman, M E J. Power-law distributions in empirical data. SIAM Review 51: 661–703, 2009. doi:10.1137/070710111.

    • Arxiv version.

    • Code repository.

    • Abstract: Power-law distributions occur in many situations of scientific interest and have significant consequences for our understanding of natural and man-made phenomena. Unfortunately, the empirical detection and characterization of power laws is made difficult by the large fluctuations that occur in the tail of the distribution. In particular, standard methods such as least-squares fitting are known to produce systematically biased estimates of parameters for power-law distributions and should not be used in most circumstances. Here we describe statistical techniques for making accurate parameter estimates for power-law data, based on maximum likelihood methods and the Kolmogorov-Smirnov statistic. We also show how to tell whether the data follow a power-law distribution at all, defining quantitative measures that indicate when the power law is a reasonable fit to the data and when it is not. We demonstrate these methods by applying them to twenty-four real-world data sets from a range of different disciplines. Each of the data sets has been conjectured previously to follow a power-law distribution. In some cases we find these conjectures to be consistent with the data while in others the power law is ruled out.

  • Weblog entry giving highlights of the Clauset et al paper.

    A slightly barbed text which nevertheless outlines the statistical arguments as to why finding strong evidence for a power law is rare.