The Azimuth Project
Experiments in fitting bivariate power laws (Rev #1)

Experiments in fitting bivariate power laws


This page looks at fitting a power law to an urban statistics dataset. The freely available dataset found is the relationship between city size and the number of list of cities with the most high-rise buildings. A high-rise is defined as a structure at least 35 meters or 12 stories tall. It’s unclear that this should be determined by a power law, but given tall buildings are generally erected to accomodate high-level economic activity, which is the kind of thing that is argued to depend on population size via a power law. The data comes from wikipedia and no attempt has been made to verify the data or apply any “corrections” such as removing outliers. The data is also right censored at a minmum of 100 high-rise buidlings.

Since we’re considering a situation where the dominant deviation from a power law is likely to, be due to e.g., a city being underdeveloped or overdeveloped due to unique factors, rather than measurement error, we’ll refer to differences from the basic power law as “deviations” rather than “errors”.

The aim of this page is not to find “the correct” power law fit for this data, but to look at the question of whether convincing power law fitting is possible in the absence of at least an idea of what kind of deviations are believed to be occurring.

Visualising the dataset

power law test original data

Original data.

power law test double log

Double logarithm data.

Arguably both these plots could suggest there are some outliers which may for some reason be generated by an anomalous process. Many fitting methods, including maximum likelihood can be heavily skewed by such outliers and they should be removed for best results. However, removal can only be done in a principled way if there are pre-existing theory justifying their specialness, and we’re considering the case of trying to fit a power law in advance of building a theory.

Fitting using likelihood given a devation models

We look at two very simple deviation models: additive normal deviation in the original data and additive normal deviation in the original data (which is log-normal deviation in the double logarithm data). Here vv is the city population and bb is the number of high-rise buildings.

modelhenceelt log likelihoodminimumC
b~Cv β+N(0,σ)b ~ C v^\beta+N(0,\sigma)bCv β~N(0,σ)b-C v^\beta ~ N(0,\sigma) i(b iCv i β) 2/σ 2\sum_i (b_i-C v_i^\beta)^2/\sigma^2[0.6925,0.695][0.6925,0.695]0.031411
b~Cv β(1+N(0,σ))b ~ C v^\beta(1+N(0,\sigma))b/(Cv β)1~N(0,σ)b/(C v^\beta)-1 ~ N(0,\sigma) i(b i/(Cv i β)1) 2/σ 2\sum_i (b_i/(C v_i^\beta)-1)^2/\sigma^2[0.255,0.2575][0.255,0.2575]43.423983

The model technically has three parameters (C,β,σ)(C,\beta,\sigma). However, providing σ\sigma is essentially independent of the other parameters it only scales the likelihood and so does not affect the location of the minimum. Additionally, using the log likelihood the value of CC at the minimum has a simple form in terms of the data and β\beta. So fitting collapses to just a 1-D optimisation.

Plots of the negated log likelihoods of the two models (over the same range of β\beta values). Note that because the deviation term has a different meaning in the different models, the fact the two functions have very different ranges of values doesn’t mean anything.

power law test model 1 NLL

Model 1 negative log likelihood plot.

power law test model 2 NLL

Model 2 negative log likelihood plot.

The β\beta range within which the minimum falls, and approximately the corresponding CC value, are recorded in the table. As can be seen there is a dramatic difference between the two values.

power law test original data

It’s highly debatable if either of these fitted power laws are a good fit to the data, but it is clear that very different results are obtained depending on the deviation model chosen. This raises doubts whether it’s possible to do find power laws in marginal data purely by fitting procedures.

Source scripts will be available here.

category: methodology