Alex Bligh's blog

Alex Bligh's personal blog

I had an interesting situation today where qemu-img create performed oddly when using one particular NFS filer. The symptoms were:

  • with -f qcow2 it worked as expected, and a 500G image takes approximately 1MB on disk
  • with -f qcow2 -o preallocation=metadata, the image took hundreds of gigabytes

This is not meant to happen. The files are meant to be sparse (i.e. have holes in them). The metadata preallocated is pretty small. On any other NFS filer I’ve tried, and on any local filing system I’ve tried, this works as expected.

I therefore needed to narrow the problem down, and armed with an strace of what qemu-img was actually doing, I built sparsetest which creates a sparse file.

The results are pretty interesting. Here’s a normal ext4 volume.

$ ./sparsetest -b 4K -s100M -w1M test
  Intended logical size:       104857600 bytes;             100 M;           25600 blocks of 4096 bytes
  Optimum physical size:          409600 bytes;               0 M;             100 blocks of 4096 bytes
   Actual physical size:          409600 bytes;               0 M;             100 blocks of 4096 bytes

Used 100 writes of 4096 bytes every 1048576 bytes in ascending order
Created 800 512 byte blocks on disk
Density as % of actual physical size over logical size: 0.390625 %
Efficiency as % of optimum physical size over actual: 100.000000 %

What I’ve asked it to do there is write a 100MB (logical size) file, and every 1MB of data, write 4K of random junk. At the end of writing the file, it uses ftruncate to set the logical extent to exactly 100MB.
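The core of the test can be sketched in a few lines of Python (a hypothetical equivalent of sparsetest's write loop, not the actual GPL source; the function name and defaults are mine). It writes 4K of junk every 1MB, sets the logical extent with ftruncate, and then reads back st_blocks (counted in 512-byte units) to see what the filesystem actually allocated:

```python
import os

def sparse_write_test(path, size=100 * 1024 * 1024,
                      stride=1024 * 1024, blk=4096,
                      truncate_first=False):
    """Write blk bytes of junk every stride bytes, then report the
    logical size and the physical bytes the filesystem allocated."""
    fd = os.open(path, os.O_CREAT | os.O_TRUNC | os.O_WRONLY, 0o644)
    try:
        if truncate_first:
            os.ftruncate(fd, size)    # -i mode: set logical extent up front
        for off in range(0, size, stride):
            os.pwrite(fd, os.urandom(blk), off)
        if not truncate_first:
            os.ftruncate(fd, size)    # default: set logical extent at the end
        os.fsync(fd)
    finally:
        os.close(fd)
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512   # st_blocks is in 512-byte units

logical, physical = sparse_write_test("test")
print("density: %.6f %%" % (100.0 * physical / logical))
```

On a well-behaved filesystem this prints a density well under 1%, matching the output above.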

So the 100 4K sections of random junk are the only content in the file. The file is gratifyingly 409,600 bytes long – exactly as it should be, so 100% efficient at encoding the sparse nature of the file. And it’s 0.39% dense, i.e. 0.39% of the logical space is reflected in physical space.

I have a flag to set the logical extent of the file using ftruncate at the start rather than at the end. Unsurprisingly, this makes no difference here.

$ ./sparsetest -i -b 4K -s100M -w1M test
  Intended logical size:       104857600 bytes;             100 M;           25600 blocks of 4096 bytes
  Optimum physical size:          409600 bytes;               0 M;             100 blocks of 4096 bytes
   Actual physical size:          409600 bytes;               0 M;             100 blocks of 4096 bytes

Used 100 writes of 4096 bytes every 1048576 bytes in ascending order
Created 800 512 byte blocks on disk
Density as % of actual physical size over logical size: 0.390625 %
Efficiency as % of optimum physical size over actual: 100.000000 %

So, let’s see what happens on the filer in question:

$ ./sparsetest -b 4K -s100M -w1M /path/to/test
  Intended logical size:      104857600 bytes;            100 M;          25600 blocks of 4096 bytes
  Optimum physical size:         409600 bytes;              0 M;            100 blocks of 4096 bytes
   Actual physical size:      131252224 bytes;            125 M;          32044 blocks of 4096 bytes

Used 100 writes of 4096 bytes every 1048576 bytes in ascending order
Created 256352 512 byte blocks on disk
Density as % of actual physical size over logical size: 125.171875 %
Efficiency as % of optimum physical size over actual: 0.312071 %

Eek! My 100MB sparse file is no longer sparse. In fact it’s negatively sparse! It uses 125MB on disk (a density of 125%). And the efficiency is tiny (0.3%).

So out of interest, let’s run it calling ftruncate before writing the data to the file, so that the writing itself never expands the file.

$ ./sparsetest -i -b 4K -s100M -w1M /path/to/test
  Intended logical size:      104857600 bytes;            100 M;          25600 blocks of 4096 bytes
  Optimum physical size:          409600 bytes;             0 M;            100 blocks of 4096 bytes
   Actual physical size:          413696 bytes;             0 M;            101 blocks of 4096 bytes

Used 100 writes of 4096 bytes every 1048576 bytes in ascending order
Created 808 512 byte blocks on disk
Density as % of actual physical size over logical size: 0.394531 %
Efficiency as % of optimum physical size over actual: 99.009901 %

Well that’s pretty much normal.

So what’s happening here is that when a sparse file is expanded using ftruncate, the filer is fine. If the sparse file is expanded using pwrite at an offset beyond the end of the file, bad things happen: it would appear that the amount by which the file is extended (or indeed more) is allocated to the file.
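That hypothesis can be checked with an even smaller reproducer: a single write far beyond end-of-file on an otherwise empty file. This is a Python sketch of my own (the function name and the 50MB offset are arbitrary choices, not part of sparsetest):

```python
import os

def grown_by_write(path, offset=50 * 1024 * 1024, blk=4096):
    """Grow an empty file with a single pwrite() beyond end-of-file and
    report how many bytes the filesystem allocated for it."""
    fd = os.open(path, os.O_CREAT | os.O_TRUNC | os.O_WRONLY, 0o644)
    try:
        os.pwrite(fd, os.urandom(blk), offset)  # file grows from 0 to offset+blk
        os.fsync(fd)
    finally:
        os.close(fd)
    return os.stat(path).st_blocks * 512

print(grown_by_write("probe"))
```

On a sane filesystem this reports roughly one block's worth of allocation; on a filesystem with the behaviour described above, it would report something close to the full 50MB.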

I suspect this may not be the filer vendor (who I am not naming unless they want me to), but rather a product of the (Linux based) underlying filesystem that the filer vendor uses (I don’t know what that is yet). I suspect this has something to do with treatment of how the write is journaled.

I’ve never seen this before. But if you want to test your filesystem’s treatment of sparse files, here’s some GPL code that will let you do it.

I was so excited to receive my five new 46cm/18″ Jala Beech spoons from Amazon today. Well, I was so excited to receive the five massive boxes each containing one spoon, that I thought you’d all enjoy a step-by-step account of the unboxing.

Let’s first have a look at how they looked when they arrived. You’ll notice Amazon have thoughtfully ensured that each spoon is separately packaged in its own voluminous cardboard casket, to avoid any inter-spoon crosstalk during the delivery process. Well done Amazon here, as I bought these as an add-on item, so might have expected them to share some of the copious amounts of packaging one of the other ten things in my order took up, but no, Amazon saw fit to do the right thing and ensure each valuable spoon was carefully cosseted in its own cardboard box.

Spoons in boxes
Spoons in boxes

We couldn’t wait to unpack them. Note the well cushioned packaging protecting each valuable spoon (cost: £3.67) from any vibration damage in transit, plus the copious amount of space around each spoon cleverly inserted to allow each spoon to breathe:

Spoon cheekily peeking out of box
Spoon cheekily peeking out of box

And here they are in their full unpacked wooden glory (with associated packaging) on my kitchen table:
Full frontal naked spoons

Full frontal naked spoons

And yes, I’m on Prime, so paid no delivery charges for this. I’ll always order my spoons this way in future.

I don’t normally blog about Flexiant related stuff here, but this is a fun open-source skunkworks project, so I’m making an exception.

We’ve released Krane, an Apache-licensed tool that allows you to use the Docker command line you are familiar with across multiple clouds.

The announcement is below, but you’ll get a better idea from the website or by reading the source code. The website has a really useful tutorial that explains things better than I can here.

Date: Tue, 11 Nov 2014 04:25:40 -0800 (PST)
From: Javi Perez-Griffo <javi@...>
Message-Id: <>

For the last few months I have been working on a side project within my company for the management of Docker instances across multiple clouds. Today I would like to announce the first alpha version of Krane.

Krane is built on the Docker code base, so supports the existing Docker command-line in a multi-cloud environment. This allows Docker users to use their existing workflows to launch apps transparently in multiple clouds, saving time and further reducing the friction of moving workloads between development, QA and production.

Eventually we’d like to upstream Krane into the Docker code base. The patch itself is relatively small (a few hundred lines of code). It’s written 100% in Go, and is released under the Apache 2.0 licence.

In Krane, a ship is a virtual machine (VM) running a Docker instance into which Krane will place Docker containers. Krane (running on the developer’s command line) communicates with Docker running on the ship via its API (you can run Krane on the ship if you like, but it’s not necessary). Using Krane, you can execute your Docker command (such as starting or stopping a Docker container) on any ship. Ships are built as standard with a cadvisor container, which monitors performance of the ship and its containers.

A Krane shipyard is a driver for Krane that is capable of building, destroying and controlling ships. We’ve initially provided two drivers. Firstly, a driver for AWS. Secondly, a driver for Flexiant Concerto, a multi-cloud manager that supports AWS, Digital Ocean, Rackspace, Joyent and of course Flexiant Cloud Orchestrator (we’re giving away free accounts). We would appreciate and encourage contributions of further shipyard drivers.

Krane can currently run all the commands shown in the tutorial. We intend to add more commands to the list over time.

We are aware that there has recently been an announcement around multicloud management in Docker – see:

We’re keen to work with the Docker community to integrate the two ideas or give back our code to them.

You can learn more about Krane here:
The code is on Github here:

Have fun! Javi

I had a connection with BeThere with my BeBox in bridged mode, with a Mikrotik router behind it. My connection got migrated to Sky when Sky bought the Be business. I lost all but one of my static IPs, but more importantly, bridged mode stopped working. I rang Sky and their helpful advice was that they don’t support multiple static IPs, and they don’t support bridged mode. They said I’d find instructions on how to run bridged mode in the forums, but those instructions simply aren’t there (not to any working degree anyway). Given they’d effectively broken my entire DSL connection and had no solutions to propose, I asked them whether I should simply take a MAC code and my business elsewhere. The guy on the other end of the line was clearly used to this from Be customers and had no issue with this. However, first I decided to see whether I could fix it.

The problem here turns out to be that Sky only do PPPoA (not PPPoE), and don’t support what used to be bridged mode in the BeBox. The BeBox is in fact (in my case anyway) a Technicolor TG582n, and it is (just about) possible to get this working in a sort of bridged mode, by using a little known feature of the TG582n to bridge from PPPoA to PPTP.

For the benefit of Sky customer service: I know supporting PPPoE as well as PPPoA might have been a minor bother. However, you could relatively easily have worked out how to do this bit for Be customers. And no doubt you could also easily have worked out a way of putting the Sky box you shipped me into some form of bridged mode too. How many customers were you intending to lose?

Anyway, here’s how you do it.

Step 1:

Connect your TG582n direct to your computer and remove the ADSL connection.

Reflash your TG582n with a sensible firmware image. Some helpful person might supply you with a copy of the software. If you have Windows, there is a software update tool. If you don’t, it will tftp boot. Mine doesn’t on power up, but the ‘software upgrade‘ command works from the administrator CLI. I have a DANT-1 board. Apparently there are different board versions needing different firmware. The main thing is to remove the Be firmware.

Step 2:

Restore the factory settings on your TG582n. Do this by ‘telnet‘ (username ‘Administrator‘), then type:

:system reset factory=yes

This will reset your password to the default for the firmware (possibly to an empty password, possibly to your serial number).

Step 3:

Now put your TG582n into PPTP bridge mode. To do this, follow the instructions here. In short, log back in and type:

:ppp flush
:eth flush
:atm flush
:atm phonebook flush
:atm phonebook add name=vpi0vci38 addr=0*38
:service system modify name=PPTP state=enabled
:pptp flush
:pptp profadd name=vpi0vci38 encaps=vcmux
:ipqos config dest=vpi0vci38
:wireless ifconfig state=disabled

Note: I am told that previous editions of the firmware may have had an issue where on reboot the ADSL interface is not correctly initialized. Hence whilst the config will work, it won’t survive a reboot. The workaround is to put in a fake routed connection to a non-existent VPI/VCI combination. I’m running and it does not suffer from this issue.

Step 4:

Connect the TG582n to the Mikrotik and the ADSL line. Number the raw ethernet interface (I used port 2 but I don’t think it matters) to be on the same subnet as (the Mikrotik).

Step 5:

In order for the TG582n PPTP connection to connect successfully, the session must have a phone number entry of ‘vpi0vci38‘. It is not obvious how to do this with a Mikrotik, but here’s how. Go to the PPP menu, click ‘Profiles’ and add a new profile (I couldn’t get anything but a name of ‘Profile1‘), and under ‘Address List’, enter ‘vpi0vci38‘.

It should look like this:
Mikrotik PPP panel
Mikrotik PPP panel

Step 6:

Create a new PPTP client interface (I called it ‘pptpsky‘). Make it ‘connect to’, and use the above profile (where it says ‘default‘ select ‘profile1‘ instead). Do not select ‘Add default route’ as Sky sends a bogus default route (or more accurately seems to negotiate a bogus endpoint), just to be unhelpful. Use the credentials of your ISP, i.e. username ‘‘ and ‘password‘.

As the PPTP IP connection runs between the Mikrotik and the TG582n only, you can set the MTU and MSS to 1500 bytes quite safely. Though you’ll get fragments over the half metre of ethernet cable, you will be able to move full unfragmented ethernet frames over the WAN.

It should look like this:
Mikrotik Interface panel
Mikrotik Interface panel

Hopefully the connection should now go into a ‘running’ state.

Step 7:

Insert a manual static default route, using route ‘‘ and for gateway enter the interface name of the PPTP interface. In my case that was ‘pptp-sky‘. No, it’s not obvious you can do this, and yes I thought it wanted an IP address too.

It should look like this:
Mikrotik Route panel
Mikrotik Route panel

That was sufficient to get my connection running again. Good luck!

Nominet has a consultation out on changes to its registrar agreement. In short, this one isn’t too bad. The big news is it proposes splitting registrars into three tiers: self-managed, channel partner, and accredited channel partner. I think the first category is too restrictive and would actually save Nominet a lot of bother if it was widened to include (basically) anyone registering domains as a non-commercial activity, given that domains registered for others still need to comply with the channel partner rules anyway. I think the lack of policing for accredited partners when combined with some of the extra powers they are given is not a good move. Other than that, apart from a few nits, it seems OK. My response is here.

I’ve written before (here, here and here) on Nominet’s proposals for registrations in the second level domain. That means you can register rather than Superficially that sounds a great idea, until you realise that if you have already registered you’ll either have to register (if you even get the chance) and pay Nominet and a registrar a fee for doing so, or be content with someone else registering it. This is a horse Nominet continues to flog, no doubt due to its obstinate refusal to die quite yet.

I encourage you to respond to Nominet’s second consultation on the matter (they’ve made that pretty easy to do).

My view is that this is not a good idea. You can find a PDF copy of my response here. If you prefer reading from web pages, I’ve put a version below.

 A.         Executive Summary of Response

This document is a response to the consultation document entitled “Consultation on a new .uk domain name service” published by Nominet in July 2013. It should be read in conjunction with my specific responses to the questions asked within the document. Numbering within section B of this document corresponds to the section numbering within Nominet’s own document.

The proposals to open up .uk for second level registration remain one of the least well thought-out proposals I have ever read from Nominet. Whilst these proposals are less dreadful than their predecessors, they remain deeply flawed and should be abandoned. The proposals pay insufficient attention to the rights and legitimate expectations of existing registrants. They continue to conflate opening up domains at the second level with trust and security. They represent feedback to a one-sided consultation as if it were representative. And, most importantly, they fail to demonstrate that the proposals are in the interest of all stakeholders.

I hereby give permission to Nominet to republish this response in full, and encourage them to do so.

B.         Answers to specific questions

The following section gives answers to specific questions in Nominet’s second consultation paper. Areas of text in italics are Nominet’s.

Q1.             The proposal for second level domain registration

This proposal seeks to strike a better balance between the differing needs of our stakeholders and respond to the concerns and feedback raised to the initial consultation. We have ‘decoupled’ the security features from the proposal to address concerns regarding the potential creation of a ‘two tier’ domain space and compulsion to register in the second level. We have set out a more efficient registration process to enhance trust in the data and put forward an equitable, cost effective release mechanism. 

Q1.a             Do you agree with the proposal to enable second level domain registration in the way we have outlined?

No, I do not agree with the proposal to enable second level domain registration as outlined.

Q1.b             Please tell us your reasons why.

The reasons I do not agree with the proposal to enable second level domain registration as outlined are as follows:

In general, no persuasive case has been made to open up second level domain registrations at all, and the less than persuasive case that has been put fails to adequately weigh the perceived advantages of opening up second level domain registrations against the damage caused to existing registrants. In simple terms, the collateral damage outweighs the benefits.

Whilst it is agreed that domain names are not themselves property, in many ways they behave a little like property. Domain name registrants, whether they are commercial businesses, charities or speculators, invest in brands and other material connected with their domain names. The business models of some of these participants (be they arms companies, dubious charities or ‘domainers’) may or may not be popular with some, but a consultation purporting to deal with registration policy should not be the forum for addressing that. Like it or not, these are all people who have invested in their domain name and their brand around that domain name on the basis of the registration rules currently in place. Using the property analogy, they have  built their house upon land they believed they owned. Nominet here is government, planning authority and land registry rolled into one, and proposes telling the domain name owners that whilst they thought they had bought the land that they have, now others may be permitted to build on top of them – but no matter, Nominet will still ‘continue to support’ their now subterranean houses. And for the princely sum of about twice what they are paying Nominet already, they may buy the space above their existing home. Of course this is only an option, and living in the dark, below whatever neighbour might come along is an alternative. In any other setting, this would be called extortion.

Of course there are undoubtedly good reasons to open up second level domains; were we able to revisit the original decision made when commercial registrations were first allowed in .uk, second level domains would probably not exist. However the option to revisit that decision is not open to us. Therefore, to change those registration rules Nominet needs a very good reason indeed; a reason so strong, and so powerful that it trumps the rights and legitimate expectations of all those existing registrants. No such reason has yet been presented.

Nominet claims in the introduction to its second consultation paper “It was clear from this feedback [on its first consultation] that there was support for registrations at the second level”; it does not say whether this support outweighed the opposition, and the full consultation responses have never been published. In the background paper Nominet says “The feedback we received was mixed”. In the press release after the first consultation, Nominet said “It was clear from the feedback that there was not a consensus of support for the proposals as presented”. Nominet’s initial consultation document only told one side of the story; it presented the advantages of opening registrations at the second level without putting forward any of the disadvantages. It is therefore completely unsurprising that it found favour with some respondents, particularly those unfamiliar with domain names who would not be able to intuit the disadvantages themselves, rather like a politician asking voters whether they would like lower taxes without pointing out the consequences. The second consultation is little better – nowhere does it set out the disadvantages of the proposal as a whole to existing registrants. Given this, it is remarkable how much opposition the proposal has garnered. I have yet to find anyone not in the pay of Nominet that supports this proposal, and it has managed to unite parts of the industry not normally known for their agreement in a single voice against Nominet.

For over 20 years registrations have been made in subdomains of .uk, and since 1996 that process has been managed by Nominet. Nominet claims to be a ‘force for good’ that seeks to enhance trust in the internet. Turning its back on its existing registrants that have single-handedly funded its very existence seems to me the ultimate abrogation of that trust.

The remainder of my comments on this consultation should therefore be read in the context that the best course of action for Nominet would be to admit that in this instance it has made an error, and abandon this proposal in its entirety.

Q2.             Registration process for registering second level domains

We believe that validated address information and a UK address for service would promote a higher degree of consumer confidence as well as ensure that we are in a better position to enforce the terms of our Registrant Contract. We propose that registrant contact details of registrations in the second level would be validated and verified and we would also make this an option available in the third levels that we manage. 

Q2.a            Please tell us whether you agree or disagree with the proposed registration requirements we have outlined, and your reasons why. In particular, we welcome views on whether the requirements represent a fair, simple, practical, approach that would help achieve our objective of enhancing trust in the registration process and the data on record.

Validation of address information and indeed any proportionate steps that increase the accuracy of the .uk registration database are desirable. However, this is ineffective for the desired purpose (increasing consumer confidence), and in any case there is no reason to link it only to registrations at the second level.

Nominet’s logic here is flawed. Would validation of address information and a compulsory UK address for service promote a higher degree of consumer confidence? I believe the answer to this is no, for the following reasons:

Firstly, the fact that a domain name has a UK service address (which presumably would include a PO Box or similar) does not, unfortunately, guarantee that the content of the web site attached is in some way to be trusted. All it guarantees is that the web site has a UK service address. Web sites can contain malicious code whether placed there by the owner or by infection. Web sites with UK service addresses can sell fraudulent goods. Web sites with UK service addresses can turn out not to be registered to the person the viewer thought that they might be (see Nominet’s DRS case list for hundreds upon hundreds of examples). Nominet has presented no evidence that domain name registrations with UK service addresses are any less likely to carry material that should be ‘distrusted’.

Secondly, the registration address for a domain name is not easily available to the non-technical user using a web browser. Nominet appears to be around 15 years out of date in this area. Consumers increasingly do not recognise domain names at all, but rather use search engines. The domain name is becoming increasingly less relevant (despite Nominet’s research) as consumers are educated to ‘look for the green bar’ or ‘padlock’. This is a far better way, with a far easier user interface, to determine whether the web site is registered to whom the user thought it was. It is by no means perfect, but is far more useful than Nominet’s proposal (not least as it has international support). Nominet’s proposal serves only to confuse users.

Thirdly, the concept that UK consumers would be sufficiently sophisticated to know that domain names ending .uk had been address-validated (but not subject to further validation) unless those domains ended with, etc. is laughable. The user would have to know that is not address validated, but that would be address validated, which would require the average internet user memorising the previous table of Nominet SLDs. If Nominet hopes to gain support for address validation, it should do it across the board.

Fourthly, this once again means that existing registrants would be disadvantaged. By presenting (probably falsely) registrations in the second level as more trustworthy, this implies registrations at the third level (i.e. all existing registrations) are somehow less trustworthy, or in some way ‘dodgy’.

Nominet presents two other rationales for this move. Nominet claims it can enforce its contract more easily if the address is validated. This is somewhat hard to understand. Firstly, is Nominet not concerned about enforcing its contracts for other domain names? Secondly Nominet should insist on a valid address for service (Nominet already pretty much does this under clauses 4.1 and 36 of its terms and conditions). If the service address is invalid, Nominet can simply terminate its contract. Thirdly, a UK service address seems a rather onerous responsibility in particular for personal registrations, such as UK citizens who have moved abroad.

Nominet also suggests such a process would ‘enhance trust in the data on record’. This is a fair point, but should apply equally to all domain names. It is also unclear why having a foreign company’s correct head office address (outside the UK) would not be acceptable, whereas a post office box to forward the mail would be acceptable.

Q3.             Release process for the launch of second level domain registration

The release process prioritises existing .uk registrations in the current space by offering a six month window where registrants could exercise a right of first refusal. We believe this approach would be, the most equitable way to release registrations at the second level. Where a domain string is not registered at the third level it would be available for registration on a first-come, first-served basis at the start of the six month period or at the end of this process, if the right of first refusal has not been taken up. 

Q3.a            Please tell us your views on the methodology we have proposed for the potential release of second level domains. We would be particularly interested in your suggestions as to whether this could be done in a fairer, more practical or more cost-effective way.

The release mechanism proposed is less invidious than the previous scheme in that it gives priority to existing registrants. This change is to be welcomed I suppose, though it is not a substitute for the correct course of action (scrapping the idea of opening the second level up at all).

The remaining challenge is how to deal equitably with the situation where two different registrants have registrations in different SLDs. The peaceful coexistence of such registrants was facilitated by the SLD system, and opening up .uk negates that facilitation. The current proposals give priority to the first registrant. This has the virtue of simplicity.

I have heard arguments that this penalises owners, who are likely to have spent more building a brand. In particular, it is argued, this penalises owners of two letter owners, as these were released after two letter domains were released in other domains (handled under Q3.b below). To the first, the counter argument is that to prefer penalises owners (for instance); no doubt the minute there is speculation that Nominet might prefer owners, there will be an active market in registering names in that are only registered in

Q3.b             Are there any categories of domain names already currently registered which should be released differently, e.g. domains registered on the same day, pre-Nominet domains (where the first registration date may not be identified with certainty) and domains released in the 2011 short domains project?

I see no merit in treating pre-Nominet domain names differently provided the domain name holder has accepted Nominet’s terms and conditions.

I see no merit in treating domain name registered on the same day as different, provided Nominet can still ascertain the order of registration.

If Nominet cannot ascertain the order of registration, I would inform each party of this and invite evidence. If after admitting evidence Nominet still could not determine which registration was first, I would either allow an auction or choose at random.

With respect to the short domains project, I would argue Nominet has dug its own grave. Just like all of its registrants before, Nominet did not predict it was to open up .uk. For consistency, it could re-auction two letter domains in .uk. However, a simpler, fairer, and more equitable result would be to not open up .uk at all.

Q3.c            We recognise that some businesses and consumers will want to consider carefully whether to take on any potential additional costs in relation to registering a second level domain. Therefore we are seeking views on: 

  • Whether the registrant of a third level domain who registers the equivalent second level should receive a discount on the second level registration fee;
  • Developing a discount structure for registrants of multiple second-level .uk domains;
  • Offering registrants with a right of first refusal the option to reserve (for a reduced fee) the equivalent second level name for a period of time, during which the name would be registered but not delegated. 

Please tell us your views on these options, or whether there are any other steps we could take to minimise the financial impact on existing registrants who would wish to exercise their right of first refusal and register at the second level.

These proposals risk introducing excess complexity. The most equitable path would be not to open up registrations at the second level at all.

If, despite all objections, the second level is opened up, it is vital that the interests of existing registrants are protected. A simple and fair way of achieving this would be to allow any existing registrant (and only existing registrants) the registration of their .uk second level domain for free for four years (or failing that at a very substantial discount to the existing prices in third level domains). As this would be a single registration to an existing registrant for a single period the marginal costs would be low. This would be sufficient time for the registrant to change stationery, letterhead etc. in the normal course of events. This should be permitted through registrars other than the registrant’s existing registrar to encourage competition. Save for the altered price, this registration would be pari passu with any other.

Q4.            Reserved and protected names

We propose to restrict the registration of <> and <> in the second level to reflect the very limited restrictions currently in force in the second level registries administered by Nominet. In addition, we would propose to reserve for those bodies granted an exemption through the Government’s Digital Transformation programme, the matching domain string of their domain in the second level.

4.a            Please give us your views on whether our proposed approach strikes an appropriate balance between protecting internet users in the UK and the expectations of our stakeholders regarding domain name registration. Can you foresee any unintended complications arising from the policy we have proposed?

This is one of the stranger proposals from Nominet.

In essence, a government programme (internal to the government) has removed the right of certain organisations to register within a domain that is not administered by Nominet. I fail to see why those organisations that turned out to be on the wrong side of a government IT decision should have any special status whatsoever, especially when compared to registrants that have been Nominet’s customers for many years. I notice Nominet’s consultation does not even offer any rationale for this.

One example is which is the site of The Independent (a UK newspaper). However, does not seem to be active. This is perhaps the most obvious example, but there are no doubt others. There is simply no reason why those ejected from should have preferential treatment over domain name holders in .uk. At the very most, they should be given secondary status after existing domain name holders, but I fail to see why they can’t take their chances in the domain name market like any other organisation.

I am afraid this proposal smells like Nominet pandering for support from government for its otherwise unpopular proposal.

Q5.            General views

Q5.a            Are there any other points you would like to raise in relation to the proposal to allow second level domain registration?

  1. Nominet should abandon its current proposals in their entirety. Nominet has failed to explain why the proposals in toto are in the interests of its stakeholders, in particular the registrant community (who after all will have this change inflicted on them). Unless there is a high degree of consensus amongst all stakeholder groups in favour of the proposal, it should be abandoned. I believe no such consensus exists.
  2. Nominet should disaggregate the issue of registrations within .uk and the issue of how to help build trust in .uk in general. I said before that Nominet should run a separate consultation for opening up .uk, as a simple open domain with the same rules as, and Nominet has failed to do this having retained different rules for validation, address verification and price. Both consultations conflate the issue of opening up the second level domain with issues around consumer trust (although admittedly the second consultation does this less than the first). Whilst consumer trust and so forth are important, they are orthogonal to this issue.
  3. Nominet should remember that a core constituency of its stakeholders are those who have registered domain names. If new registrations are introduced (permitting registration in .uk for instance), Nominet should be sensitive to the fact that these registrants will feel compelled to reregister if only to protect their intellectual property. Putting such pressure and expense on businesses to reregister is one thing (and a matter on which subject ICANN received much criticism in the new gTLD debate); pressurising them to reregister and rebrand by marketing their existing registration as somehow inferior is beyond the pale. Whilst the second proposal is less invidious than the first, it is still a slap in the face for existing .uk registrants.
  4. Nominet should recognise that there is no silver bullet (save perhaps one used for shooting oneself in the foot) for the consumer trust problem, and hence it will have to be approached incrementally.
  5. Nominet should be more imaginative and reacquaint itself with developments in technology and the domain market place. Nominet’s attempt to associate a particular aspect of consumer trust with a domain name is akin to attempting to reinvent the wheel, but this time with three sides. Rather, Nominet should be looking at how to work with existing technologies. For instance, if Nominet was really interested in providing enhanced security, it could issue wildcard domain validated SSL certificates for every registration to all registrants; given Nominet already has the technology to comprehensively validate who has a domain name, such certificates could be issued cheaply or for free (and automatically). This might make Nominet instantly the largest certificate issuer in the world. If Nominet wanted to further validate users, it could issue EV certificates. And it could work with emerging technologies such as DANE to free users from the grip of the current overpriced SSL market.
  6. There is no explanation as to why these domains should cost £4.50 per year wholesale rather than £5 for two years as is the case at the moment. If the domain name validation process is abandoned (as it should be) these domains should cost no more to maintain than any other. Perhaps the additional cost is to endow a huge fund for potential legal action? The increased charges add to the perception that the reason for Nominet pursuing opening domains at the second level is simply financial self-interest, rather than acting in the interests of its stakeholders.

Q5.b            Are there any points you would like to raise in relation to this consultation? 

To reiterate the point I have made before, this consultation and its ill-fated predecessor fail to put their points across in an even-handed manner. That is, they expound the advantages of Nominet’s proposal without considering its disadvantages. That is Nominet’s prerogative, but if that is the course Nominet takes then it should not attempt to present the results of such a ‘consultation’ as representative, as their consultees will have heard only one side of the story.

This article explains a little about how the timer system works in QEMU, and in particular the changes associated with the new timer system I’ve contributed to QEMU.

What do timers do?

Timers (more precisely QEMUTimers) provide a means of calling a given routine (a callback) after a time interval has elapsed, passing an opaque pointer to the routine.

The measurement of time can be against one of three clocks:

  • The realtime clock, which runs even when the VM is stopped, with a resolution of 1000 Hz;
  • The virtual clock, which only runs when the VM is running, at a high resolution; and
  • The host clock, which (like the realtime clock) runs even when the VM is stopped, but is sensitive to time changes to the system clock (e.g. NTP).

Timers are single shot, i.e. when they have elapsed, they call their callback and will do nothing further unless they are re-armed. The callback can rearm the timer to provide a repeating timer.
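As a toy model of that single-shot behaviour (a sketch of the concept only; the ToyTimer names are invented here and this is not QEMU's code), a callback that re-arms its own timer becomes periodic:

```c
/* A toy single-shot timer: fires once at its expiry time, then does
 * nothing further unless re-armed. */
typedef struct ToyTimer ToyTimer;
struct ToyTimer {
    long long expiry;            /* absolute expiry time, ns */
    int armed;
    void (*cb)(void *opaque);    /* callback run on expiry */
    void *opaque;                /* opaque pointer handed to the callback */
};

/* Arm (or re-arm) the timer to fire at the given absolute time. */
void toy_timer_mod(ToyTimer *t, long long expiry)
{
    t->expiry = expiry;
    t->armed = 1;
}

/* Run the timer if it has expired.  Disarm first, so that unless the
 * callback re-arms it, the timer fires exactly once. */
void toy_timer_run(ToyTimer *t, long long now)
{
    if (t->armed && now >= t->expiry) {
        t->armed = 0;
        t->cb(t->opaque);
    }
}

/* A periodic user: the callback counts a tick and re-arms 10ns ahead. */
typedef struct { ToyTimer timer; long long now; int ticks; } Periodic;

void periodic_cb(void *opaque)
{
    Periodic *p = opaque;
    p->ticks++;
    toy_timer_mod(&p->timer, p->now + 10);   /* re-arm: repeating timer */
}
```

Arming once and then pumping toy_timer_run with an advancing now yields one callback per period, which is exactly the rearm idiom described above.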

What’s changed in the implementation?

Prior to the timer API change, timers only existed in the main QEMU thread, and were only run from QEMU’s main loop. Expiry of timers was handled by a system-dependent set of secondary timers called alarm timers. These would call QemuNotify, which wrote to a notifier FD and caused any poll() in progress to terminate. Throughout QEMU, poll() (or equivalent) would use infinite timeouts (rather than the timeout associated with any timer), and rely on the alarm timers (under POSIX, running through signals) to terminate these system calls.

This approach caused a number of problems:

  • Timers were only handled within the main loop. QEMU’s AioContext has an internal loop that runs during block operations and may run for a long time. Timers were not processed during this loop, which made it hard to write block device timers that were guaranteed to run while the block layer was busy;
  • Timers were fundamentally single threaded, and the existing system was incompatible with plans for additional threading and AioContexts;
  • The system dependent code that implemented alarm timers was not pretty; and
  • The API to call them was messy.

To fix this I contributed a 31-commit patch set that:

  • Refactored nearly all of the timer code;
  • Switched to rely on timeouts from ppoll() rather than signals and alarm timers (thus allowing alarm timers to be deleted);
  • Separated the clock sources (QEMUClock) from the lists of active timers (QEMUTimerList);
  • Introduced timer list groups (QEMUTimerListGroup) to allow independent threads or other users of timers to have their own set of timers (each attached to any clock); and
  • Tidied up the API.

You can find the patch set here.

How does the new implementation work?

The diagram below shows the relationship between the new objects (click to enlarge).

QEMU Timer Diagram
QEMU Timer Diagram

There is exactly one QEMUClock for each clock type, representing the clock source with that particular type. Currently there are three clock sources, as outlined above. In the previous implementation, each clock would have a list of timers running against it. However, now the list of running timers is kept in a QEMUTimerList. The clock needs to keep track of all the QEMUTimerLists attached to that clock, and for that purpose maintains a list of them (purple line on the diagram).

Each user of timers maintains a QEMUTimerListGroup. This is a struct currently consisting solely of an array of QEMUTimerList pointers, one for each clock type. Hence each QEMUTimerListGroup is really a collection of three (currently) QEMUTimerLists (the red line in the diagram). There are two current QEMUTimerListGroup users. A global static, main_loop_tlg, represents the timer lists run from the main loop (i.e. all current timer users). I also added a QEMUTimerListGroup to each AioContext (there currently being just one), so the block layer can use timers that will run without relying on the aio system returning to the main loop; this also permits future threading. Any other subsystem could have its own QEMUTimerListGroup; it merely needs to call timerlistgroup_run_timers at an appropriate time to run any pending timers.

Each QEMUTimerList maintains a pointer to the clock to which it is connected (orange line). It is (as set out above) on a list of QEMUTimerLists for that QEMUClock (the list element). It contains a pointer to the first active timer. The active timers are not maintained through the normal QEMU linked list object, but are instead a simple singly linked list of timers, arranged in order of expiry, with the timer expiring soonest at the start of the list, linked to by the QEMUTimerList object (blue line).
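That ordering invariant is easy to see in a miniature sketch (invented names, not QEMU's actual structures): insertion walks the singly linked list so that the soonest-expiring timer stays at the head:

```c
#include <stddef.h>

typedef struct ListTimer ListTimer;
struct ListTimer {
    long long expire_time;   /* absolute expiry, ns */
    ListTimer *next;         /* next timer, in ascending expiry order */
};

/* Insert t into *head, keeping the list sorted by expiry time. */
void timer_list_insert(ListTimer **head, ListTimer *t)
{
    ListTimer **pt = head;
    /* Walk forward until we find a timer expiring later than t */
    while (*pt && (*pt)->expire_time <= t->expire_time) {
        pt = &(*pt)->next;
    }
    t->next = *pt;
    *pt = t;
}

/* The head is always the next deadline (or -1 for an empty list). */
long long timer_list_deadline(ListTimer *head)
{
    return head ? head->expire_time : -1;
}
```

Keeping the list sorted makes expiry processing cheap: the run loop only ever has to look at the head to decide whether anything is due.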

Each QEMUTimer object contains a link back to its timer list (green line) for manipulation of the lists.

Under the hood, the call to g_poll within the AioContext’s poll loop has been replaced with ppoll (the nanosecond equivalent of poll). Unfortunately, glib only provides millisecond resolution, meaning that the main loop’s timing will only be at millisecond accuracy whilst this continues to use glib’s poll routines. Plans are afoot to remedy this.
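The switch can be sketched in isolation: rather than blocking in poll() indefinitely and relying on an alarm-timer signal, the caller computes the next timer deadline and hands it to ppoll as a nanosecond timeout. A minimal Linux-specific sketch (the wrapper name is mine):

```c
#define _GNU_SOURCE
#include <poll.h>
#include <time.h>

/* Block in ppoll for at most ns nanoseconds, rather than using an
 * infinite poll() timeout interrupted by a signal. */
int wait_for_deadline_ns(struct pollfd *fds, nfds_t nfds, long long ns)
{
    struct timespec ts;
    ts.tv_sec = ns / 1000000000LL;
    ts.tv_nsec = ns % 1000000000LL;
    return ppoll(fds, nfds, &ts, NULL);  /* NULL sigmask: leave signals alone */
}
```

With no file descriptors ready, the call simply returns 0 once the deadline passes, so timer expiry no longer depends on signal delivery.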

What’s changed in the API?

The API had become somewhat messy.

Firstly, I’ve replaced the previous timer API (with function names like qemu_mod_timer) with a new API of the form timer_<action> where <action> is the action required, for instance timer_new, timer_mod, and so forth. The timer_new function and friends take an enum value being the clock type, rather than a pointer to a clock object. For instance this:

timer = qemu_new_timer_ns (vm_clock, cb, opaque);

becomes this:

timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, cb, opaque);

Timers not using the main loop QEMUTimerListGroup can be created using timer_new_tl. There’s also timer_init, which takes a timer list, for use without malloc. AioContext provides the helper functions aio_timer_new and aio_timer_init.

Secondly, I’ve similarly rationalised the clock API. For instance this:

now = qemu_get_clock_ns (vm_clock);

becomes this:

now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);

Thirdly, I’ve added lots of documentation (see include/qemu/timer.h).

If you have out-of-tree code that needs converting to the new timer API, simply run

scripts/switch-timer-api [filename] ...

to convert them.

Over the weekend I wrote a little ugly perl in order to solve the ‘four fours’ problem.

Essentially the problem is to make as many positive integers as you can, using only four occurrences of the digit ‘4’, and some mathematical symbols. For instance:

17 = ( 4  * 4 )  + ( 4  / 4 )

There are several different variants of the rules to this game, mostly concerning what types of symbol you are allowed. I let you control these with command line switches, and support:

  • Decimals (e.g. .4)
  • Recurring decimals (e.g. .4~)
  • Square roots
  • Power (e.g. 4 ** 4)
  • Factorial (e.g. 4!)
  • Multidigit numbers (or decimals), e.g. 44, 4.4

You can also specify a different number of fours or a different digit.

With the ‘christmas tree’ option turned on (i.e. use everything), it cannot find 17 of the integers between 1 and 200 (113, 141, 147, 157, 163, 166, 167, 171, 173, 177, 183, 185, 187, 191, 193, 197 and 199). My solution to 123 = ( 44 / .4~ ) + 4! is simpler than Wheeler’s sqrt(sqrt(sqrt((sqrt(4)/.4)^4!)))-sqrt(4), though it will find that too if you switch off recurring decimals.
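That identity is easy to verify mechanically, recalling that the recurring decimal .4~ is exactly 4/9 and 4! = 24 (a quick check, not part of the solver):

```c
#include <math.h>

/* 123 = ( 44 / .4~ ) + 4!  where .4~ = 4/9 and 4! = 24 */
double four_fours_123(void)
{
    double recurring = 4.0 / 9.0;   /* .4~ as an exact fraction */
    double fact4 = 24.0;            /* 4! */
    return 44.0 / recurring + fact4;   /* 99 + 24 */
}
```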

You can find the source code here. I do not claim that it is optimised or beautiful. In particular, the treatment of rounding is fairly disgusting. If only I could ‘use Surds;’.

I suppose I have been quite critical of Nominet recently (here, here, here, here, here, here, here, and here, for instance). So I am pleased to say they’re suggesting something today with which I agree.

Today Nominet announced a ‘program for evolving the .uk namespace‘. Much of the release is to do with a new consultation on opening up .uk for registrations; I last blogged about that here. A new consultation is out on July 1, and from the announcement it sounds like far less of a dog’s breakfast than the last consultation. However, I’m going to hold fire on that until the consultation document comes out, and comment on something likely little noticed below it under the heading ‘changes to Nominet’s governance’:

Nominet is proposing changes to Nominet’s Board structure, which will be decided by Members at the company’s forthcoming AGM.  These are:

  • Creating an additional seat for the executive on the Board to bring this in line with a new senior structure. If approved, this will mean giving the executive with responsibility for member and channel engagement and business development – the Chief Commercial Officer – a Board level position.
  • Extending the term for member-elected Non-Executive directors to 3 years from 2014 – bringing this into line with the 3-year terms for appointed Non-Executive directors.
  • Removing the 6-year limit on non-executive director terms in order that the company can, if appropriate, continue to benefit from accumulated experience and continuity.

Full details of proposed changes will be included in the AGM voting packs later this week.

Here’s why I agree with these proposals:

Re the first proposal, Nominet has a board of 10 people. These are made up of 3 executives (including the CEO), 4 member-elected non-execs (in whose election the board takes no meaningful role), and 3 appointed non-execs (appointed in the normal way by a nominations committee, but subject to member approval); the last category includes the chair of the board. Arguably 10 is a lot, but Nominet is a fair size company and board responsibilities are increased by Nominet’s unique position and public service role. This means non-executive work-loads are heavier than in any other company I know (I did 11 years at Nominet as a non-exec, and have served as a non-exec for several other companies including a quoted main market company); I suspect having fewer non-executives would therefore be undesirable, as the work-load would increase further, and if non-execs find a substantial proportion of their working week is dedicated to one company (and most of their income sourced from there), they tend to lose their independence. Having only 30% of your board as executive members is unusual, especially for a private company in the UK. I can’t comment on the individual involved (having never met her), but bringing onto the board an executive with responsibility for ‘member and channel engagement’ helps rebalance this, and should help Nominet rebuild bridges with a membership from which it has recently seemed a little detached on occasion.

Re the second proposal, the current term for member-elected non-executives is 2 years. Having watched a number of non-executives join the board of Nominet, I can tell you it takes at least a year before the non-exec comes up to speed. Every non-exec I have asked agrees with this point. This means that even if the best possible non-exec is chosen by the members, for half their term they are ineffective. Moreover, board members from other constituencies do not suffer from this problem. I’ve been a supporter of three year terms for a long while, though I know this is a view far from universally shared. I’m very pleased Nominet has come around to this point of view, and using people’s time and skills better (whatever their views) seems eminently sensible.

I suspect the third will be the controversial proposal. Currently non-executive directors of both varieties are limited to a 6 year continuous term. Term limits normally exist as there is a perceived risk of people ‘going native’, i.e. losing the independence that they are meant to bring to a board. I’ve always been against term limits for member-elected non-executives – the end of my 11 year term preceded the introduction of term limits, so that’s not why! The reason is simply that the point of member elected non-execs (particularly now we also have appointed non-execs) is that the membership should be able to elect who they want. If Janet is particularly good at holding the board to account, much to the members’ satisfaction, why should the membership not be allowed to have Janet serving for more than 6 years? I quite understand why the members might not want Janet back (perhaps they’d feel she’d become too cosy), but to prohibit it in the articles seems perverse. Moreover, member-elected non-executives are not required to be independent anyway in the normal sense; they often have relationships with customers. So I am glad it is being removed in relation to member-elected non-executives. I am less sure about removing the term limit in respect of appointed non-executives, given that the recommendation for re-election comes from the board itself. However, on balance I think it’s correct to remove the term limit. The membership get the final vote after all, and if both board and members want a non-exec to stay on, why prohibit it? Rather, I think what the board should do is comply with the UK Corporate Governance Code (warning: broken MIME type) published by the FRC (which is meant for listed companies). I think these are a useful set of criteria to adapt, and for the nominations committee to consider in relation to appointed non-executives. 
In particular, I think appointed non-execs should qualify as ‘independent’, and thus the nominations committee, if recommending reappointment of an appointed non-exec that has already served six years, should explicitly justify why they consider that non-exec is still independent. I think this ‘comply or explain’ model is sufficient, given members have the final say.

So I will be casting my vote (for what it’s worth) in favour of these changes at the AGM.

Edit: it’s been pointed out to me that whilst B.1.1.1 talks about nine years, B.2.3 talks about six years, and says ‘Any term beyond six years for a non- executive director should be subject to particularly rigorous review’, so I’ve added this below. I think this actually strengthens my point, as the code considers the point, but says it is for the board (or its nominations committee) to make the call, and not the articles. I hope Nominet confirms it will be doing this ‘rigorous review’ in respect of appointed non-execs.

Extract from the UK Corporate Governance Code:

B.1.1.1 The board should identify in the annual report each non-executive director it considers to be independent. The board should determine whether the director is independent in character and judgement and whether there are relationships or circumstances which are likely to affect, or could appear to affect, the director’s judgement. The board should state its reasons if it determines that a director is independent notwithstanding the existence of relationships or circumstances which may appear relevant to its determination, including if the director:

  • has been an employee of the company or group within the last five years;
  • has, or has had within the last three years, a material business relationship with the company either directly, or as a partner, shareholder, director or senior employee of a body that has such a relationship with the company;
  • has received or receives additional remuneration from the company apart from a director’s fee, participates in the company’s share option or a performance-related pay scheme, or is a member of the company’s pension scheme;
  • has close family ties with any of the company’s advisers, directors or senior employees;
  • holds cross-directorships or has significant links with other directors through involvement in other companies or bodies;
  • represents a significant shareholder; or
  • has served on the board for more than nine years from the date of their first election.

B.2.3 Non-executive directors should be appointed for specified terms subject to re-election and to statutory provisions relating to the removal of a director. Any term beyond six years for a non- executive director should be subject to particularly rigorous review, and should take into account the need for progressive refreshing of the board.

This post is the continuation of my answer to the elves, cakes and gremlins puzzle.

One remaining question is the validity of the assumption that it’s quick and easy to find a value of S that is coprime to the value of L.

I blithely asserted this was easy, as the probability that two numbers are coprime is \(6/\pi^2\). Whilst this is true (here’s a proof and some other interesting information on coprime – or relatively prime – numbers), it’s only true if you pick both numbers randomly. In fact, for certain values of L the number of coprime numbers smaller than it is far smaller.

Euler’s totient function \(\phi(n)\) is the number of positive integers smaller than n that are coprime to n. So, with L lockers there are only \(\phi(L)\) distinct locker search permutations we can pick. With a maximum of E elves searching at once, the probability of two using the same locker search sequence grows with \(E / \phi(L)\), and whilst we know \(E \ll L\), we do not know \(E \ll \phi(L)\). Further, we hunt for coprimes by taking a random guess S, and incrementing modulo L-1 until we have a coprime; if L is huge, how do we know we won’t be looping for a very long time?

The formula for Euler’s totient function can be expressed as follows:

\(\phi(n) = n \prod_{p|n} \left(1 - \frac 1 p\right)\) where \(p\) ranges over the primes dividing \(n\)

Is there a lower bound for \(\frac {\phi(n)} n\)? Clearly this is at its lowest where n has the largest number of prime factors, i.e. primorials. It turns out there is no constant lower bound on this ratio, but a lower bound can be given by (for \(n \ge 3\)):

\(\phi(n)\geq \frac{n}{e^{\gamma}\log \log n}+O\left(\frac{n}{(\log \log n)^2}\right)\)

where \(\gamma\) is the Euler-Mascheroni constant, i.e. about 0.57721.

Now, \(\log \log(n)\) is pretty small. With 30 lockers (a primorial and thus meeting the lower bound), \(\phi({30}) = 8\), and \(\frac {\phi(n)} n \approx 0.27\).  Where we have around \(2^{256}\) lockers, the lower bound on \(\frac {\phi(n)} n\) is still about 0.1, i.e. at least one tenth of the numbers less than n (where n is around \(2^{256}\)) are coprime to n. This is good news as it suggests there are always going to be a good number of step values to pick, even when the number of lockers approaches the number of atoms in the universe (\({10}^{80}\)). In other words we have a similar sort of probability to that of SHA-256 hash collisions, and as the net result of such a collision would be slightly less efficient searching, we can relax.
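Those totient figures are cheap to check by brute force; a small sketch counting the integers below n that are coprime to n:

```c
/* Greatest common divisor, by Euclid's algorithm. */
unsigned long long gcd_ull(unsigned long long a, unsigned long long b)
{
    while (b) {
        unsigned long long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Euler's totient by brute force: count 1 <= k < n with gcd(k, n) == 1.
 * Fine for small n; real code would factorise n instead. */
unsigned long long totient(unsigned long long n)
{
    unsigned long long count = 0, k;
    for (k = 1; k < n; k++) {
        if (gcd_ull(k, n) == 1) {
            count++;
        }
    }
    return count;
}
```

For the primorials this gives \(\phi(30) = 8\) and \(\phi(210) = 48\), matching the product formula above.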

However, we still don’t know there aren’t vast tracts of non-coprime values of \(S\) which are adjacent. If that were the case, finding a coprime value of \(S\) might (occasionally) itself take a long time, and because we hunt through the space in order, we might still end up with lots of elves using the same value S. For instance, if the coprimes of \(L\) were all located in the bottom 5% and top 5% of the numbers between \(0\) and \(L\), then 90% of the time we’d have an expensive hunt through a range barren of coprimes, and end up with the first coprime in the top 5%. Intuitively this seems unlikely as the coprimes will be evenly distributed, especially given the pretty picture here. But can we prove it? I’m still thinking about this one. The maximum length of a sequence of integers smaller than \(N\) and non-coprime to \(N\) is given by Jacobsthal’s function (or more accurately the Jacobsthal function less one), and seems to be no more than about \(2 \log (N)\), with the average sequence length in practice much smaller – far smaller than \(\log \log(N)\). Suggestions on how to prove this appreciated1.

An alternative route (which I think is slower, but which given a true random number generator would be fine) would be to simply pick \(S \in \mathbb{Z^+}\) in the range \(0 \lt S \lt L-1\), and repeat until \(\gcd (S, L) = 1\). As there are \(\phi(L)\) coprimes to \(L\), this will succeed with probability

\(p = \frac {\phi(L)} {L-1}\)

which means one can expect success in:

\(E(n) = \frac 1 {p} = \frac {L-1} {\phi(L)}\)

iterations. Where \(L\) is large, \(L \approx L-1\), hence we can use the simpler estimate:

\(p = \frac {\phi(L)} {L}\)

which leads us to

\(E(n) \leq {e^{\gamma}\log \log L}\)

We know that for any number of lockers up to \(2^{256}\), we would expect fewer than 11 iterations (as per the above calculation).

So, for those worried about my unproven number theory, an alternative algorithm is set out below, which bails out if it can’t find a coprime step after 100 attempts (which should happen less than 0.002% of the time even with \(2^{256}\) lockers, though it makes me feel safer in case of random number generator problems) and falls back to using a step of 1. In pseudo-C again:

int placeCake(int L)
{
    const int maxsteps = 100;
    int s = 1; /* guaranteed to be coprime */
    int n;
    int c;

    /* Repeat until we get a step coprime with L.
     * Bail out after maxsteps attempts and just use 1.
     * Probability analysis says this should almost
     * never happen, and the result is merely marginally
     * less efficient placement. */
    for (c = 0; c < maxsteps; c++) {
        /* Pick a random step between 1 and L-1 inclusive */
        n = 1 + rand() % (L - 1);
        if (gcd(n, L) == 1) {
            s = n; /* use this now we know it's coprime */
            break;
        }
    }

    /* now we know the step s is coprime to L */

    /* pick a random first locker */
    n = rand() % L;

    /* L iterations cover all lockers as gcd(L,s)=1 */
    for (c = 0; c < L; c++) {
        if (!placeCakeIfEmpty(n))  /* returns 0 on success */
            return n;
        n = (n + s) % L;
    }

    return -1;
}


Note 1: A late edit. It appears that if \(j(N)\) is the Jacobsthal function, then the upper bound:

\(j(N) \ll \log \log (N)\)

was proved by: H. Iwaniec (On the problem of Jacobsthal, Demonstratio Math. 11, 225–231, (1978) – seemingly unavailable online). So it would appear my first solution is not time bounded any worse than the second. As \(e^{\gamma}\gt 1\), that means the expected value using the second routine is greater than the maximum value using the first. This would imply the first routine (given in the previous post) is provably always faster.