# Owen Shepherd

Physicist. Software Developer. Many things. (CV)

I develop the Public Domain C Library and Impeller (a client for pump.io) among other things, as well as contributing to projects such as Ogre3d.org.

Replied to a post on werd.io:

@benwerd SIL OFL is pretty standard for fonts. Very liberal, do what you want, just mandatory renaming if you make a derivative. Only slightly tricky case is if you're building your own WOFF or such, in which case your WOFF needs to bear a different name (derivatives clause)

Replied to a post on werd.io:

@benwerd I did that too last time my MacBook needed service. I was somewhat disappointed at the lack of looks of disapproval...

Replied to a post on werd.io:

@benwerd Nowhere, I expect. They discontinued production and discounted them a few months before N5 release, so I imagine all you'll be able to find are second hand ones.

Replied to a post on werd.io:

DPD in the UK do that. One hour delivery window, live tracking of the van your package is on, etc.

## Hashcash, Pingbacks, Webmention, Proof of Work and the #indieweb

Spam, spam, ubiquitous spam. Who likes it? Nobody. Everyone is aware of the issues of E-Mail spam; anybody who has a blog is aware of the problem of pingback spam.

Pingback is an old technology, based upon XML-RPC (which should date it somewhat!). It was designed in simpler times; saying that the percentage of pingbacks which are spam today is 100% is accurate to a large number of significant figures.

About a year ago, the indieweb community looked at Pingback and asked the question "Why is this based on XML-RPC?" They had a point; Pingback is a single function with two URL parameters, which hardly needs a complex RPC system, and so they developed WebMention, which takes Pingback and distills it to a simple HTTP POST of standard URL-encoded form data.

While WebMention solves the problem of complexity, it doesn't solve the problem of spam (in a long-term manner, at least; at present, the small deployment means it isn't a spam target). However, today I provide a method for combating spam in WebMention: hashcash.

## Hashcash?

Hashcash is a mechanism originally proposed for combating spam in E-Mail. The idea is relatively simple: the sender must provide a value which, when combined with certain values from the E-Mail, evaluates to a hash where the last $$n$$ bits are zero. It's called a "proof of work" system because that's what it provides: proof that the sender has done work. Specifically, each attempt independently succeeds with probability $$2^{-n}$$, so on average it takes $$2^n$$ attempts to find a hash in which the last $$n$$ bits are zero.

This is useful: it lets us tune the work over time. It's asymmetric: it puts all the effort on the sender, rate limiting them by their computational power, because verifying that the hash ends with the correct number of zero bits takes only one hash calculation. Every time we increase $$n$$ by one, we double the computational power that the spammer needs to send out the same quantity of spam.

Hashcash isn't the perfect protocol, of course; it doesn't (and nothing will) completely eradicate spam. It does, however, provide a mechanism to reduce it, placing less load on more intensive spam filtering systems (such as the traditional Bayesian filters).

## Hashcash for WebMention

The proposed mechanism of implementing Hashcash for WebMention is as follows. For the sender side:

1. The sender builds the WebMention request body, in URL encoded format.
2. The sender computes the SHA-256 hash of the request body
3. If the hash contains the required number of trailing zero bits, then the mention is ready. Otherwise, a "nonce" value is added to (or updated in) the request and the hash is recomputed.
4. The client then submits the request to the destination's WebMention endpoint
5. If the hash meets the recipient's criteria (or the recipient implements WebMention 0.1/0.2, and therefore does not perform hashcash processing), then the recipient should return a success result code. Otherwise, the recipient should return an HTTP 403 error, with properties error=bad_hashcash&required_length=N, where N is the number of trailing zero bits required
6. If the recipient requires more zero bits, then the sender may choose to retry with more zero bits (but should verify that the number is reasonable - a required length of 32 is unlikely to complete in reasonable time and could constitute a DoS attempt)
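The sender's steps 1-3 above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation; it assumes the nonce travels as an ordinary form field named `nonce` (the steps above name the value but not the field).

```python
import hashlib
from urllib.parse import urlencode

def trailing_zero_bits(digest: bytes) -> int:
    """Count the trailing zero bits of a hash digest."""
    bits = 0
    for byte in reversed(digest):
        if byte == 0:
            bits += 8
            continue
        while (byte & 1) == 0:
            bits += 1
            byte >>= 1
        break
    return bits

def build_mention(source: str, target: str, n: int) -> str:
    """Steps 1-3: vary a nonce until the body's SHA-256 has n trailing zero bits."""
    nonce = 0
    while True:
        body = urlencode({"source": source, "target": target, "nonce": nonce})
        if trailing_zero_bits(hashlib.sha256(body.encode()).digest()) >= n:
            return body  # step 4: POST this to the target's WebMention endpoint
        nonce += 1
```

With n = 8 the loop typically finishes within a few hundred attempts; the production-scale lengths discussed under "Considerations" take a few million.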

For the receiver side, the process is much simpler:

1. Take the SHA-256 hash of the request body
2. If the hash does not meet the required number of trailing zero bits, return an HTTP 403 error with properties error=bad_hashcash&required_length=N. The endpoint must, at a minimum, support returning errors in application/x-www-form-urlencoded format.
3. If the hash does meet the required number of trailing zero bits, continue with the rest of the WebMention validation.
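The receiver's side really is one hash computation. A sketch, with a tuple standing in for the HTTP response (the dict would be serialized as the urlencoded error body):

```python
import hashlib

def trailing_zero_bits(digest: bytes) -> int:
    """Count the trailing zero bits of a hash digest."""
    bits = 0
    for byte in reversed(digest):
        if byte == 0:
            bits += 8
            continue
        while (byte & 1) == 0:
            bits += 1
            byte >>= 1
        break
    return bits

def check_hashcash(body: bytes, required_bits: int):
    """Steps 1-2: one SHA-256 over the raw request body decides 403 or continue."""
    if trailing_zero_bits(hashlib.sha256(body).digest()) < required_bits:
        # serialized as error=bad_hashcash&required_length=N
        return 403, {"error": "bad_hashcash", "required_length": required_bits}
    return None, None  # step 3: proceed with normal WebMention validation
```

Note that the check runs over the raw request body bytes, before any form parsing, so sender and receiver hash exactly the same octets.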

## Considerations

The hash length should be chosen to be reasonable; a good starting point might be 22 bits. The maximum permitted length (for a sender) should also be chosen to be reasonable; 28 bits seems like a reasonable limit for the time being. 20 bits costs about a million hashes ($$2^{20}$$) on average, and recent CPUs are capable of around 20 MHash/s. The present cost of a hash computation can be approximated by looking at the statistics for Bitcoin, which also uses SHA-256 and is therefore comparable.
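The arithmetic behind those figures, taking the expected number of attempts for n trailing zero bits to be roughly 2^n and assuming a ~20 MHash/s CPU:

```python
def expected_seconds(bits: int, hashes_per_second: float = 20e6) -> float:
    """Expected sender CPU time: ~2**bits hashes on average at the given rate."""
    return (2 ** bits) / hashes_per_second

# 20 bits: ~0.05 s, 22 bits: ~0.2 s, 28 bits: ~13 s per mention
for n in (20, 22, 28):
    print(n, round(expected_seconds(n), 2))
```

Seconds per mention is negligible for a human posting replies, but adds up quickly for a spammer sending millions, which is exactly the asymmetry hashcash aims for.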

Of course, implementers may wish to be more lenient as a hashcash-type system is phased in.

@benwerd I noticed that on your idno install you've set the homepage to be your profile page. How did you do that?

Replied to a post on werd.io:

@benwerd It's certainly a very useful desire from the perspective of propagating religion. Which way the association originated is an interesting question.

Replied to a post on twitter.com:

@stevestreeting The lack of hubs and KVMs (and where they exist, the expense) are painful too.

## We have met the enemy – and he is us

The conventional assumption in computer security was that our primary adversaries were criminals, miscreants, and the security services of our "political foes". Attacks were liable to be active and to involve the exploitation of vulnerabilities in our systems, because such foes were assumed to be unable to access our infrastructure directly. On the basis of these assumptions and allegations, there were proposals from parties such as the US government to ban the use of communications equipment from the Chinese companies Huawei and ZTE in the US.

In the light of the PRISM and TEMPORA revelations, the hypocrisy was deafening. In the light of the latest revelations, it's hard to find any humor at all.

We have met the enemy, and he is us; our governments have invaded our communications infrastructure, have access to the data behind our services, and have installed backdoors in our software.

If we are to preserve privacy on the web, the need for change has never been greater.

I think it can be said that, at an infrastructure level, our immediate priorities should be:

• Deploying TLS v1.2 with Perfect Forward Secrecy
Older versions of TLS have less effective PFS cipher-suites, and often require undesirable tradeoffs. When dealing with TLS 1.0, we are often forced to choose between the weaknesses of RC4 (no longer considered secure as used in TLS/SSL) and potential vulnerability to the BEAST attack. It's safe to say that TLS 1.0, as deployed, is fundamentally broken; while the protocol itself is not completely so, the cipher-suites deployed in the wild all have issues. Deploying a patch to TLS 1.0 would be as difficult as updating to TLS 1.2, which is something we should be doing anyway.
• Deploying DNSSEC, DANE and Certificate Pinning
Given the known extensive partnerships of the secret services, I think it is fair to say that the CA model has outlived its usefulness. Browsers ship hundreds of trusted root certificates; it is sheer naivety to assume that none of them have been compromised. I don't see this as the complete end for CAs; they can continue to provide utility in the form of extended validation services, but their numbers will be reduced and, importantly, we will no longer have to trust every CA to vouch for any domain.
DNSSEC itself poses issues; it has a single root of trust (the IANA), and it requires that we trust our domain registrar as part of the chain of trust to ourselves. However, it vastly reduces the number of moving parts and trusted authorities, and makes validating that trust significantly easier. Attacks against DNSSEC need to be narrowly targeted to be effective; comparison of DNSSEC-signed zones across multiple machines provides a simple method of watching for suspicious behavior.
The other result of DNSSEC is that it makes deploying encrypted services easier. Anything we can do to increase the proportion of encrypted traffic on the internet can only be a good thing.
• Development of TLS v2?
The existing TLS specification is a gradual evolution of SSL. It is therefore old, battle tested and, as a protocol, a well known quantity. As a side effect, it is also complex; it contains many misfeatures, and our evolving understanding of cryptography points out many parts of TLS which are at the very least suboptimal, and often highly problematic. Mitigating its many design flaws has resulted in huge increases in the complexity of the codebases implementing TLS; large portions now need to run in constant time to avoid timing-based side channel attacks. NSS and OpenSSL are monstrous, often convoluted code bases. Validating them is troublesome; performing constant-time cryptography on modern processors is increasingly difficult, and identifying the ways in which code can become non-constant-time is nigh impossible. Given all we know today, I think it's time to take a step back from TLS, take a hard look, and fix every known issue, misfeature, and design problem. TLS 2.0 need not have more in common with TLS 1.2 than the initial hello packets; it should be built on modern, known-good primitives: PKCS #1 v1.5 padding wholly unsupported, replaced by OAEP; authenticated encryption used where possible, encrypt-then-MAC where not.
The cipher-suite list should be slimmed down; current "good practice" ciphers like RSA and AES and authenticated encryption modes like AES-GCM can stay, but state-of-the-art systems like Salsa20 and Curve25519 should also be included and recommended. While it is likely that the NSA has cryptanalytic knowledge we do not, we know of no near-realistic attacks on the current good-practice ciphers, and the nature of cryptanalysis suggests that their capabilities are unlikely to bring them close to breaking either. Even cryptanalytic attacks considered "groundbreaking" rarely result in practically exploitable flaws.

Instead, we should be changing because the NIST-sanctioned cipher-suites are not designed with us in mind. Constant-time implementations of AES in software are notoriously difficult, and this goes double for systems like AES-GCM. SHA-3 contains many primitives which are efficient to implement in hardware but difficult in software; the NIST elliptic curves use constants which result in inefficient software implementations (never mind the lack of a rationale for the choice of said constants).

Cryptographic algorithms designed with software in mind reduce the cost of implementing them, increasing security, while also reducing the number of corner cases and attacks which can result in information leakage on common hardware. They additionally reduce the performance delta against a well equipped adversary (such as a government security agency).

The prime objective of a TLS revision should be that a TLS2 implementation be relatively concise and easy to validate, without the inscrutable complexity of TLS1. The TLS2 paths in OpenSSL and NSS should not be as convoluted and twisty as those for TLS1, and the preferred algorithms should be those which do not require large tables or similar constructions liable to suffer from side channel attacks.

A good rule of thumb is that the maths is hard to subvert; the code is easy. To that end, we should push towards simpler protocols which are easier to analyze. We should also push towards better defaults from the libraries we use; it is ridiculous that OpenSSL, for example, doesn’t come secure by default.
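As a concrete sketch of the first priority above, a minimal server configuration fragment. This assumes nginx built against a modern OpenSSL; the cipher names are illustrative rather than a vetted list:

```nginx
# TLS 1.2 only; ECDHE key exchange for forward secrecy,
# AES-GCM to sidestep both the RC4 and the CBC/BEAST problems
ssl_protocols TLSv1.2;
ssl_prefer_server_ciphers on;
ssl_ciphers "ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384";
```

Restricting `ssl_protocols` this aggressively will lock out clients that only speak TLS 1.0, which is exactly the deployment tradeoff discussed above.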

## Looking to the future – Buffers and Textures in Ogre 2.0

Because they have an integrated resource system, the 1.x versions of Ogre have had a somewhat opaque buffer system. Anyone who has worked closely with it, for example doing dynamic uploads, will have discovered that uploads are done by creating the object and only later uploading the actual data into it.

For its era, this was a perfectly adequate design. When it was conceived, this method largely reflected the way that graphics APIs worked – or at the very least permitted you to work. In OpenGL, for example, you create the buffer object and then upload data into it using the glBufferData command.

Everything changed with Direct3D 10.

## The old ways – A look at Direct3D 9

Direct3D9 divided buffer types using two properties – the “usage” which described how the application intended to use the buffer, and the “pool” which described how the memory was to be allocated. For our purposes, only one usage flag matters:

“D3DUSAGE_DYNAMIC – Set to indicate that the vertex buffer requires dynamic memory use. This is useful for drivers because it enables them to decide where to place the buffer. In general, static vertex buffers are placed in video memory and dynamic vertex buffers are placed in AGP memory. Note that there is no separate static use. If you do not specify D3DUSAGE_DYNAMIC, the vertex buffer is made static. D3DUSAGE_DYNAMIC is strictly enforced through the D3DLOCK_DISCARD and D3DLOCK_NOOVERWRITE locking flags. As a result, D3DLOCK_DISCARD and D3DLOCK_NOOVERWRITE are valid only on vertex buffers created with D3DUSAGE_DYNAMIC” – MSDN

D3DUSAGE_DYNAMIC provides the driver with a hint that the application will be modifying the buffer frequently, and omitting it informs it that the application will be updating it infrequently, if at all.

This is a useful abstraction, but we can do better: if the buffer is entirely off limits to the GPU, then the driver can make further optimizations with the aim of increasing performance.

## The changes of Direct3D11

Direct3D11 takes the existing “dynamic/static” disjunction, renaming “static” to “default”, and adds two new modes, bringing us to a total of four:

• D3D11_USAGE_IMMUTABLE – Immutable buffers must be initialised with data when they are created, and from then on are read-only for both the CPU and the GPU
• D3D11_USAGE_DEFAULT – Essentially Direct3D 9's default mode. This is optimized for infrequent updates
• D3D11_USAGE_DYNAMIC – Essentially Direct3D 9's D3DUSAGE_DYNAMIC. Optimized for frequent updates from the CPU; the GPU may only read it
• D3D11_USAGE_STAGING – A specialized usage for resources that cannot be bound to the GPU pipeline. D3D11_USAGE_STAGING is used to create DMA-capable buffers for quickly copying data back from the GPU.

The most important of these is D3D11_USAGE_IMMUTABLE. While Ogre does sometimes copy back data from the GPU, staging buffers are a much smaller performance win than immutable buffers, and have a higher logistical complexity for the engine.

## The changes in Ogre 2.0

The most obvious change is the addition of the enumerants HBU_IMMUTABLE and TU_IMMUTABLE for hardware buffers and textures respectively. These signal to Ogre, and therefore to the driver (where possible) that the contents of the buffer will never change.

Deeper changes have been required to the Texture and HardwareBuffer objects.

### Textures

The Texture API has changed significantly in Ogre 2.0 as the resource system has been changed. In older versions, the Texture object tracked the “source”, “desired” and “internal” versions of various properties – from width and height to pixel format. As Ogre 2.0 is separating the “resident” representation of the resources from their sources, all but the internal versions of the properties have been removed.

Instead, the loader sets the properties to those it desires for the texture object before calling the new _determineInternalFormat function. This function will then use its knowledge of the render system's capabilities to determine the actual format to be used.

Once the internal format has been determined, the loader may use the _uploadBuffers function in order to actually fill the texture with data. For all the existing texture types, this is optional; the loader may later get the buffer by using the getBuffer method and then use the blit method of the returned buffer object to do the upload, as was the standard method in previous versions of Ogre. However, when using an immutable texture, the _uploadBuffers method must be used, in order to accommodate the contract associated with immutable textures.

### Other Buffers

Other buffer types are managed separately through the HardwareBufferManager. For each of these types, an additional pointer is being added to their constructor function in order to enable immediate upload.