Sunday, November 28, 2010

Breaking a Tandem down for Travelling

I got a new tandem from Stephen Bilenky yesterday.
My wife and I are both tall (5'10" and 6'4" respectively, aka 178 cm and 193 cm), so we went custom.  I asked him to build it with braze-ons for Beckman racks, so he tweaked the geometry to fit those.  He also managed to get me an Arai drum brake on the rear, so between that and the cluster the rear wheel is non-dished.

He did a nice job with S&S couplers too and gave me the pieces in two standard size shipping cases with nicely labeled velcro padding.  Here's how to pack a tandem with 2 sets of S&S couplers.



The first case contains the front wheel, which fits inside the shallow top lid when deflated, even with 35mm tires -- smaller tires might fit inflated, but I wouldn't ship inflated tires in an unpressurized compartment anyway.

Visible on the left are the front fork and the front tubes with the captain's cranks attached.  The captain's bars and headset are also visible; those have been removed from the fork.  The pictures below show the details of the shift and brake cable connectors.



Detail of the first case.  The package wrapped in white cloth is a saddle.  At the far right and bottom are two tubes wrapped in velcro padding.



The captain's drop bars curve around the front tubes and you just have to route the cables where they don't get in the way.



Here you can see three white posts sticking up.  Those should be located towards the center of the box, and there are corresponding end caps that go on the other end to prevent the cases from compressing during shipping.

Closing the box requires putting the front wheel over those posts, putting the top caps on, closing the case most of the way, and then finessing the wheel into the space in the lid.



The captain's seat post and saddle go in the first case wherever they fit.



The second case contains the rear wheel with the drum brake facing up, plus the rear triangle and the crank arms wrapped in cloth.  The thing nestled in the triangle is the stoker's saddle.  I have to swap that seatpost for a Thudbuster, so hopefully it will all still fit.



The padding has two extra velcro straps for the chain.  The transfer chain that connects the captain's and stoker's cranks, and the skewers, are in a separate envelope.



Detail showing how the saddle fits.



The front cantilever brakes are visible here.



That's a crank arm wrapped up there.


Bilenky was nice enough to write idiot-proof labels on the padding, so here are some shots of those.



Rear seat post.



Rear triangle with front derailleur and straps for chain and rear derailleur.



Cables route through padding.




Rear derailleur and cable.




The wrapping goes around the end of the S&S coupler face.



The crank arms are fitted with self-extracting bolts, so I need an 8mm allen key to reassemble them.
The stoker's cranks are separately wrapped in cloth.
The skewers and the transfer chain are each in separate envelopes, which fit into another envelope.



Each individual tube is wrapped.  You can see the cable guide, sans cable, at the left.



The fork out of the head tube.  There are two pieces of padding that have slits to wrap around the brakes, and one for the top.  The stem is not wrapped.




The captain's seat post is wrapped, as is the tube for the stoker's bars.




Detail of the padding for the fork.  Notice the slit that allows it to go around the brake.



Notice how the cables don't come out of the bottom bracket guides.  They ship inside the bottom tube padding.



The cables stay attached at the down-tube bosses.




Padding for the front tubes has slits to protect the ends.



Exploded view of the padding for the front tube.



Bilenky head badge.  Spiffy!



Yes, 700x35's with deep rims do fit in the cases deflated.

Monday, June 28, 2010

Utilitarianism

I've called myself a utilitarian for some time but I've recently concluded that utilitarianism is untenable as a moral philosophy.

I'm an engineer, so my job is to understand the tradeoffs inherent in designing a device to suit a purpose.  Utilitarianism appealed to me because it treats all ethical decisions as tradeoffs.  I was trying to come up with a definition of tradeoffs and concluded that a tradeoff is any attempt to compare incomparables: e.g., maximum speed versus average power consumption, or cash on hand now versus a distribution of returns later.  There is no natural ordering of such values, so engineers fall back on (hopefully) their client's utility function: Q(low max speed, low power consumption) > Q(high max speed, high power consumption) is a meaningful comparison.
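As a sketch of what I mean (the design points and the weights in Q here are invented for illustration), the client's utility function is what lets an engineer rank otherwise incomparable designs:

    // Two hypothetical design points that no natural ordering can rank.
    var designs = [
      { name: 'sprinter', maxSpeed: 90, avgPowerDraw: 40 },
      { name: 'cruiser',  maxSpeed: 60, avgPowerDraw: 15 }
    ];

    // Q encodes the client's priorities; the weights are made up.
    function Q(d) { return d.maxSpeed - 2 * d.avgPowerDraw; }

    designs.sort(function (a, b) { return Q(b) - Q(a); });
    // designs[0] is now the client's preferred tradeoff ('cruiser' with these weights).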

Since tradeoffs can only be made in the presence of a utility function, utilitarianism requires a global utility function.  I started to think about the properties such a function would have.  For it to be useful in making moral decisions, it must be computable, or at least approximable in time sublinear in the number of inputs, given that utilitarianism says that everything is potentially an input -- potentially the entirety of the moral actor's past light cone.  It must also be able to concretely balance short-term gains against long-term gains, unless one is to forever sacrifice present good for some distant utopia.

Then I realized that my client's utility function is part of the dynamic system that describes them -- it might be better for my client that they change their utility function than that I actually remove seatbelts to save weight.  This is not a problem for an engineer, since it is my job to understand what the client wants, not to worry about their mental health (within reason).  But utilitarianism equates "good" with "utility", so I can't make conative assumptions; if I did, the global utility function would just be a fancy name for a personal god for which I have no evidence.  This is not fatal to a global utility function, since many functions have fixed points, but it does eliminate many classes of simple functions, so there is reason to be skeptical that there exists an approximation that requires only a very small portion of the inputs to the global utility function.

These criteria (approximability, tractability, and horizon-independence) are not trivial, so I can't just assert that such a function exists in the same way that any other mental abstraction exists.  Unless I have positive reason to believe that such a function exists and that I can apply it to solve moral questions, I have no business calling myself a utilitarian.

So for now, I'm just an ex-utilitarian who thinks that the techniques engineers use to optimize designs can be useful in thinking about moral priorities, but that there are probably other principled ways to the same end.

Tuesday, June 15, 2010

Reversing Code Bloat with the JavaScript Knowledge Base

Lindsey Simon and I just announced JSKB, a project for browser-specific JavaScript optimization, on the Google Code blog.  I wanted to expand on that.  Unfortunately, I don't have the technical chops necessary to put nice-looking charts into blog posts, so take a look here.

Wednesday, June 9, 2010

Talking at OWASP in Stockholm

I'm going to be talking at OWASP's AppSec Research Conference in Stockholm the week after next (23 June).

Jasvir Nagra and I are talking about virtualization as a strategy for bolting new security policies onto systems that have major legacy constraints, e.g. the web.  If we have time, we're going to discuss some of the language changes that Tom Van Cutsem and Mark Miller (who I believe is presenting at OOPSLA) have proposed for EcmaScript.







Beyond the Same Origin Policy
Jasvir Nagra and Mike Samuel, Google Inc.

The same-origin policy has governed interaction between client-side code and user data since Netscape 2.0, but new development techniques are rendering it obsolete. Traditionally, a website consisted of server-side code written by trusted, in-house developers, and a minimum of client-side code written by the same in-house devs. The same-origin policy worked because it didn't matter whether code ran server-side or client-side; the user was interacting with code produced by the same organization. But today, complex applications are being written almost entirely in client-side code, requiring developers to specialize and share code across organizational boundaries.

This talk will explain how the same-origin policy is breaking down, give examples of attacks, discuss the properties that any alternative must have, introduce a number of alternative models being examined by the Secure EcmaScript committee and other standards bodies, demonstrate how they do or don't thwart these attacks, and discuss how secure interactive documents could open up new markets for web developers. We assume a basic familiarity with web application protocols: HTTP, HTML, JavaScript, CSS; and common classes of attacks: XSS, XSRF, phishing.

Monday, April 26, 2010

The sweet spot in internet security.

New standards like HTML5 often try to address security problems by providing new security features. I think some of these are well-intentioned but are not moving us to a fundamentally more secure internet.

Let me commit blog suicide by starting with a chart that'll drive away most of my potential audience. (Bye :)
                           | Introduces New Tools             | Improves Security of Existing Tools
Large Audience             | Limited. Slow to take hold.      | Good, but doesn't address zero-days.
(Web Devs)                 | E.g., toStaticHtml               | E.g., PHP magic quotes.
Small Audience             | Good, but won't help legacy      | The sweet spot. A (relatively) small group
(Library Authors &         | apps. E.g., <iframe              | can address emerging threats quickly.
Security folk)             | sandbox="...">                   | E.g., native JSON support.

Small groups respond more quickly

The chart is broken down by the size of the audience needed to work with the new feature. There are many more web programmers than library developers and web security professionals put together. Any tool that requires most web programmers to use it, and to use it consistently, is limited in its effects. Conversely, if a small audience can change a system to be more secure, then the change can have wide-reaching effects. For example, browser developers, a very small audience, are now implementing native JSON support. This means that many of the websites out there with old, buggy versions of JSON.js -- for which there are known script injection exploits -- will become more secure, because those old libraries were written expecting JSON to become standard and so defer to the newer native implementation on browsers where it exists.
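The defer-to-native pattern those libraries rely on looks roughly like this (a minimal sketch, not the actual JSON.js source):

    // Only install the fallback if the interpreter doesn't already provide
    // native JSON support.
    if (typeof JSON === 'undefined') { JSON = {}; }
    if (typeof JSON.parse !== 'function') {
      JSON.parse = function (text) {
        // Old-interpreter fallback.  A real library validates text against the
        // JSON grammar before evaluating it; this sketch elides that check.
        return eval('(' + text + ')');
      };
    }
    // Pages that shipped this pattern years ago pick up the native, safer
    // parser automatically on new browsers.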

Fix existing code; don't only enable newer better code

The chart is also broken down by scope: will the changes only affect new code that uses new APIs, or do they fundamentally improve the security properties of existing code? If a change requires developers to update code, then it can definitely help, but its effects will be very slow to be felt, since existing sites will have to be updated to use it. Website developers are notoriously conservative, and often rightly so; you can go out of business by updating too soon and breaking your site for old browsers.

Not everything needs to be in the bottom-right box and not everything can be; there are no silver bullets in security. It is good to have new tools, and to have tools targeted at the web development community at large, but we should always prefer the bottom-right.

Getting Ahead of the Game

New tools also have a problem. Many new security tools are reactions to emerging threats -- the threat model has changed, so the HTML and EcmaScript committees respond with new tools suited to the new threat model.

But the threat model is only going to keep changing. To get to the sweet spot, we need meta-tools that help small groups of security professionals bolt new security policies onto existing code while maintaining backwards compatibility. And we need a way to get these changes onto the web at large by targeting a small group -- content aggregators and social hubs that use a plugin model are good candidates. Ideally we'll get to the point where we have a patch-and-update cycle for the web. As the threat model changes, security specialists use the meta-tools to adapt existing systems so they just keep working, but better.

I think the best candidates for meta-tools are software virtualization tools, like the membranes and proxies proposed for EcmaScript. And my project, Caja, is aiming at providing that layer. We're not in the sweet spot yet: we've virtualized the DOM and the network, so we're able to provide a huge array of security policies, but our questionable library support keeps us out of the sweet spot; we're working on that.

Friday, April 16, 2010

Talk on the semantic gap

Google's security team is expanding, so we're doing some recruiting. To help out, I gave a talk to Dawn Song's class at Berkeley.

In brief, my argument was:
  • Programming language design choices affect the kinds of vulnerabilities that programs written in those languages are susceptible to.
  • The source of these vulnerabilities is not just ignorance by programmers, but includes rational trade-offs between correctness/security and terseness, completeness, maintainability, efficiency, and other concerns.
  • A "semantic gap" exists where programmers (intentionally or unintentionally) use an abstraction that doesn't do quite what they want it to do.
  • Often this gap is innocuous (silent overflow in a 64-bit increment), but sometimes it has catastrophic consequences (naive string interpolation → shell injection; see the sketch after this list).
  • It is possible to close some of these gaps without unduly breaking existing programs by using static analysis, delayed binding, and opt-in defaults to infer intent.
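As a concrete illustration of the shell-injection example in the list above, here is a Node.js-flavored sketch (the convert command and function names are just for illustration):

    var child_process = require('child_process');

    // Naive string interpolation: filename is pasted into a shell command, so a
    // value like "photo.jpg; rm -rf ~" does exactly what you'd fear.
    function thumbnailUnsafe(filename) {
      child_process.exec('convert ' + filename + ' -resize 100x100 thumb.png');
    }

    // Passing the arguments out of band closes the gap: no shell ever parses filename.
    function thumbnailSafer(filename) {
      child_process.execFile('convert', [filename, '-resize', '100x100', 'thumb.png']);
    }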
Take a look at my slides.

"Security by Closing the Semantic Gap"

Security is about more than just cryptography; programming language design choices affect the way programmers design programs. We start with code samples in popular programming languages and show how it can be easy to write code that is almost correct, but that fails in ways that are catastrophic security-wise. We demonstrate how tweaking language definitions can close the "semantic gap," the difference between the intended effect of the code and its actual semantics, which allows exploitable vulnerabilities to creep in.

Tuesday, March 23, 2010

Macros for EcmaScript

I gave a talk a while back about a proposal for adding macros to JavaScript. I'm likely going to be presenting that to the EcmaScript Harmony working group in the next few days. You can take a look at the slides for a talk I gave to the friam group at HP research a while back.

Wednesday, February 10, 2010

EcmaScript 5 and Harmony

I gave a talk on EcmaScript 5 and 6. My slides are available as HTML.

My pre-talk notes below capture the gist of what I said:

I'm sure you're all aware that browsers and standards bodies have been working on new versions of JavaScript and HTML. I've been working on client side code for a while, so I'm excited about a lot of these changes. I worked on GWS, and then was the first client dev on calendar. I worked on Closure Compiler, doing optimization and type system work. Recently, I've been working with the EcmaScript committee to try and make it a better language for writing secure and robust programs.

I'd like to go over some of the language changes in the recently released ES5, and summarize some of the topics being discussed for the next version. Then I'm going to talk about some strategies that might help you start using new language features in code that needs to work on older interpreters.

There's a lot to cover, so I'm going to go pretty quickly; please jump in if you see something that grabs your interest.

First, a bit of history. When I say "EcmaScript," I mean "JavaScript." Browsers are working on implementing the latest version, EcmaScript 5. Most of you are probably familiar with EcmaScript 3. That leaves ES4, which never made it out of committee. ES4 went on for years without producing a standard, and it's now officially dead. You may be familiar with some of the proposals that came out of it, like the type system on which Closure Compiler's type system is based. But most of the work from that did not make it into ES5.

While ES4 was languishing, the ES3.1 group started from scratch with more modest goals, and eventually produced ES5. They used a few principles to decide what to include and what not to: standardize extensions and quirk fixes that three out of four major interpreters agreed on, to move divergent implementations towards compatibility; and address some of the major sources of confusion using an opt-in mode that breaks backwards compatibility.

First major change. ES5 lets user code deal with properties the same way browser intrinsics like DOM nodes do. Many interpreters had implemented getters, setters, watchers, or some similar mechanism; but each was different. This example shows a way of defining a getter and setter using a syntax familiar to anyone who has used SpiderMonkey extensions. This is just a convenience for a richer API though.
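The slide example isn't reproduced in these notes; the object-literal get/set syntax it showed is along these lines (my own minimal example):

    var thermostat = {
      celsius: 20,
      // Accessor properties defined directly in an object literal, as ES5 standardizes.
      get fahrenheit()  { return this.celsius * 9 / 5 + 32; },
      set fahrenheit(f) { this.celsius = (f - 32) * 5 / 9; }
    };
    thermostat.fahrenheit;        // 68
    thermostat.fahrenheit = 212;  // celsius becomes 100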

defineProperty exposes the full power of the new property APIs. It lets you add a new property to an existing object - either a value property or a dynamic property. And you can specify property attributes -- whether the property is read-only, shows up in a for...in loop, and whether the property can be deleted or reconfigured. The top example shows a value property, and the bottom shows a dynamic property definition.
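Again, the slide isn't included here, but the two flavors it contrasted look roughly like this:

    var obj = {};

    // A value property with explicit attributes.
    Object.defineProperty(obj, 'answer', {
      value: 42,
      writable: false,      // read-only
      enumerable: true,     // shows up in for...in
      configurable: false   // can't be deleted or redefined
    });

    // A dynamic (accessor) property backed by a getter.
    Object.defineProperty(obj, 'timestamp', {
      get: function () { return Date.now(); },
      enumerable: false,
      configurable: true
    });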

These APIs are a bit unwieldy, so there are conveniences on top of them. Object.defineProperties lets one combine multiple property definitions. Object.create combines object creation with property definition. Object.freeze makes all properties read-only and unconfigurable, and flips a bit on the object so that new properties can't be added. This gives a shallow immutability which can help code expose objects to the outside world, while maintaining invariants without all kinds of defensive copying.
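A quick sketch of how those conveniences fit together (my example, not the slide's):

    var base = {
      describe: function () { return this.name + ' (' + this.size + ')'; }
    };

    // Object.create: choose the prototype and define properties in one step.
    var widget = Object.create(base, {
      name: { value: 'gadget', enumerable: true },
      size: { value: 3, enumerable: true, writable: true }
    });

    // Object.freeze: shallow immutability before handing the object to other code.
    Object.freeze(widget);
    widget.size = 10;           // ignored (throws in strict mode)
    Object.isFrozen(widget);    // true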

ES5 also standardizes a lot of interfaces that various libraries have proven very useful. Maybe you're familiar with Closure's goog.array. These interfaces are very similar but implemented natively, and available even without a full-scale library. There are all the usual functional operators, plus a few extra flavors of reduce. There's also a curry-like operator, the new bind method on Function.prototype. You give bind a value for this, and optionally some positional parameters, and it returns a function that delegates to the original with those parameters.
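For example (a quick sketch, not taken from the slides):

    var prices = [5, 15, 8, 42];

    // The goog.array-style operators, now native:
    var cheap = prices.filter(function (p) { return p < 10; });            // [5, 8]
    var total = prices.reduce(function (sum, p) { return sum + p; }, 0);   // 70

    // Function.prototype.bind fixes `this` and any leading arguments:
    function log(prefix, message) { console.log(prefix + message); }
    var warn = log.bind(null, 'WARN: ');
    warn('disk almost full');    // logs "WARN: disk almost full"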

Another API that most JS libraries reinvent is one for serializing and deserializing data. Many do this badly, and the number of JSON parsers floating around with known XSS holes is depressing. Most modern interpreters have native JSON support built in. This is a safer way to unpack data. And on calendar, we had to jump through all kinds of hoops to get data unpacking to be as fast as possible. The JSON grammar is significantly simpler than the JS grammar, so native JSON parsing is often faster than eval. It's roughly twice as fast as eval on FF 3.5, though I think eval on V8 is currently faster since they've spent more time optimizing it. One of the nice things you may not be aware of with the new JSON APIs is revivers and replacers, which let you serialize/deserialize types other than Array and Object. You can also use them to get better compaction, by doing your own constant pooling. This example shows something that I did for calendar to get our messages smaller -- since instances of repeating events often contain repeated strings, you can use JSON revivers to get the same effect without needing arbitrary JSON. This code is not ideal for a slide, but it shows the basic contract of JSON revivers, and replacers are basically the dual of this.
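The calendar code isn't in these notes; a minimal sketch of the constant-pooling idea with a reviver might look like this (the wire format and pool are invented for illustration):

    // Strings that repeat across events are replaced on the wire by {"#": n}
    // markers that index into a constant pool sent alongside the data.
    var pool = ['Weekly sync', 'Room 42'];
    var wire = '[{"title":{"#":0},"where":{"#":1}},' +
               ' {"title":{"#":0},"where":{"#":1}}]';

    var events = JSON.parse(wire, function (key, value) {
      // The reviver sees every (key, value) pair, innermost first; swap pool
      // markers back to real strings and pass everything else through.
      if (value && typeof value === 'object' && '#' in value) {
        return pool[value['#']];
      }
      return value;
    });
    // events[0].title === 'Weekly sync'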

Besides new APIs, the ES5 group tried to fix spec errors that led to unnecessary confusion. The ES3 spec specified the curly bracket object notation in terms of looking up the name Object in the current scope and calling it as a constructor. This meant that a whole slew of identifiers -- Object, Array, Function -- had special significance. That's no longer the case. Most ES3 interpreters stopped doing this because it allowed some really sneaky cross-domain attacks.

The second example shows another unintended consequence of poor spec language. Function calling is specified in terms of an abstraction called, appropriately enough, a lexical scope. The spec said that a lexical scope is an instance of Object, and so property lookup would find not just declared local variables, but any variables on Object.prototype.

Third, some interpreters pooled regular expression objects. This is problematic because regular expressions are stateful. If a regular expression is being used in a global match, or is used without supplying a new string, then its behavior depends on lastIndex, which can be mutated by code called in between matches.

Lastly, it was very hard for interpreters to optimize local variable access because of eval. ES3 said that interpreters could raise an error any time eval was called via a name other than "eval", but few did. In the code from the slide, not only can eval be used to steal data across scope boundaries, but it shows why interpreters can't do lots of interesting optimizations unless they can prove that eval is never called. Now, if eval is called by that name, and refers to the same function as the original global eval, then it reaches into the local scope. Otherwise, all free variables are interpreted in the global scope.
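A small example of that rule (mine, not the slide's):

    var x = 'global';
    function f() {
      var x = 'local';
      var indirect = eval;
      return [
        eval('x'),      // direct eval: sees the local scope, so 'local'
        indirect('x')   // indirect eval: evaluates in the global scope, so 'global'
      ];
    }
    f();  // ['local', 'global'] on an ES5 interpreter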

ES strict mode is an opt-in mode that removes a number of other sources of bugs and confusion.
You opt in by putting the string literal "use strict", by itself, at the top of your program or function body.
Any function contained in, and any code evaled inside, a strict context is itself strict.
Strict mode differs from normal mode in a few ways. "this" is not coerced to an object in strict mode.

In the Boolean.prototype.not example, this would normally be coerced to an Object, so !this is always false. Not in strict mode.
this can also be null or undefined. It won't magically become the global object, so code that inadvertently used call or apply with null won't accidentally clobber global definitions. Because of this, in strict mode, there are strictly fewer type coercions.
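The Boolean.prototype.not slide isn't reproduced here; a reconstruction of the idea:

    Boolean.prototype.not = function () { return !this; };
    false.not();   // false: this is boxed to a (truthy) Boolean object, so !this is false

    Boolean.prototype.strictNot = function () { 'use strict'; return !this; };
    false.strictNot();   // true: this stays the primitive false in strict mode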

And, basic operations no longer fail silently. Things like deleting a non-existent property or setting a read-only property will throw an exception instead.
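For instance (a small sketch of my own):

    'use strict';
    var config = Object.freeze({ retries: 3 });
    config.retries = 5;      // TypeError in strict mode; silently ignored otherwise
    delete config.retries;   // TypeError: deleting a non-configurable property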

Where ES5 was concerned with fixing known problems, ES6 is concerned with defining a few features that let developers write programs that simply couldn't work under ES5 due to efficiency or scaling issues, or kludgey grammar.
I'm going to walk through a number of ideas the committee is toying with. Some of these are pretty likely to make it into the next version, and some are highly speculative, but I'd love to get feedback on what people like and don't like. After that, I'll talk about strategies for incorporating new language features into code that needs to also run on older interpreters.

One of the things that everyone wanted to get into ES5 but couldn't was let-scoping. If you've seen "let is the new var" t-shirts, this is what they're talking about. Right now, all variables declared in a function are visible everywhere in that function, modulo catch blocks. This leads to a lot of confusion around closures and variables apparently declared in loops. I'm sure you've all had trouble with something like this example where a closure accesses a loop counter i. That problem will go away with let scoping.
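The slide example was along these lines (a sketch assuming the proposed per-iteration binding for let; the two loops are alternatives, not meant to run together):

    var links = document.getElementsByTagName('a');

    // The classic bug: var is function-scoped, so every handler shares one i
    // and alerts links.length.
    for (var i = 0; i < links.length; i++) {
      links[i].onclick = function () { alert(i); };
    }

    // With block-scoped let, each iteration gets its own binding, so the
    // handlers alert 0, 1, 2, ...
    for (let j = 0; j < links.length; j++) {
      links[j].onclick = function () { alert(j); };
    }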

A much more controversial proposal is lambdas. If you write a lot of code in JS in a functional style, this would be useful to you. Most versions of lambdas are a reduced syntax for closures that repairs the violations of Tennent's correspondence principle that JS functions have. this, arguments, break, continue, and return all change meaning as soon as you put them inside a function. With lambdas, that is not the case. If anyone is familiar with Neal Gafter's closures for Java proposal, this is similar.

Most everyone agreed that some support for classes is needed, but there's no consensus on exactly what that is. The general feeling is that some syntactic sugar over existing things like constructors and prototypes is probably the best way. This syntactic-sugar approach would make it very easy to use classes in code, and to down-compile to older versions of ES.
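Whatever the surface syntax turns out to be, the sugar would presumably expand to something like today's constructor-and-prototype pattern (my sketch, not a committee proposal):

    function Point(x, y) {
      this.x = x;
      this.y = y;
    }
    Point.prototype.distanceTo = function (other) {
      var dx = this.x - other.x, dy = this.y - other.y;
      return Math.sqrt(dx * dx + dy * dy);
    };

    var p = new Point(0, 0);
    p.distanceTo(new Point(3, 4));  // 5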

ES has no IdentityHashMap type, or System.identityHashCode. And it probably never will, but sometimes it's hard to write some algorithms without it -- e.g. keeping a list of visited nodes to avoid cycles. Ephemeron tables hash by identity, but without exposing indeterminism. There's no way to iterate over keys, so key order is not a source of indeterminism. And all keys are weakly held, so they don't complicate garbage collection. A value in an ephemeron table is only reachable by the garbage collector if the table and key are independently reachable.
Something like this proposal seems to have fairly broad support.
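A sketch of the visited-nodes use case, assuming a WeakMap-style ephemeron-table API (the constructor and method names here are assumptions; the proposal was not final):

    var visited = new WeakMap();   // keys compared by identity, held weakly

    function countNodes(node) {
      if (visited.has(node)) { return 0; }   // already seen: we've hit a cycle
      visited.set(node, true);
      var count = 1;
      for (var i = 0; i < node.neighbors.length; i++) {
        count += countNodes(node.neighbors[i]);
      }
      return count;
    }
    // node.neighbors is a hypothetical adjacency list; any object graph works.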

There seems to be general consensus that modules are needed, but there are active discussions of both means and goals. Some questions raised are how this interacts with browser fetching; whether modules should be parameterizable -- basically, should modules be stateful, or should the state be moved to instances; and whether an isolation mechanism is needed so that code can be isolated from changes to Object.prototype and the like. Ihab Awad did a lot of spec writing on that, so he can answer detailed questions afterwards.

The committee is listening to proposals around domain-specific language support in ES. DSLs like E4X and CSS selectors have been implemented as extensions or in libraries. So they're definitely useful for creating content in other languages, and for query languages. They could be used to implement new flow-control constructs if done right. In the first example here, someone wants to create a regular expression to match a MIME content boundary. But the boundary string could contain special characters, so they use a DSL. The DSL desugars to a function call that gets the literal string portions, and the embedded expressions as lambdas. If the re function is const, then interpreters can inline the result since all the parameters are statically known.
These DSLs can also be used to do structured string interpolation -- the syntactic flexibility of perl string interpolations without the horrible XSS problems.
And the last example shows a control structure.
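The examples were on slides; here is a sketch of the desugaring idea for the regular-expression case (the re function and the exact call shape are my assumptions, not the proposal's final form):

    // Hypothetically, a literal like re`^--${boundary}--$` might desugar to a
    // call that receives the literal parts plus thunks for the substitutions.
    function re(literalParts, expressionThunks) {
      var source = literalParts[0];
      for (var i = 0; i < expressionThunks.length; i++) {
        // Escape the runtime value so regex metacharacters in it match literally.
        source += expressionThunks[i]().replace(/[.*+?^${}()|[\]\\]/g, '\\$&') +
                  literalParts[i + 1];
      }
      return new RegExp(source);
    }

    var boundary = '--x.y+z';
    var matcher = re(['^--', '--$'], [function () { return boundary; }]);
    matcher.test('----x.y+z--');  // true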

There's also a lot of support for value types: types that don't compare for equality using reference identity. IBM very much wants decimal arithmetic in, and this is one way to get it. It would also let structured string interpolation results behave like real strings.

So that lets user-defined code implement new value types. Proxies give user code great flexibility over reference types. I mentioned that there is a known hole in getters and setters: you can't intercept accesses to properties that you haven't defined. Proxies do that in a way that preserves the semantics of object freezing, and avoids a lot of complexity around prototype chains. Instead of defining a property handler of last resort on an existing object, proxies are new objects that delegate all property accesses to handler functions. Those functions can in turn delegate to another object.
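A sketch of the handler idea, using the Proxy shape being discussed (the exact API was still in flux, so treat the constructor and trap names as assumptions):

    var target = { name: 'example' };

    var proxy = new Proxy(target, {
      // Every property read goes through this trap, including reads of
      // properties that were never defined -- which getters/setters can't do.
      get: function (obj, prop) {
        console.log('get ' + prop);
        return obj[prop];
      },
      set: function (obj, prop, value) {
        console.log('set ' + prop);
        obj[prop] = value;
        return true;
      }
    });

    proxy.name;      // logs "get name"
    proxy.missing;   // logs "get missing", returns undefined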

Destructuring assignments are syntactic shorthand for unpacking data structures. They are not a general-purpose pattern-based decomposition as in OCaml or Scala.
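A sketch of the proposed shorthand (the syntax was not final):

    var point = { x: 1, y: 2 };
    var { x, y } = point;            // unpack properties by name
    var [first, second] = [10, 20];  // unpack array elements by position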

Iterators and generators are another idea for which there is wide support but no definite plans.

Thursday, January 28, 2010

Cryonics skepticism and the rockstar exemption

[Update 16 April 2010: After seeing http://wondermark.com/614/ I recant the below. If I might live to see hover toasters, then sign me up!]


I was talking to a friend who has been looking into cryonics. I'm skeptical for the reasons below. This has nothing to do with the good faith of the cryonics companies around today, but with the incentives of future generations.

The claim of cryonics as I understand it is that
It is possible to freeze an ailing human body (or part thereof), wait for medical technology to progress to the point where it can be returned to good health, and then revive it.

This is a reasonable course if all the following hold
(1) that there is an appreciable chance of being revived
(2) that the person revived will survive for an appreciable time after being revived
(3) that the person revived is very similar to the person frozen
(4) that the time at which the person is revived will appeal to the person being revived

I think these 4 assumptions are unlikely to hold at the same time.

Scenario A
A person is frozen, and in their children's or grandchildren's lifetime is revived. In this case, I think 2 is unlikely, since medical technology will not have had time to advance much. This may not hold for people suffering from very particular afflictions, especially if they are wealthy and create a foundation to use their wealth to seek targeted advances.

Scenario B
A person is frozen, and in the distant future, medical technology has solved their affliction. In this case, I think 1 is unlikely.
I assert that never in human history has human civilization produced a machine with a moving part that has continued to function for 100 years without regular maintenance.
This means there is a trade-off between cost of maintenance and cost to revive. Consider two ends of the spectrum. If a cryogenically frozen person is buried in a particularly slow-moving glacier, then maintenance costs are low, but cost to revive is high. If a frozen person is warehoused, then maintenance cost is high, but the cost to revive is lower.

I argue that no-one who would be able to revive them would have economic incentives to do so with a few exceptions that I discuss later.
If the frozen has no assets, then no one has an incentive to keep the machines running.
Who has either legal standing to act on behalf of the frozen, or economic incentive to see someone successfully unfrozen? Very few.
The cryonics company has an annuity as long as someone stays frozen. If no next of kin can be identified, the cost of losing an annuity due to death is probably lower than the cost of successfully reviving someone. The cryonics companies' business model is basically like a family that keeps cashing welfare checks for a grandmother who is disabled or dead.
Lawyers working for the frozen's estate have mixed incentives. Law firms conglomerate like other industries, so after a certain amount of time, the number of long-term cryogenically frozen clients whose estates are not administered by a law firm that specializes in frozen clients will be small. Such a firm can charge an annuity, and when it comes time to decide whether or not to revive, it runs the risk of killing a paying client and ending the annuity, or it can continue to collect a check. A law firm that specializes in such things will have internal controls set up so that it acts in the most risk-averse way -- it will never revive anyone unless someone else with legal standing threatens to sue.
Who might have such standing? The RIAA. Since fashion is fickle and cyclic, the RIAA might succeed in getting frozen artists revived so that they can do reunion tours, and then, oops, their cancer comes back right on cue and they have to go back to sleep.
Finally, is there a marketing incentive? Not in scenario B. A cryonics company will want people to have the impression that its latest and greatest units are the most reliable, so it will bias towards later models. And it will want to use in its marketing literature pretty people whom its current target markets can relate to, so it will again bias towards recent customers.

The above is largely based on arguments from ignorance -- who might have economic incentives? -- but I believe such arguments are valid because I am discussing whether it is rational to do this today, not whether it might be rational for someone in the future.

There are obviously incentives other than economic and legal. If you are a famous scientist, politician, or religious figure, other incentives apply. For scientists, the likelihood of being unfrozen decreases as your field progresses past your current skill. As a political figure, (4) is problematic, since you're just as likely to be revived to stand trial as to be lauded. As with many things, the best course seems to be to become a god, but ancestor worship is unreliable, since current ancestor-worshipping cultures do not have to deal with the likelihood of their in-laws coming back from the grave unbidden.

Finally, is a hugely rich person likely to be revived? I doubt it. They are an annuity while frozen, and an unknown political threat to the powers that be if unfrozen.