footnotes

Posted by lloyd Wed, 03 Feb 2010 20:32:00 GMT

HTML5, now with footnotes.
HTML5!1


1. Has more footnotes than any other implementation. Evar.

(screencap from pluploader)

Web Security

Posted by lloyd Wed, 13 Jan 2010 20:55:00 GMT

Web Security

how the web works

Posted by lloyd Thu, 07 Jan 2010 18:39:00 GMT

Graphical pontification on how the web actually works...

how the web works

...I quickly discovered, however, I needed a much much larger piece of paper.

Forwarding Orderly JSON - v0

Posted by lloyd Tue, 06 Oct 2009 16:22:00 GMT

Recently I proposed orderly, an idea for a small microlanguage on top of JSONSchema—something easier to read and write.

There’s been some great feedback, which I find encouraging. In response I’ve set up orderly-json.org and started a project on github which will host the specification, the reference implementation, and all of the contents of the orderly-json.org site.

recent changes

Given initial feedback and a bunch more thought about orderly, there are a couple of significant early changes that have taken place.

repetition quantifiers get curly, enumeration values go square

in v-1 (yes, negative one) the following code would have specified an integer property with a range of allowable values:

integer foo[0,10];

And the following would specify an integer property bar with allowable values 0-10.

integer bar[0,10]

uh. oops? Two meanings, one syntax. We need a way outta this mess. The current v0 specification moves range specification to after the type (suggested by Toby A Inkster), and changes the syntax from square braces to curly braces. We’re re-using “repetition quantifiers” from regular expressions here, and while the analogy isn’t perfect, neither is the square brace analogy, where storage allocation is the expectation. Enumeration values, in turn, get more exclusive access to square braces, which is nice because this is the JSON representation of arrays. Here’s an example of a property weight which must be in a certain numeric range:

number{0.02, 0.98} weight;

While possible values are represented after the type using JSON style array notation:

string os ["osx", "win32", "linux", "freebsd"];
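To make the two forms concrete, here is a rough sketch of how declarations like these might map onto JSONSchema fragments. The helper `to_jsonschema` is hypothetical and only understands the two shapes shown above, not the full orderly grammar. The choice of `minimum`/`maximum` for numbers and `minLength`/`maxLength` for strings is an assumption about how ranges would translate.

```python
import json
import re

# Hypothetical helper: handles only "type{range} name;" and "type name [enum];".
DECL = re.compile(
    r"""^\s*(?P<type>\w+)\s*
        (?:\{(?P<range>[^}]*)\})?\s*   # optional curly-brace range after the type
        (?P<name>\w+)\s*
        (?P<enum>\[.*\])?\s*;\s*$      # optional JSON array of allowed values
    """,
    re.VERBOSE,
)


def to_jsonschema(decl):
    """Return (property name, JSONSchema fragment) for one declaration."""
    m = DECL.match(decl)
    if m is None:
        raise ValueError("unsupported declaration: %r" % decl)
    schema = {"type": m.group("type")}
    if m.group("range") is not None:
        lo, hi = [part.strip() for part in m.group("range").split(",")]
        # Assumption: strings get length bounds, everything else value bounds.
        if schema["type"] == "string":
            keys = ("minLength", "maxLength")
        else:
            keys = ("minimum", "maximum")
        for key, text in zip(keys, (lo, hi)):
            if text:  # either bound may be left out
                schema[key] = json.loads(text)
    if m.group("enum") is not None:
        # The enumeration is already JSON, so the JSON parser does the work.
        schema["enum"] = json.loads(m.group("enum"))
    return m.group("name"), schema


print(to_jsonschema('number{0.02, 0.98} weight;'))
print(to_jsonschema('string os ["osx", "win32", "linux", "freebsd"];'))
```

Because the enumeration is plain JSON, the sketch hands it straight to `json.loads` rather than parsing it by hand.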

An enumeration property on JSONSchema is nothing more than an array of potential values. And a more beautiful segue into the next point, I couldn’t have engineered...

Orderly syntax is a superset of JSON

JSON is wholly contained within the syntax of orderly, and the reason for this is simple: orderly must be able to represent default values of any object of arbitrary complexity. The bad news is this means it’s going to be a little more work to parse, but the good news is there’s plenty of precedent to leverage (numerical representation, etc).

What’s next?

As the language starts to firm up, I’ve resolved to take popular APIs and try to describe them with orderly, and see how far we get. Today’s target is the BrowserPlus JSON WSAPI that returns a list of available services. Here’s a subset of the data we’re trying to describe:

[
  {
    "name": "PublishSubscribe",
    "versionString": "1.0.0",
    "os": "ind",
    "size": 5114,
    "documentation": "A cross document message service that allows JavaScript to send and receive messages between web pages within one or more browsers (cross document + cross process).",
    "CoreletType": "dependent",
    "CoreletRequires": {
      "Name": "RubyInterpreter",
      "Version": "4",
      "Minversion": "4.2.5"
    }
  },  
  {
    "name": "Uploader",
    "versionString": "3.2.6",
    "os": "osx",
    "size": 279378,
    "documentation": "This service lets you upload files faster and easier than before.",
    "CoreletType": "standalone",
    "CoreletAPIVersion": 4
  },
  {
    "name": "RubyInterpreter",
    "versionString": "4.2.6",
    "os": "osx",
    "size": 1691095,
    "documentation": "Allows other services to be written in Ruby.",
    "CoreletType": "provider",
    "CoreletAPIVersion": 4
  }
]

And here’s concocted orderly to describe this data:

# A schema describing the data returned from the BrowserPlus services
# API at http://browserplus.yahoo.com/api/v3/corelets/osx
array {
  object {
    string name;
    string versionString;
    string os [ "ind", "osx", "win32" ];
    integer size;
    string documentation;
    string CoreletType [ "standalone", "dependent", "provider" ];
    # if CoreletType is "standalone" or "provider", then
    # CoreletAPIVersion must be present
    integer CoreletAPIVersion ?;
    # if CoreletType is "dependent", then CoreletRequires must be present
    object {
      string Name;
      string Version;
      string Minversion;
    } CoreletRequires ?;
  };
};

Not bad. Some semantics of this ad-hoc WSAPI cannot be captured in orderly, nor, it seems, in the JSONSchema underneath, but I think I’m ok with that. You?
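The two commented rules in the schema are exactly the semantics that stay outside it. A minimal sketch of checking them by hand once the structural validation is done, assuming each entry is already a parsed dict. `check_corelet` is a hypothetical helper, not part of orderly or BrowserPlus.

```python
def check_corelet(entry):
    """Return a list of problems with one service entry (empty if none)."""
    problems = []
    kind = entry.get("CoreletType")
    if kind in ("standalone", "provider"):
        # Rule from the schema comments: these types carry an API version.
        if "CoreletAPIVersion" not in entry:
            problems.append("%s corelet needs CoreletAPIVersion" % kind)
    elif kind == "dependent":
        # Rule from the schema comments: dependents say what they require.
        if "CoreletRequires" not in entry:
            problems.append("dependent corelet needs CoreletRequires")
    else:
        problems.append("unknown CoreletType: %r" % (kind,))
    return problems


uploader = {"name": "Uploader", "CoreletType": "standalone", "CoreletAPIVersion": 4}
broken = {"name": "PublishSubscribe", "CoreletType": "dependent"}
print(check_corelet(uploader))
print(check_corelet(broken))
```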

lloyd

Orderly JSONSchema

Posted by lloyd Fri, 02 Oct 2009 13:49:00 GMT

I’ve always wanted a concise and beautiful schema language for JSON. This desire stems from a real world need that I’ve hit repeatedly: given in-memory data that has been hydrated from a stream of JSON of questionable quality, validation is required. Currently I perform that validation in an ad-hoc manner, laboriously writing boilerplate code to check that an input JSON document is of the form that I expect.

Manual validation is problematic for a variety of reasons, and there are several features afforded by automatic validation, my favorite being high quality, helpful error messages upon bogus inputs. Aaron Boodman has talked a bit about the why over on his blog.
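For a taste of the boilerplate in question, here is a sketch of hand-rolled validation for a made-up document shaped like {"name": ..., "age": ...}. The function name and the error wording are hypothetical. Every new field means another few lines like these.

```python
import json


def validate_person(text):
    """Return a list of error messages for a JSON document (empty if it is fine)."""
    try:
        doc = json.loads(text)
    except ValueError as e:
        return ["not valid JSON: %s" % e]
    if not isinstance(doc, dict):
        return ["expected an object at the top level"]
    errors = []
    if not isinstance(doc.get("name"), str):
        errors.append("'name' must be a string")
    age = doc.get("age")
    # bool is a subclass of int in Python, so rule it out explicitly.
    if not isinstance(age, int) or isinstance(age, bool):
        errors.append("'age' must be an integer")
    elif age > 125:
        errors.append("'age' must be at most 125")
    return errors


print(validate_person('{"name": "lloyd", "age": 30}'))
print(validate_person('{"name": 7, "age": 200}'))
```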

So what do I want?

  1. A terse yet flexible means of describing the structure of a JSON document
  2. Something that’s easy on the eyes
  3. Something that rolls off the tongue

JSONSchema’s diet

“But wait!” you exclaim. “There’s JSONSchema!” And I agree, JSONSchema is mostly a good thing, and gets us most of the way there. JSONSchema is a flexible means of describing the structure of a JSON document. But I wouldn’t call it terse. Taken from json-schema.org, compare the complexity of the JSON document:

  
{
  "name" : "John Doe",
  "born" : "",
  "gender" : "male",
  "address" : 

   {"street":"123 S Main St",
    "city":"Springfield",
    "state":"CA"}
}

With the schema that describes it:

  
[lth@lappro yajl] $ json_reformat < doc.txt
{
  "description": "A person",
  "type": "object",
  "properties": {
    "name": {
      "type": "string" 
    },
    "born": {
      "type": [
        "integer",
        "string" 
      ],
      "minimum": 1900,
      "maximum": 2010,
      "format": "date-time",
      "optional": true
    },
    "gender": {
      "type": "string",
      "options": [
        {
          "value": "male",
          "label": "Guy" 
        },
        {
          "value": "female",
          "label": "Gal" 
        }
      ]
    },
    "address": {
      "type": "object",
      "properties": {
        "street": {
          "type": "string" 
        },
        "city": {
          "type": "string" 
        },
        "state": {
          "type": "string" 
        }
      }
    }
  }
}

NOTE: I did run this through json_reformat, the pretty printer that ships with yajl – so to be fair, we could combine some lines here.

Now don’t get me wrong. I believe that the feature of JSONSchema that it can be represented in JSON is very important. This means that there’s less bloat in the core toolchain when you choose JSON for some portion of your data representation needs, and that holds up to Douglas’s promise of a “low fat alternative”. Rad. But I don’t like how hard that schema is to read and write for a human like me.

So let’s throw a stone as long as we’re driving by: JSONSchema is too big

I think too much has been asked of JSONSchema, from the proposal:

JSON Schema is intended to provide validation, documentation, and interaction control of JSON data.

Interaction control? A cute idea, but I think this is far less important than a functional small language for validation. Perhaps there’s actually one base specification here with some extensions to do interaction control and storage attributes? (Read about the transient attribute.) Finally, with documentation, I’m again uncertain. Here’s the full list of attributes that make me nervous:

  • options (label/value)
  • title
  • description
  • transient
  • hidden
  • disallow
  • extends
  • identity

Introducing Orderly (v-1)

Orderly, say hi!

  string hi {"wassup"};
Orderly…
  • ... is an ergonomic micro-language that can round-trip to JSONSchema.
  • ... presently represents a subset of JSONSchema – I’ve thrown out the bits not specifically related to validation.
  • ... is optional. syntactic sugar. fluff. Tools should speak JSONSchema, but for areas where humans have to read or write the schema there should be an option to expose orderly in addition to JSON.
  • ... is probably not novel. “nothing under the sun is new”.
  • ... is a “little baby zygote of an idea”

So let’s meet orderly. This JSONSchema:

{"type":"object",
 "properties":
  {"name": {"type":"string"},
   "age" : {"type":"integer",
     "maximum":125}}
}

becomes this orderly:

object {
  string name;
  integer age[,125];
};

nice, eh? Let’s zip through some examples here:

A string property named name:
string name;
A string property between 1 and 64 chars in length (I assume unicode points here):
string name[1,64];
A number named foo between 100 and 1000
number foo[100,1000];
An optional boolean named hasLotsOfMoney:
[boolean hasLotsOfMoney];
An optional number with a value between 1 and 200 with a default value of 18:
[number age[1,200] = 18];
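For comparison, here is roughly what those five one-liners might expand to on the JSONSchema side, written as Python data so it can be dumped as JSON. "optional", "minimum" and "maximum" appear in the schema quoted earlier. "minLength", "maxLength" and "default" are my assumption about how the rest would be spelled.

```python
import json

# (orderly v-1 declaration, assumed JSONSchema "properties" entry)
EXAMPLES = [
    ("string name;",
     {"name": {"type": "string"}}),
    ("string name[1,64];",
     {"name": {"type": "string", "minLength": 1, "maxLength": 64}}),
    ("number foo[100,1000];",
     {"foo": {"type": "number", "minimum": 100, "maximum": 1000}}),
    ("[boolean hasLotsOfMoney];",
     {"hasLotsOfMoney": {"type": "boolean", "optional": True}}),
    ("[number age[1,200] = 18];",
     {"age": {"type": "number", "minimum": 1, "maximum": 200,
              "optional": True, "default": 18}}),
]

for orderly, properties in EXAMPLES:
    print(orderly)
    print("   ", json.dumps(properties, sort_keys=True))
```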

And for our final example, let’s transmogrify that huge schema up top:

object {
  string name;
  union {
    integer[1900,2010];
    string;                 // OMG, I killed format!
  } born; 
  string gender { "male", "female" }; // OMG, I killed interaction control!
  object {
    string street;
    string city;
    string state;
  } address;  
} person;

So we’re nowhere near a BNF here, this is simply the part where we walk into the store and start trying things on. Oh, and don’t worry. This isn’t real.

fixing lockfile gem (v1.4.3) for ruby 1.9

Posted by lloyd Thu, 24 Sep 2009 18:42:00 GMT

[lth@clover sup]$ diff /usr/lib/ruby/gems/1.9.1/gems/lockfile-1.4.3/lib/lockfile.rb{~,}
475c475
<       buf.each do |line|
---
>       buf.split($/).each do |line|

installing the sup MUA on 64bit arch linux

Posted by lloyd Thu, 24 Sep 2009 16:52:00 GMT

You’ll probably get here from a google search on trying to figure out how to get sup running on your arch box that was recently upgraded to ruby 1.9. sure it hurts, but it’s progress! Pick a different distro if you don’t wanna play!

Getting sup running, quick and dirty and highly time dependent:

installing the ferret gem

gem install sup will fail miserably. First you’ll find that “ferret” isn’t installing. This problem has been solved; sure would be nice to get this accepted upstream :/

installing ncurses

try again, now ncurses ain’t installing!

here’s some steps that should work:

 
$ gem fetch ncurses
$ gem unpack ncurses-0.9.1.gem 
$ cd ncurses-0.9.1
$ ruby extconf.rb 
$ patch -p1 < ~/ncurses_ruby_1.9.patch
$ make
$ sudo make install

Here’s the contents of ncurses_ruby_1.9.patch:

--- ncurses-0.9.1.orig/form_wrap.c    2009-09-24 10:53:41.000000000 -0600
+++ ncurses-0.9.1/form_wrap.c    2009-09-24 10:52:02.000000000 -0600
@@ -392,7 +392,7 @@
  */
 static VALUE rbncurs_m_new_form(VALUE dummy, VALUE rb_field_array)
 {
-  long n = RARRAY(rb_field_array)->len;
+  long n = RARRAY_LEN(rb_field_array);
   /* Will ncurses free this array? If not, must do it after calling free_form(). */
   FIELD** fields = ALLOC_N(FIELD*, (n+1));
   long i;  
@@ -616,7 +616,7 @@
         rb_raise(rb_eArgError, "TYPE_ENUM requires three additional arguments");
      }
     else {
-        int n = RARRAY(arg3)->len;
+        int n = RARRAY_LEN(arg3);
         /*  Will ncurses free this array of strings in free_field()? */
         char** list = ALLOC_N(char*, n+1);
         int i;
@@ -775,7 +775,7 @@
  * form_field
  */
 static VALUE rbncurs_c_set_form_fields(VALUE rb_form, VALUE rb_field_array) {
-  long n = RARRAY(rb_field_array)->len;
+  long n = RARRAY_LEN(rb_field_array);
   /*  If ncurses does not free memory used by the previous array of strings, */
   /*  we will have to do it now. */
   FIELD** fields = ALLOC_N(FIELD*, (n+1));
@@ -1123,7 +1123,7 @@
      VALUE argc = rb_funcall(proc, rb_intern("arity"),0);
      VALUE args = get_proc(field, FIELDTYPE_ARGS);
      if (args != Qnil) {        
-        if (NUM2INT(argc)-1 != RARRAY(args)->len) {    
+       if (NUM2INT(argc)-1 != RARRAY_LEN(args)) {    
           char msg[500];
           snprintf(msg, 500, "The validation functions for this field type need %d additional arguments.",NUM2INT(argc)-1);
           msg[499]=0;

the rest…

First, clone yourself a copy of sup from the gitorious hosted repo:

git clone git://gitorious.org/sup/mainline.git
And follow the instructions:
  1. install all gems referenced in the rakefile
  2. as the author suggests ruby -I lib bin/sup

Behold! It starts! Now is it actually usable is another question…

bi-directional git <-> svn - part I

Posted by lloyd Thu, 24 Sep 2009 05:30:00 GMT

What’s the point?

There are many reasons why git-svn integration is interesting, and most of them are sociological. Here are some situations where git-svn integration can be useful:

  1. You work at a place that has standardized on SVN, but want to use git as your personal “svn client”
  2. your company’s svn servers have horrid performance, and you want to continue to be productive when they’re not
  3. You want a fast and powerful web view of your svn repo (and what’s centrally provided isn’t cutting it).
  4. You are considering transitioning to a DVCS but you (or maybe those you work with?) are not willing to jump in without trying it on.

Git’s svn integration supports all of these uses quite naturally. This series of articles shall drop code which touches on these usages, but the real goal here is to explore a harmonious bi-directional svn<->git arrangement, one where novel commits may occur on both sides of the boundary without b0rking repos or pissing off susan.

a kick-ass svn client

@mojodna has written up his work flow, and I find that to be a great exploration into some of the issues around using git as a svn client.

the read-only mirror

Here’s the purely hypothetical scenario: you’ve got this big ol’ svn repo maintained by some fabulous folks. For whatever reason, those fabulous folks have provided you with a less-than-fabulous web view into your repository. You want to create an automatically updated git mirror of your SVN repository. You heavily dig on the simple, clean, and fast feel of cgit. This is uni-directional, but we’re just getting our feet wet.

Onward…

  1. pick the box that’s pulling the source and servin the pages
  2. setup limited passphraseless access for a headless user on them machines (an exercise for the reader, thas’ u)
  3. git svn clone -s (as the appropriate user)

Now we’ve got two issues: first is that periodic bit, and second is that git svn clone will not create any local refs to mirror remote branches. git branch -r will list em, but how might we go about turning svn branches named ‘tags/7.8.10’ into real git tags, and likewise, remote svn branches into local branches?

the periodic part

this part’s simple, use your friend cron and plunk some git svn fetch && git svn rebase into periodic execution.
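If you’d rather have cron call a script than a one-liner, here is a minimal sketch of the same two commands wrapped in Python. `sync_mirror` and the path in the usage line are hypothetical. The `run` parameter only exists so the function can be exercised without a real repo.

```python
import subprocess


def sync_mirror(repo_dir, run=subprocess.check_call):
    """Run `git svn fetch` then `git svn rebase` inside repo_dir.

    With the default runner, a command that exits non-zero raises
    CalledProcessError, so the rebase is skipped if the fetch fails
    (the same short-circuit that && gives you in the shell).
    """
    for cmd in (["git", "svn", "fetch"], ["git", "svn", "rebase"]):
        run(cmd, cwd=repo_dir)


if __name__ == "__main__":
    import sys
    if len(sys.argv) == 2:
        sync_mirror(sys.argv[1])  # e.g. python sync_mirror.py /path/to/mirror
```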

the “ref” mirroring part

Here’s a little script I wrote, I called it git-svn-mirror-refs:

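#!/usr/bin/env bash
# Mirror svn remote refs into local git refs: remotes under tags/
# become git tags, every other remote (except trunk) becomes a
# local tracking branch.

for r in $(git branch -r | grep -v trunk); do
  if [[ "$r" == tags/* ]]; then
    # strip the leading "tags/" to get the tag name
    tn="${r#tags/}"
    git tag -f "$tn" "refs/remotes/tags/$tn"
  else
    git branch --track -f "$r" "refs/remotes/$r"
  fi
done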

mirroring redux

What did we gain? A fast and simple website where folks can browse the source of your project, pull tarballs of any arbitrary ref or commit, and view changes online. This is probably not huge for you… You’re probably not sold yet. Well, come back for part II. And in the meantime, don’t forget to provide your users with ergonomic urls:

Options +FollowSymLinks
RewriteEngine On
RewriteRule ^[0-9a-f]+$ /index.cgi/platform/commit/?id=$0
RewriteRule ^svn/([0-9a-f]+)$ /index.cgi/platform.svn/commit/?id=$1

next time…

... we’ll take a look at taking changes made in git and getting them into svn, and the opposite…

—ll

Drags a droppin' and events a bubblin'...

Posted by lloyd Wed, 16 Sep 2009 17:49:00 GMT

In fiddling more and more with whiz bang HTML drag and drop (in Safari 4.x and Firefox 3.5), some things caught me by surprise, primarily because I already had an idea about "how drag and drop works" that wasn't from the web world. Specifically, in BrowserPlus we invented a very simple model for a web developer to express interest in capturing desktop sourced file drags. Our model was motivated more by ease of implementation and simplicity than by deep adherence to the "precedent" set by browser vendors. At that point there wasn't all that much in the way of precedent....

Anyhow, I wanted to document the way both browser native DnD works and BrowserPlus, if for nothing else as a note to myself. Let's begin with a live sample that you might enjoy if you're on a late model Safari or Firefox (untested elsewhere, YMMV).

A DnD Sample

  • use Safari 4.x or Firefox 3.5+ please
  • view source and check your console.log()
  • blue descends from yellow.
  • red is yellow's sister.
  • yellow is our "drop target".
So pick up a file from your desktop and hover it on over the sample. Notice a couple things:
  • Drag events propagate along the node hierarchy - if you hover over blue, the drag handle attached to yellow (his parent) is receiving the drag. Blue himself has no handler.
  • Overlapping non-descendants can block events - Red is yellow's sister. She has her own drag handler set in order to update the status display, but if she did not, we wouldn't see an event in yellow when hovering over the area in red that overlaps with yellow.
  • Your JavaScript must handle bubbled enter/leave events from children - If you have a node that you wish to be a drop target with any number of visible children, you must handle the fact that a transition from the target (yellow) to the child (blue) would result in an enter in the latter followed by a leave in the former.
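The bookkeeping behind that last point can be sketched independently of the browser. This is a hedged illustration in Python rather than page JavaScript: `DropTargetTracker` is a hypothetical name, and in a real page the two methods would be called from the target's enter and leave handlers. It relies on the ordering described above, where the enter for the child arrives before the leave for the parent.

```python
class DropTargetTracker:
    """Counts bubbled enter/leave events to tell whether a drag is
    still somewhere over the target or one of its children."""

    def __init__(self):
        self.depth = 0

    def on_enter(self):
        """Call for every enter event seen by the target (bubbled or not).
        Returns True only when the drag first arrives over the target."""
        self.depth += 1
        return self.depth == 1

    def on_leave(self):
        """Call for every leave event seen by the target.
        Returns True only when the drag has left the target entirely."""
        self.depth = max(0, self.depth - 1)
        return self.depth == 0

    def reset(self):
        """Call after a drop, when no further leave event should be expected."""
        self.depth = 0


t = DropTargetTracker()
print(t.on_enter())  # drag arrives over yellow
print(t.on_enter())  # moves onto blue: enter for the child arrives first...
print(t.on_leave())  # ...then the leave for yellow, but we are still inside
print(t.on_leave())  # drag moves off blue and out of the target
```

With this, the target only changes its highlight when `on_enter` or `on_leave` returns True, so moving between yellow and blue causes no flicker.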

What would BrowserPlus do?

Again, realize that the Drag and drop implementation in BrowserPlus, for better or worse, was driven by two key goals:
  1. satisfy the real world requirements of our users.
  2. develop something possible to implement in some sane fashion (from the other side of a plugin API, btw).
Given this, it's pretty simple to explain the BrowserPlus model - Any node that is designated as a "drop target" can receive drop events, any node that isn't is 100% transparent. So if you were to designate only yellow above as a drop target, then the drop behavior would be identical regardless of the existence of blue and red.

Unorganized thoughts...

With HTML DnD you have to think too hard. In every implementation I've seen that leverages drag and drop, there's a *lot* going on inside the "drop area". So each "typical" web app that uses DnD needs to understand how to effectively make a node "drop transparent". Restated: the HTML interface to drag and drop is very powerful and flexible, but, unless I'm missing something, it makes the simple case way too hard. Here's a good example of world class UI that leverages drag and drop (yeah, I'm biased):

There's a containing div and a bunch of stylized descendants contained within. Essentially what we'd want is some simple way to mark all of them "drop transparent"...

Allowing file drops defies user expectation. For years now we've been dropping files on our browser to *load* them (Safari is a great PDF reader). Now with the ability for web pages to capture our drops, we've got to work harder to prevent poor usability and confusion... What do I mean when I drop 'Hilaiel L 2008 Taxes.pdf' on my browser window? Specifically, 1mm can separate attaching a photo to your email and displaying that photo and discarding your email (most sites will ask for user confirmation, that's one simple way to mitigate this confusion). There isn't a great answer here that I can see.

Finally, I realize not all of this is new, but it's new to me. I eagerly welcome simple code samples which robustly implement the dumb and simple BrowserPlus model, using HTML DnD...

--ll

chromium on 64 bit (arch) linux

Posted by lloyd Fri, 11 Sep 2009 20:22:00 GMT

Really not much to write about, it was trivial to do, and feels a hell of a lot faster than the burning fox. Steps?

First, grab the latest build from the chrome buildbot.

Second, probably notice that the chrome binary won’t run for you… missing shared libs? Heeey, me too! Apparently we’re building with certain debug libs here. Use ldd to figger out what’s missin, and go create some symlinks:

[lth@clover chrome-linux]$ ldd chrome | egrep \\.[0-9]d 
    libnss3.so.1d => /usr/lib/libnss3.so.1d (0x00007fe64a846000)
    libnssutil3.so.1d => /usr/lib/libnssutil3.so.1d (0x00007fe64a628000)
    libsmime3.so.1d => /usr/lib/libsmime3.so.1d (0x00007fe64a3fd000)
    libssl3.so.1d => /usr/lib/libssl3.so.1d (0x00007fe64a1cd000)
    libplds4.so.0d => /usr/lib/libplds4.so.0d (0x00007fe649fca000)
    libplc4.so.0d => /usr/lib/libplc4.so.0d (0x00007fe649dc6000)
    libnspr4.so.0d => /usr/lib/libnspr4.so.0d (0x00007fe649b8a000)
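If you want to script the hunt, the egrep filter above can be mirrored in a few lines. A sketch only: `debug_libs` is a hypothetical helper that takes the text printed by ldd and returns the libraries whose names end in a digit followed by d, along with where (or whether) they resolved. What each symlink should point at is left to you.

```python
import re

# A library name ending in ".<digit>d", then "=>", then a path or "not found".
DEBUG_LIB = re.compile(r"^\s*(\S+\.[0-9]d)\s+=>\s+(not found|\S+)")


def debug_libs(ldd_output):
    """Return [(library name, resolved path or None)] for debug-suffixed libs."""
    found = []
    for line in ldd_output.splitlines():
        m = DEBUG_LIB.match(line)
        if m:
            path = None if m.group(2) == "not found" else m.group(2)
            found.append((m.group(1), path))
    return found


sample = """\
    libnss3.so.1d => /usr/lib/libnss3.so.1d (0x00007fe64a846000)
    libc.so.6 => /lib/libc.so.6 (0x00007fe649000000)
    libplds4.so.0d => not found
"""
for name, path in debug_libs(sample):
    print(name, "->", path or "MISSING")
```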

Sunspider reports that on my box chromium’s js performance is 4.87x as fast as that of the fox. Don’t get too excited, javascript performance isn’t everything.

TEST                   COMPARISON            FROM                 TO             DETAILS

=============================================================================

** TOTAL **:           4.87x as fast     2619.0ms +/- 1.2%   537.6ms +/- 2.0%     significant

=============================================================================

  3d:                  3.57x as fast      327.4ms +/- 2.5%    91.8ms +/- 10.1%     significant
    cube:              3.59x as fast      117.0ms +/- 1.1%    32.6ms +/- 22.7%     significant
    morph:             3.52x as fast      112.0ms +/- 5.7%    31.8ms +/- 14.2%     significant
    raytrace:          3.59x as fast       98.4ms +/- 2.3%    27.4ms +/- 4.1%     significant

  access:              10.9x as fast      451.8ms +/- 1.4%    41.4ms +/- 3.4%     significant
    binary-trees:      17.5x as fast       38.4ms +/- 1.8%     2.2ms +/- 25.3%     significant
    fannkuch:          14.1x as fast      206.4ms +/- 1.2%    14.6ms +/- 4.7%     significant
    nbody:             7.44x as fast      148.8ms +/- 0.9%    20.0ms +/- 4.4%     significant
    nsieve:            12.7x as fast       58.2ms +/- 7.5%     4.6ms +/- 14.8%     significant

  bitops:              8.84x as fast      367.6ms +/- 3.6%    41.6ms +/- 1.6%     significant
    3bit-bits-in-byte: 13.9x as fast       47.2ms +/- 1.2%     3.4ms +/- 20.0%     significant
    bits-in-byte:      9.29x as fast       83.6ms +/- 7.6%     9.0ms +/- 9.8%     significant
    bitwise-and:       11.1x as fast      135.0ms +/- 7.0%    12.2ms +/- 4.6%     significant
    nsieve-bits:       5.99x as fast      101.8ms +/- 2.6%    17.0ms +/- 0.0%     significant

  controlflow:         13.2x as fast       39.6ms +/- 2.8%     3.0ms +/- 0.0%     significant
    recursive:         13.2x as fast       39.6ms +/- 2.8%     3.0ms +/- 0.0%     significant

  crypto:              5.11x as fast      160.4ms +/- 0.9%    31.4ms +/- 4.5%     significant
    aes:               6.08x as fast       64.4ms +/- 1.7%    10.6ms +/- 10.5%     significant
    md5:               4.44x as fast       46.2ms +/- 1.2%    10.4ms +/- 6.5%     significant
    sha1:              4.79x as fast       49.8ms +/- 1.1%    10.4ms +/- 6.5%     significant

  date:                2.33x as fast      175.0ms +/- 2.8%    75.2ms +/- 3.9%     significant
    format-tofte:      2.59x as fast       79.4ms +/- 4.9%    30.6ms +/- 6.2%     significant
    format-xparb:      2.14x as fast       95.6ms +/- 1.5%    44.6ms +/- 3.2%     significant

  math:                5.58x as fast      325.8ms +/- 3.0%    58.4ms +/- 3.6%     significant
    cordic:            5.49x as fast      125.2ms +/- 2.4%    22.8ms +/- 4.6%     significant
    partial-sums:      5.55x as fast      134.4ms +/- 4.7%    24.2ms +/- 4.3%     significant
    spectral-norm:     5.81x as fast       66.2ms +/- 4.1%    11.4ms +/- 6.0%     significant

  regexp:              18.0x as fast      223.8ms +/- 7.9%    12.4ms +/- 5.5%     significant
    dna:               18.0x as fast      223.8ms +/- 7.9%    12.4ms +/- 5.5%     significant

  string:              3.00x as fast      547.6ms +/- 3.2%   182.4ms +/- 3.0%     significant
    base64:            2.57x as fast       48.8ms +/- 3.3%    19.0ms +/- 8.0%     significant
    fasta:             4.20x as fast      138.6ms +/- 1.2%    33.0ms +/- 2.7%     significant
    tagcloud:          3.20x as fast      114.6ms +/- 2.9%    35.8ms +/- 2.9%     significant
    unpack-code:       2.88x as fast      174.8ms +/- 9.7%    60.6ms +/- 8.7%     significant
    validate-input:    2.08x as fast       70.8ms +/- 2.9%    34.0ms +/- 2.6%     significant

–ll