Forwarding Orderly JSON - v0
Recently I proposed orderly, an idea for a small microlanguage on top of JSONSchema—something easier to read and write.
There’s been some great feedback, which I find encouraging. In response I’ve set up orderly-json.org and started a project on GitHub which will host the specification, the reference implementation, and all of the contents of the orderly-json.org site.
recent changes
Given initial feedback and a bunch more thought about orderly, there are a couple of significant early changes that have taken place.
repetition quantifiers get curly, enumeration values go square
in v-1 (yes, negative one) the following code would have specified an integer property with a range of allowable values:
integer foo[0,10];
And the following would specify an integer property bar with allowable values 0-10:
integer bar[0,10]
uh. oops? We need a way outta this mess. The current v0 specification moves range specification to after the type (suggested by Toby A Inkster), and changes the syntax from square braces to curly braces. We’re re-using “repetition quantifiers” from regular expressions here, and while the analogy isn’t perfect, neither is the square brace analogy, where storage allocation is the expectation. Enumeration values, in turn, have more exclusive access to square braces, which is nice because this is the JSON representation of arrays. Here’s an example of a property weight which must be in a certain numeric range:
number{0.02, 0.98} weight;
While possible values are represented after the type using JSON style array notation:
string os ["osx", "win32", "linux", "freebsd"];
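As a rough illustration of what these two constraints ask of a validator, here is a minimal Python sketch. The helper names in_range and in_enum are made up for this example and are not part of orderly, and the range is assumed to be inclusive at both ends.

```python
# Sketch only: hand-rolled checks mirroring the two orderly constraints above.
# in_range and in_enum are hypothetical helper names, not part of orderly.
# Assumption: the {low, high} range is inclusive at both ends.

def in_range(value, low, high):
    """number{low, high}: value must be a number within [low, high]."""
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    return is_number and low <= value <= high


def in_enum(value, allowed):
    """string name [a, b, ...]: value must be one of the listed strings."""
    return isinstance(value, str) and value in allowed


WEIGHT_RANGE = (0.02, 0.98)
OS_VALUES = ["osx", "win32", "linux", "freebsd"]

if __name__ == "__main__":
    print(in_range(0.5, *WEIGHT_RANGE))
    print(in_enum("beos", OS_VALUES))
```

The point is only that a range needs two bounds while an enumeration needs a list, which is why giving each its own bracket style removes the ambiguity.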
An enumeration property on JSONSchema is nothing more than an array of potential values. And a more beautiful segue into the next point, I couldn’t have engineered...
Orderly syntax is a superset of JSON
JSON is wholly contained within the syntax of orderly, and the reason for this is simple: orderly must be able to represent default values of arbitrary complexity. The bad news is this means it’s going to be a little more work to parse, but the good news is there’s plenty of precedent to leverage (numerical representation, etc).
What’s next?
As the language starts to firm up, I’ve resolved to take popular APIs and try to describe them with orderly, and see how far we get. Today’s target is the BrowserPlus JSON WSAPI that returns a list of available services. Here’s a subset of the data we’re trying to describe:
[
{
"name": "PublishSubscribe",
"versionString": "1.0.0",
"os": "ind",
"size": 5114,
"documentation": "A cross document message service that allows JavaScript to send and receive messages between web pages within one or more browsers (cross document + cross process).",
"CoreletType": "dependent",
"CoreletRequires": {
"Name": "RubyInterpreter",
"Version": "4",
"Minversion": "4.2.5"
}
},
{
"name": "Uploader",
"versionString": "3.2.6",
"os": "osx",
"size": 279378,
"documentation": "This service lets you upload files faster and easier than before.",
"CoreletType": "standalone",
"CoreletAPIVersion": 4
},
{
"name": "RubyInterpreter",
"versionString": "4.2.6",
"os": "osx",
"size": 1691095,
"documentation": "Allows other services to be written in Ruby.",
"CoreletType": "provider",
"CoreletAPIVersion": 4
}
]
And here’s some concocted orderly to describe this data:
# A schema describing the data returned from the BrowserPlus services
# API at http://browserplus.yahoo.com/api/v3/corelets/osx
array {
object {
string name;
string versionString;
string os [ "ind", "osx", "win32" ];
integer size;
string documentation;
string CoreletType [ "standalone", "dependent", "provider" ];
# if CoreletType is "standalone" or "provider", then
# CoreletAPIVersion must be present
integer CoreletAPIVersion ?;
# if CoreletType is "dependent", then CoreletRequires must be present
object {
string Name;
string Version;
string Minversion;
} CoreletRequires ?;
};
};
Not bad. Some semantics of this ad-hoc WSAPI cannot be captured in orderly, nor, it seems, in the JSONSchema underneath, but I think I’m ok with that. You?
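To make the gap concrete, here is a small Python sketch that checks one service entry against the schema above and then applies, by hand, the two cross-field rules that the schema can only state in comments. check_service is a hypothetical helper written for this post, not a BrowserPlus or orderly API, and it skips type checks on the optional properties.

```python
# Sketch: validate one service entry against the orderly schema above,
# then enforce the CoreletType rules that live only in the schema comments.
# check_service is a hypothetical helper, not part of BrowserPlus or orderly.

OS_VALUES = ("ind", "osx", "win32")
CORELET_TYPES = ("standalone", "dependent", "provider")


def check_service(svc):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for key in ("name", "versionString", "documentation"):
        if not isinstance(svc.get(key), str):
            problems.append(key + ": expected string")
    if svc.get("os") not in OS_VALUES:
        problems.append("os: not an allowed value")
    size = svc.get("size")
    if not isinstance(size, int) or isinstance(size, bool):
        problems.append("size: expected integer")
    ctype = svc.get("CoreletType")
    if ctype not in CORELET_TYPES:
        problems.append("CoreletType: not an allowed value")
    # Cross-field rules that the schema itself cannot express.
    if ctype in ("standalone", "provider") and "CoreletAPIVersion" not in svc:
        problems.append("CoreletAPIVersion: required for " + ctype)
    if ctype == "dependent" and "CoreletRequires" not in svc:
        problems.append("CoreletRequires: required for dependent")
    return problems


uploader = {
    "name": "Uploader",
    "versionString": "3.2.6",
    "os": "osx",
    "size": 279378,
    "documentation": "This service lets you upload files faster and easier than before.",
    "CoreletType": "standalone",
    "CoreletAPIVersion": 4,
}

if __name__ == "__main__":
    print(check_service(uploader))
```

Dropping CoreletAPIVersion from the Uploader entry still satisfies the orderly schema as written, and only the hand-written rule reports it.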
lloyd
Orderly JSONSchema
I’ve always wanted a concise and beautiful schema language for JSON. This desire stems from a real world need that I’ve hit repeatedly: given in-memory data of questionable quality that has been hydrated from a stream of JSON, validation is required. Currently I’m constantly performing JSON validation in an ad-hoc manner, that is, laboriously writing boilerplate code to validate that an input JSON document is of the form that I expect.
Manual validation is problematic for a variety of reasons, and automatic validation affords several features, my favorite being high quality, helpful error messages upon bogus inputs. Aaron Boodman has talked a bit about the why over on his blog.
So what do I want?
- A terse yet flexible means of describing the structure of a JSON document
- Something that’s easy on the eyes
- Something that rolls off the tongue
JSONSchema’s diet
“But wait!” you exclaim. “There’s JSONSchema!” And I agree, JSONSchema is mostly a good thing, and gets us most of the way there. JSONSchema is a flexible means of describing the structure of a JSON document. But I wouldn’t call it terse. Taken from json-schema.org, compare the complexity of the JSON document:
{
"name" : "John Doe",
"born" : "",
"gender" : "male",
"address" :
{"street":"123 S Main St",
"city":"Springfield",
"state":"CA"}
}
With the schema that describes it:
[lth@lappro yajl] $ json_reformat < doc.txt
{
"description": "A person",
"type": "object",
"properties": {
"name": {
"type": "string"
},
"born": {
"type": [
"integer",
"string"
],
"minimum": 1900,
"maximum": 2010,
"format": "date-time",
"optional": true
},
"gender": {
"type": "string",
"options": [
{
"value": "male",
"label": "Guy"
},
{
"value": "female",
"label": "Gal"
}
]
},
"address": {
"type": "object",
"properties": {
"street": {
"type": "string"
},
"city": {
"type": "string"
},
"state": {
"type": "string"
}
}
}
}
}
NOTE: I did run this through json_reformat, the pretty printer that ships with yajl – so to be fair, we could combine some lines here.
Now don’t get me wrong. I believe that the feature of JSONSchema that it can be represented in JSON is very important. This means that there’s less bloat in the core toolchain when you choose JSON for some portion of your data representation needs, and that holds up to Douglas’s promise of a “low fat alternative”. Rad. But I don’t like how hard that schema is to read and write for a human like me.
So let’s throw a stone as long as we’re driving by: JSONSchema is too big
I think too much has been asked of JSONSchema, from the proposal:
JSON Schema is intended to provide validation, documentation, and interaction control of JSON data.
Interaction control? A cute idea, but I think this is far less important than a functional small language for validation. Perhaps there’s actually one base specification here with some extensions to do interaction control and storage attributes? (read about the transient attribute). Finally, with documentation, I’m again uncertain. Here’s the full list of attributes that make me nervous:
- options (label/value)
- title
- description
- transient
- hidden
- disallow
- extends
- identity
Introducing Orderly (v-1)
Orderly, say hi!
string hi {"wassup"};
Orderly…
- ... is an ergonomic micro-language that can round-trip to JSONSchema.
- ... presently represents a subset of JSONSchema – I’ve thrown out the bits not specifically related to validation.
- ... is optional. syntactic sugar. fluff. Tools should speak JSONSchema, but for areas where humans have to read or write the schema there should be an option to expose orderly in addition to JSON.
- ... is probably not novel. “nothing under the sun is new”.
- ... is a “little baby zygote of an idea”
So let’s meet orderly. This JSONSchema:
{"type":"object",
"properties":
{"name": {"type":"string"},
"age" : {"type":"integer",
"maximum":125}}
}
becomes this orderly:
object {
string name;
integer age[,125];
};
nice, eh? Let’s zip through some examples here:
A string property named name:
string name;
A string property between 1 and 64 chars in length (I assume unicode points here):
string name[1,64];
A number named foo between 100 and 1000:
number foo[100,1000];
An optional boolean named hasLotsOfMoney:
[boolean hasLotsOfMoney];
An optional number with a value between 1 and 200 with a default value of 18:
[number age[1,200] = 18];
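For comparison, here is one plausible way those five snippets could expand into JSONSchema, written out as Python dicts. The mapping is my assumption rather than anything specified: ranges on strings are taken to mean minLength/maxLength, ranges on numbers minimum/maximum, and the surrounding square brackets the "optional" attribute seen in the person schema earlier.

```python
import json

# Sketch only: an ASSUMED expansion of the v-1 orderly examples above into
# JSONSchema. Nothing here is taken from a specification.
v_minus_1_examples = {
    "string name;": {"type": "string"},
    "string name[1,64];": {"type": "string", "minLength": 1, "maxLength": 64},
    "number foo[100,1000];": {"type": "number", "minimum": 100, "maximum": 1000},
    "[boolean hasLotsOfMoney];": {"type": "boolean", "optional": True},
    "[number age[1,200] = 18];": {
        "type": "number",
        "minimum": 1,
        "maximum": 200,
        "optional": True,
        "default": 18,
    },
}

if __name__ == "__main__":
    # Print each orderly snippet next to its assumed JSONSchema form.
    for orderly, schema in v_minus_1_examples.items():
        print(orderly, "->", json.dumps(schema))
```

Even for these tiny cases the JSON form runs several times longer than the orderly form, which is the whole motivation.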
And for our final example, let’s transmogrify that huge schema up top:
object {
string name;
union {
integer[1900,2010];
string; // OMG, I killed format!
} born;
string gender { "male", "female" }; // OMG, I killed interaction control!
object {
string street;
string city;
string state;
} address;
} person;
So we’re nowhere near a BNF here, this is simply the part where we walk into the store and start trying things on. Oh, and don’t worry. This isn’t real.
fixing lockfile gem (v1.4.3) for ruby 1.9
[lth@clover sup]$ diff /usr/lib/ruby/gems/1.9.1/gems/lockfile-1.4.3/lib/lockfile.rb{~,}
475c475
< buf.each do |line|
---
> buf.split($/).each do |line|
installing the sup MUA on 64bit arch linux
You’ll probably get here from a google search on trying to figure out how to get sup running on your arch box that was recently upgraded to ruby 1.9. sure it hurts, but it’s progress! Pick a different distro if you don’t wanna play!
Getting sup running, quick and dirty and highly time dependent:
installing the ferret gem
gem install sup will fail miserably. First you’ll find that “ferret” isn’t installing. This problem has been solved; it sure would be nice to get this accepted upstream :/
installing ncurses
try again, now ncurses ain’t installing!
here’s some steps that should work:
$ gem fetch ncurses
$ gem unpack ncurses-0.9.1.gem
$ cd ncurses-0.9.1
$ ruby extconf.rb
$ patch -p1 < ~/ncurses_ruby_1.9.patch
$ make
$ sudo make install
Here’s the contents of ncurses_ruby_1.9.patch:
--- ncurses-0.9.1.orig/form_wrap.c 2009-09-24 10:53:41.000000000 -0600
+++ ncurses-0.9.1/form_wrap.c 2009-09-24 10:52:02.000000000 -0600
@@ -392,7 +392,7 @@
*/
static VALUE rbncurs_m_new_form(VALUE dummy, VALUE rb_field_array)
{
- long n = RARRAY(rb_field_array)->len;
+ long n = RARRAY_LEN(rb_field_array);
/* Will ncurses free this array? If not, must do it after calling free_form(). */
FIELD** fields = ALLOC_N(FIELD*, (n+1));
long i;
@@ -616,7 +616,7 @@
rb_raise(rb_eArgError, "TYPE_ENUM requires three additional arguments");
}
else {
- int n = RARRAY(arg3)->len;
+ int n = RARRAY_LEN(arg3);
/* Will ncurses free this array of strings in free_field()? */
char** list = ALLOC_N(char*, n+1);
int i;
@@ -775,7 +775,7 @@
* form_field
*/
static VALUE rbncurs_c_set_form_fields(VALUE rb_form, VALUE rb_field_array) {
- long n = RARRAY(rb_field_array)->len;
+ long n = RARRAY_LEN(rb_field_array);
/* If ncurses does not free memory used by the previous array of strings, */
/* we will have to do it now. */
FIELD** fields = ALLOC_N(FIELD*, (n+1));
@@ -1123,7 +1123,7 @@
VALUE argc = rb_funcall(proc, rb_intern("arity"),0);
VALUE args = get_proc(field, FIELDTYPE_ARGS);
if (args != Qnil) {
- if (NUM2INT(argc)-1 != RARRAY(args)->len) {
+ if (NUM2INT(argc)-1 != RARRAY_LEN(args)) {
char msg[500];
snprintf(msg, 500, "The validation functions for this field type need %d additional arguments.",NUM2INT(argc)-1);
msg[499]=0;
the rest…
First, clone yourself a copy of sup from the gitorious hosted repo:
git clone git://gitorious.org/sup/mainline.git
And follow the instructions:
- install all gems referenced in the rakefile
- as the author suggests
ruby -I lib bin/sup
Behold! It starts! Now, whether it’s actually usable is another question…
bi-directional git <-> svn - part I
What’s the point?
There are many reasons why git-svn integration is interesting, and most of them are sociological. Here are some situations where git-svn integration can be useful:
- You work at a place that has standardized on SVN, but want to use git as your personal “svn client”
- Your company’s svn servers have horrid performance, and you want to continue to be productive when they’re not
- You want a fast and powerful web view of your svn repo (and what’s centrally provided isn’t cutting it).
- You are considering transitioning to a DVCS but you (or maybe those you work with?) are not willing to jump in without trying it on.
Git’s svn integration supports all of these uses quite naturally. This series of articles shall drop code which touches on these usages, but the real goal here is to explore a harmonious bi-directional svn<->git arrangement, one where novel commits may occur on both sides of the boundary without b0rking repos or pissing off susan.
a kick-ass svn client
@mojodna has written up his work flow, and I find that to be a great exploration into some of the issues around using git as a svn client.
the read-only mirror
Here’s the purely hypothetical scenario: you’ve got this big ol’ svn repo maintained by some fabulous folks. For whatever reason, those fabulous folks have provided you with a less-than-fabulous web view into your repository. You want to create an automatically updated git mirror of your SVN repository. You heavily dig on the simple, clean, and fast feel of cgit. This is uni-directional, but we’re just getting our feet wet.
Onward…
- pick the box that’s pulling the source and servin the pages
- setup limited passphraseless access for a headless user on them machines (an exercise for the reader, thas’ u)
- git svn clone -s (as the appropriate user)
Now we’ve got two issues, first is that periodic bit, and second is that git svn clone will not create any local refs to mirror remote branches. git branch -r will list em, but how might we go about turning svn branches named ‘tags/7.8.10’ into real git tags, and likewise, remote svn branches into local branches?
the periodic part
this part’s simple, use your friend cron and plunk some git svn fetch && git svn rebase into periodic execution.
the “ref” mirroring part
Here’s a little script I wrote, I called it git-svn-mirror-refs:
#!/usr/bin/env bash
for r in `git branch -r | grep -v trunk`; do
istag=x$(echo $r | egrep -v '^tags')
if [ "$istag" == "x" ] ; then
tn=$(echo $r | sed -e 's/^tags\///')
git tag -f $tn refs/remotes/tags/$tn
else
git branch --track -f $r refs/remotes/$r
fi
done
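The decision the script makes for each remote ref can also be sketched as a pure function, which makes it easy to dry-run before touching a repo. This is a Python sketch under assumptions: plan_ref_commands is a made-up name, it only builds argument lists and never runs git, and it treats a ref as a tag only when it starts with "tags/" (slightly stricter than the script's '^tags' match).

```python
# Sketch: the same tag-vs-branch decision as the shell script above,
# expressed as a function that returns the git commands it would run.
# plan_ref_commands is a hypothetical name; nothing here invokes git.

def plan_ref_commands(remote_refs):
    """Map lines of `git branch -r` output to lists of git arguments."""
    commands = []
    for ref in remote_refs:
        ref = ref.strip()
        # Mirrors `grep -v trunk`: skip blank lines and anything naming trunk.
        if not ref or "trunk" in ref:
            continue
        if ref.startswith("tags/"):
            tag = ref[len("tags/"):]
            commands.append(["git", "tag", "-f", tag, "refs/remotes/tags/" + tag])
        else:
            commands.append(["git", "branch", "--track", "-f", ref, "refs/remotes/" + ref])
    return commands


if __name__ == "__main__":
    for cmd in plan_ref_commands(["  tags/7.8.10", "  trunk", "  feature-x"]):
        print(" ".join(cmd))
```

The branch name feature-x is invented for the example; the tag name comes from the text above.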
mirroring redux
What did we gain? A fast and simple website where folks can browse the source of your project, pull tarballs of any arbitrary ref or commit, and view changes online. This is probably not huge for you… You’re probably not sold yet. Well, come back for part II. And in the meantime, don’t forget to provide your users with ergonomic urls:
Options +FollowSymLinks
RewriteEngine On
RewriteRule ^[0-9a-f]+$ /index.cgi/platform/commit/?id=$0
RewriteRule ^svn/([0-9a-f]+)$ /index.cgi/platform.svn/commit/?id=$1
next time…
... we’ll take a look at taking changes made in git and getting them into svn, and the opposite…
—ll
Drags a droppin' and events a bubblin'...
In fiddling more and more with whiz bang HTML drag and drop (in safari 4.x and Firefox 3.5), some things caught me by surprise, primarily because I had already had an idea about "how drag and drop works" that wasn't from the web world. Specifically, in BrowserPlus we invented a very simple model for a web developer to express interest in capturing desktop sourced file drags. Our model was motivated more by ease of implementation and simplicity than by deep adherence to the "precedent" set by browser vendors. At that point there wasn't all that much in the way of precedent....
Anyhow, I wanted to document the way both browser native DnD and BrowserPlus work, if for nothing else as a note to myself. Let's begin with a live sample that you might enjoy if you're on a late model Safari or Firefox (untested elsewhere, YMMV).
A DnD Sample
- use Safari 4.x or Firefox 3.5+ please
- view source and check your console.log()
- blue descends from yellow.
- red is yellow's sister.
- yellow is our "drop target".
- Drag events propagate along the node hierarchy - if you hover over blue, the drag handler attached to yellow (his parent) is receiving the drag. Blue himself has no handler.
- Overlapping non-descendants can block events - Red is yellow's sister. She has her own drag handler set in order to update the status display, but if she did not, we wouldn't see an event in yellow when hovering over the area in red that overlaps with yellow.
- Your JavaScript must handle bubbled enter/leave events from children - If you have a node that you wish to be a drop target with any number of visible children, you must handle the fact that a transition from the target (yellow) to the child (blue) would result in an enter in the latter followed by a leave in the former.
What would BrowserPlus do?
Again, realize that the drag and drop implementation in BrowserPlus, for better or worse, was driven by two key goals:
- satisfy the real world requirements of our users.
- develop something possible to implement in some sane fashion (from the other side of a plugin API, btw).
Unorganized thoughts...
With HTML DnD you have to think too hard. In every implementation I've seen that leverages drag and drop, there's a *lot* going on inside the "drop area". So each "typical" web app that uses DnD needs to understand how to effectively make a node "drop transparent". Restated: the HTML interface to drag and drop is very powerful and flexible, but, unless I'm missing something, it makes the simple case way too hard. Here's a good example of world class UI that leverages drag and drop (yeah, I'm biased):
There's a containing div and a bunch of stylized descendants contained within. Essentially what we'd want is some simple way to mark all of them "drop transparent"...
Allowing file drops defies user expectation. For years now we've been dropping files on our browser to *load* them (Safari is a great PDF reader). Now with the ability for web-pages to capture our drops, we've got to work harder to prevent poor usability and confusion... What do I mean when I drop 'Hilaiel L 2008 Taxes.pdf' on my browser window? Specifically, 1mm can separate attaching a photo to your email from displaying that photo and discarding your email (most sites will ask for user confirmation, that's one simple way to mitigate this confusion). There isn't a great answer here that I can see.
Finally, I realize not all of this is new, but it's new to me. I eagerly welcome simple code samples which robustly implement the dumb and simple BrowserPlus model, using HTML DnD...
--ll
chromium on 64 bit (arch) linux
Really not much to write about, it was trivial to do, and feels a hell of a lot faster than the burning fox. Steps?
First, grab the latest build from the chrome buildbot.
Second, probably notice that the chrome binary won’t run for you… missing shared libs? Heeey, me too! Apparently we’re building with certain debug libs here. Use ldd to figger out what’s missin, and go create some symlinks:
[lth@clover chrome-linux]$ ldd chrome | egrep \\.[0-9]d
libnss3.so.1d => /usr/lib/libnss3.so.1d (0x00007fe64a846000)
libnssutil3.so.1d => /usr/lib/libnssutil3.so.1d (0x00007fe64a628000)
libsmime3.so.1d => /usr/lib/libsmime3.so.1d (0x00007fe64a3fd000)
libssl3.so.1d => /usr/lib/libssl3.so.1d (0x00007fe64a1cd000)
libplds4.so.0d => /usr/lib/libplds4.so.0d (0x00007fe649fca000)
libplc4.so.0d => /usr/lib/libplc4.so.0d (0x00007fe649dc6000)
libnspr4.so.0d => /usr/lib/libnspr4.so.0d (0x00007fe649b8a000)
Sunspider reports that, on my box, chromium’s js performance is 4.87x as fast as that of the fox. Don’t get too excited, javascript performance isn’t everything …
TEST COMPARISON FROM TO DETAILS
=============================================================================
** TOTAL **: 4.87x as fast 2619.0ms +/- 1.2% 537.6ms +/- 2.0% significant
=============================================================================
3d: 3.57x as fast 327.4ms +/- 2.5% 91.8ms +/- 10.1% significant
cube: 3.59x as fast 117.0ms +/- 1.1% 32.6ms +/- 22.7% significant
morph: 3.52x as fast 112.0ms +/- 5.7% 31.8ms +/- 14.2% significant
raytrace: 3.59x as fast 98.4ms +/- 2.3% 27.4ms +/- 4.1% significant
access: 10.9x as fast 451.8ms +/- 1.4% 41.4ms +/- 3.4% significant
binary-trees: 17.5x as fast 38.4ms +/- 1.8% 2.2ms +/- 25.3% significant
fannkuch: 14.1x as fast 206.4ms +/- 1.2% 14.6ms +/- 4.7% significant
nbody: 7.44x as fast 148.8ms +/- 0.9% 20.0ms +/- 4.4% significant
nsieve: 12.7x as fast 58.2ms +/- 7.5% 4.6ms +/- 14.8% significant
bitops: 8.84x as fast 367.6ms +/- 3.6% 41.6ms +/- 1.6% significant
3bit-bits-in-byte: 13.9x as fast 47.2ms +/- 1.2% 3.4ms +/- 20.0% significant
bits-in-byte: 9.29x as fast 83.6ms +/- 7.6% 9.0ms +/- 9.8% significant
bitwise-and: 11.1x as fast 135.0ms +/- 7.0% 12.2ms +/- 4.6% significant
nsieve-bits: 5.99x as fast 101.8ms +/- 2.6% 17.0ms +/- 0.0% significant
controlflow: 13.2x as fast 39.6ms +/- 2.8% 3.0ms +/- 0.0% significant
recursive: 13.2x as fast 39.6ms +/- 2.8% 3.0ms +/- 0.0% significant
crypto: 5.11x as fast 160.4ms +/- 0.9% 31.4ms +/- 4.5% significant
aes: 6.08x as fast 64.4ms +/- 1.7% 10.6ms +/- 10.5% significant
md5: 4.44x as fast 46.2ms +/- 1.2% 10.4ms +/- 6.5% significant
sha1: 4.79x as fast 49.8ms +/- 1.1% 10.4ms +/- 6.5% significant
date: 2.33x as fast 175.0ms +/- 2.8% 75.2ms +/- 3.9% significant
format-tofte: 2.59x as fast 79.4ms +/- 4.9% 30.6ms +/- 6.2% significant
format-xparb: 2.14x as fast 95.6ms +/- 1.5% 44.6ms +/- 3.2% significant
math: 5.58x as fast 325.8ms +/- 3.0% 58.4ms +/- 3.6% significant
cordic: 5.49x as fast 125.2ms +/- 2.4% 22.8ms +/- 4.6% significant
partial-sums: 5.55x as fast 134.4ms +/- 4.7% 24.2ms +/- 4.3% significant
spectral-norm: 5.81x as fast 66.2ms +/- 4.1% 11.4ms +/- 6.0% significant
regexp: 18.0x as fast 223.8ms +/- 7.9% 12.4ms +/- 5.5% significant
dna: 18.0x as fast 223.8ms +/- 7.9% 12.4ms +/- 5.5% significant
string: 3.00x as fast 547.6ms +/- 3.2% 182.4ms +/- 3.0% significant
base64: 2.57x as fast 48.8ms +/- 3.3% 19.0ms +/- 8.0% significant
fasta: 4.20x as fast 138.6ms +/- 1.2% 33.0ms +/- 2.7% significant
tagcloud: 3.20x as fast 114.6ms +/- 2.9% 35.8ms +/- 2.9% significant
unpack-code: 2.88x as fast 174.8ms +/- 9.7% 60.6ms +/- 8.7% significant
validate-input: 2.08x as fast 70.8ms +/- 2.9% 34.0ms +/- 2.6% significant
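The "x as fast" column is just the FROM time divided by the TO time. A quick Python sketch recomputes three rows from the table, reading FROM as the fox and TO as chromium, which is what the text above implies:

```python
# Sketch: recompute a few "x as fast" figures from the FROM and TO columns.
# Assumption: FROM is Firefox and TO is Chromium, as the surrounding text implies.
timings_ms = {
    "TOTAL": (2619.0, 537.6),
    "regexp": (223.8, 12.4),
    "date": (175.0, 75.2),
}


def speedup(from_ms, to_ms):
    """How many times as fast the TO run is compared with the FROM run."""
    return from_ms / to_ms


if __name__ == "__main__":
    for name, (from_ms, to_ms) in timings_ms.items():
        print(name, format(speedup(from_ms, to_ms), ".3g") + "x as fast")
```

The error margins are ignored here; the table's own "significant" column already accounts for them.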
–ll


