Forwarding Orderly JSON - v0
Recently I proposed orderly, an idea for a small microlanguage on top of JSONSchema—something easier to read and write.
There’s been some great feedback, which I find encouraging. In response I’ve set up orderly-json.org and started a project on github which will host the specification, the reference implementation, and all of the contents of the orderly-json.org site.
recent changes
Given initial feedback and a bunch more thought about orderly, there are a couple of significant early changes that have taken place.
repetition quantifiers get curly, enumeration values go square
In v-1 (yes, negative one) the following code would have specified an integer property with a range of allowable values:
integer foo[0,10];
And the following would specify an integer property bar with allowable values 0-10.
integer bar[0,10]
uh. oops? We need a way outta this mess. The current v0 specification moves range specification to after the type (suggested by Toby A Inkster), and changes the syntax from square braces to curly braces. We’re re-using “repetition quantifiers” from regular expressions here, and while the analogy isn’t perfect, neither is the square brace analogy, where storage allocation is the expectation. Enumeration values, in turn, have more exclusive access to square braces, which is nice because this is the JSON representation of arrays. Here’s an example of a property weight which must be in a certain numeric range:
number{0.02, 0.98} weight;
While possible values are represented after the property name using JSON style array notation:
string os ["osx", "win32", "linux", "freebsd"];
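For comparison, here is a rough sketch of the JSONSchema-style fragments these two lines might correspond to, plus a tiny checker for just the keywords involved. The exact mapping and the `check` helper are assumptions made for illustration, not output from the reference implementation.

```python
# Assumed JSONSchema-style equivalents of the two orderly lines above.
weight_schema = {"type": "number", "minimum": 0.02, "maximum": 0.98}
os_schema = {"type": "string", "enum": ["osx", "win32", "linux", "freebsd"]}


def check(value, schema):
    """Return True if value satisfies the small keyword subset used here."""
    if schema["type"] == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return False
    elif schema["type"] == "string":
        if not isinstance(value, str):
            return False
    if "minimum" in schema and value < schema["minimum"]:
        return False
    if "maximum" in schema and value > schema["maximum"]:
        return False
    # An enumeration is just membership in an array of allowed values.
    if "enum" in schema and value not in schema["enum"]:
        return False
    return True


print(check(0.5, weight_schema), check("beos", os_schema))
```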
An enumeration property on JSONSchema is nothing more than an array of potential values. And a more beautiful segue into the next point, I couldn’t have engineered.
Orderly syntax is a superset of JSON
JSON is wholly contained within the syntax of orderly, and the reason for this is simple: orderly must be able to represent default values of any object of arbitrary complexity. The bad news is this means it’s going to be a little more work to parse, but the good news is there’s plenty of precedent to leverage (numerical representation, etc).
What’s next?
As the language starts to firm up, I’ve resolved to take popular APIs and try to describe them with orderly, to see how far we get. Today’s target is the BrowserPlus JSON WSAPI that returns a list of available services. Here’s a subset of the data we’re trying to describe:
[
  {
    "name": "PublishSubscribe",
    "versionString": "1.0.0",
    "os": "ind",
    "size": 5114,
    "documentation": "A cross document message service that allows JavaScript to send and receive messages between web pages within one or more browsers (cross document + cross process).",
    "CoreletType": "dependent",
    "CoreletRequires": {
      "Name": "RubyInterpreter",
      "Version": "4",
      "Minversion": "4.2.5"
    }
  },
  {
    "name": "Uploader",
    "versionString": "3.2.6",
    "os": "osx",
    "size": 279378,
    "documentation": "This service lets you upload files faster and easier than before.",
    "CoreletType": "standalone",
    "CoreletAPIVersion": 4
  },
  {
    "name": "RubyInterpreter",
    "versionString": "4.2.6",
    "os": "osx",
    "size": 1691095,
    "documentation": "Allows other services to be written in Ruby.",
    "CoreletType": "provider",
    "CoreletAPIVersion": 4
  }
]
And here’s some concocted orderly to describe this data:
# A schema describing the data returned from the BrowserPlus services
# API at http://browserplus.yahoo.com/api/v3/corelets/osx
array {
  object {
    string name;
    string versionString;
    string os [ "ind", "osx", "win32" ];
    integer size;
    string documentation;
    string CoreletType [ "standalone", "dependent", "provider" ];
    # if CoreletType is "standalone" or "provider", then
    # CoreletAPIVersion must be present
    integer CoreletAPIVersion ?;
    # if CoreletType is "dependent", then CoreletRequires must be present
    object {
      string Name;
      string Version;
      string Minversion;
    } CoreletRequires ?;
  };
};
Not bad. Some semantics of this ad-hoc WSAPI cannot be captured in orderly nor, it seems, in the JSONSchema underneath, but I think I’m ok with that. You?
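The two conditional rules that live only in the schema comments would have to be enforced outside the schema. A minimal sketch of such a check, assuming each service has already been parsed into a Python dict (the function name `check_corelet_rules` is made up for illustration):

```python
def check_corelet_rules(service):
    """Return a list of violations of the cross-field rules from the comments."""
    problems = []
    kind = service.get("CoreletType")
    # "standalone" and "provider" services must declare CoreletAPIVersion.
    if kind in ("standalone", "provider") and "CoreletAPIVersion" not in service:
        problems.append("CoreletAPIVersion is required when CoreletType is %r" % kind)
    # "dependent" services must say what they depend on.
    if kind == "dependent" and "CoreletRequires" not in service:
        problems.append("CoreletRequires is required when CoreletType is 'dependent'")
    return problems


uploader = {"name": "Uploader", "CoreletType": "standalone", "CoreletAPIVersion": 4}
print(check_corelet_rules(uploader))
```

Running a check like this after schema validation keeps the schema itself small, at the cost of one more piece of hand-written code.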
lloyd
Orderly JSONSchema
I’ve always wanted a concise and beautiful schema language for JSON. This desire stems from a real world need that I’ve hit repeatedly. Given in-memory data that has been hydrated from a stream of JSON of questionable quality, validation is required. Currently I’m constantly performing JSON validation in an ad-hoc manner, that is, laboriously writing boilerplate code to validate that an input JSON document is of the form that I expect.
Manual validation is problematic for a variety of reasons, and there are several features afforded by automatic validation, my favorite being high quality, helpful error messages upon bogus inputs. Aaron Boodman has talked a bit about the why over on his blog.
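To make the complaint concrete, here is the flavor of hand-rolled validation in question. The document shape (a required string `name` and an optional integer `age` capped at 125) and the function name `load_person` are invented for this sketch.

```python
import json


def load_person(text):
    """Parse and validate by hand: every field needs its own check and message."""
    doc = json.loads(text)
    if not isinstance(doc, dict):
        raise ValueError("expected a JSON object at the top level")
    if "name" not in doc:
        raise ValueError("missing required property 'name'")
    if not isinstance(doc["name"], str):
        raise ValueError("'name' must be a string")
    age = doc.get("age")
    if age is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(age, bool) or not isinstance(age, int):
            raise ValueError("'age' must be an integer")
        if age > 125:
            raise ValueError("'age' must be at most 125")
    return doc


print(load_person('{"name": "John Doe", "age": 30}'))
```

Two properties already cost a dozen lines, and the error messages are only as good as the patience of whoever typed them.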
So what do I want?
- A terse yet flexible means of describing the structure of a JSON document
- Something that’s easy on the eyes
- Something that rolls off the tongue
JSONSchema’s diet
“But wait!” you exclaim. “There’s JSONSchema!” And I agree, JSONSchema is mostly a good thing, and gets us most of the way there. JSONSchema is a flexible means of describing the structure of a JSON document. But I wouldn’t call it terse. Taken from json-schema.org, compare the complexity of the JSON document:
{
  "name" : "John Doe",
  "born" : "",
  "gender" : "male",
  "address" :
    {"street":"123 S Main St",
     "city":"Springfield",
     "state":"CA"}
}
With the schema that describes it:
[lth@lappro yajl] $ json_reformat < doc.txt
{
  "description": "A person",
  "type": "object",
  "properties": {
    "name": {
      "type": "string"
    },
    "born": {
      "type": [
        "integer",
        "string"
      ],
      "minimum": 1900,
      "maximum": 2010,
      "format": "date-time",
      "optional": true
    },
    "gender": {
      "type": "string",
      "options": [
        {
          "value": "male",
          "label": "Guy"
        },
        {
          "value": "female",
          "label": "Gal"
        }
      ]
    },
    "address": {
      "type": "object",
      "properties": {
        "street": {
          "type": "string"
        },
        "city": {
          "type": "string"
        },
        "state": {
          "type": "string"
        }
      }
    }
  }
}
NOTE: I did run this through json_reformat, the pretty printer that ships with yajl – so to be fair, we could combine some lines here.
Now don’t get me wrong. I believe that the feature of JSONSchema that it can be represented in JSON is very important. This means that there’s less bloat in the core toolchain when you choose JSON for some portion of your data representation needs, and that holds up to Douglas’s promise of a “low fat alternative”. Rad. But I don’t like how hard that schema is to read and write for a human like me.
So let’s throw a stone as long as we’re driving by: JSONSchema is too big
I think too much has been asked of JSONSchema, from the proposal:
JSON Schema is intended to provide validation, documentation, and interaction control of JSON data.
Interaction control? A cute idea, but I think this is far less important than a functional small language for validation. Perhaps there’s actually one base specification here with some extensions to do interaction control and storage attributes? (Read about the transient attribute.) Finally, with documentation, I’m again uncertain. Here’s the full list of attributes that make me nervous:
- options (label/value)
- title
- description
- transient
- hidden
- disallow
- extends
- identity
Introducing Orderly (v-1)
Orderly, say hi!
string hi {"wassup"};
Orderly…
- ... is an ergonomic micro-language that can round-trip to JSONSchema.
- ... presently represents a subset of JSONSchema – I’ve thrown out the bits not specifically related to validation.
- ... is optional. syntactic sugar. fluff. Tools should speak JSONSchema, but for areas where humans have to read or write the schema there should be an option to expose orderly in addition to JSON.
- ... is probably not novel. “nothing under the sun is new”.
- ... is a “little baby zygote of an idea”
So let’s meet orderly. This JSONSchema:
{"type":"object",
 "properties":
   {"name": {"type":"string"},
    "age" : {"type":"integer",
             "maximum":125}}
}
becomes this orderly:
object {
  string name;
  integer age[,125];
};
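To show how mechanical that translation is, here is a toy converter for the two kinds of line used above. It is a sketch of the v-1 square-bracket syntax only, not the reference implementation: the regular expression, the function names, and the choice to map string ranges to `minLength`/`maxLength` are all assumptions.

```python
import re

# Toy grammar for v-1 property lines: "<type> <name>;" or "<type> <name>[<low>,<high>];"
# Optional properties, defaults, unions and nesting are deliberately not handled.
PROPERTY = re.compile(r"^\s*(\w+)\s+(\w+)\s*(?:\[([^,\]]*),([^\]]*)\])?\s*;\s*$")


def parse_bound(text):
    """Turn "125" or "0.5" into a number; an empty bound means no limit."""
    text = (text or "").strip()
    if not text:
        return None
    return float(text) if "." in text else int(text)


def property_to_schema(line):
    """Translate one property line into a (name, JSONSchema fragment) pair."""
    match = PROPERTY.match(line)
    if match is None:
        raise ValueError("not a simple property line: %r" % line)
    type_name, prop_name, low, high = match.groups()
    # Assumption: a range on a string limits its length, otherwise its value.
    if type_name == "string":
        low_key, high_key = "minLength", "maxLength"
    else:
        low_key, high_key = "minimum", "maximum"
    schema = {"type": type_name}
    for key, bound in ((low_key, parse_bound(low)), (high_key, parse_bound(high))):
        if bound is not None:
            schema[key] = bound
    return prop_name, schema


def object_to_schema(lines):
    """Wrap the translated properties in an object schema."""
    return {"type": "object",
            "properties": dict(property_to_schema(line) for line in lines)}


print(object_to_schema(["string name;", "integer age[,125];"]))
```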
nice, eh? Let’s zip through some examples here:
A string property named name:
string name;
A string property between 1 and 64 chars in length (I assume unicode points here):
string name[1,64];
A number named foo between 100 and 1000:
number foo[100,1000];
An optional boolean named hasLotsOfMoney:
[boolean hasLotsOfMoney];
An optional number with a value between 1 and 200 with a default value of 18:
[number age[1,200] = 18];
And for our final example, let’s transmogrify that huge schema up top:
object {
  string name;
  union {
    integer[1900,2010];
    string; // OMG, I killed format!
  } born;
  string gender { "male", "female" }; // OMG, I killed interaction control!
  object {
    string street;
    string city;
    string state;
  } address;
} person;
So we’re nowhere near a BNF here; this is simply the part where we walk into the store and start trying things on. Oh, and don’t worry. This isn’t real.
