Skip to content

Latest commit

 

History

History
294 lines (235 loc) · 10.1 KB

index.adoc

File metadata and controls

294 lines (235 loc) · 10.1 KB

Lightweight JSON Parsing in C

What

A Very Lightweight JSON Parser

json.h defines a simple interface and parse tree in simple C that can be used to represent data received in JSON format.

json.c is a lightweight implementation of that interface.

A JSON Parser for the Korn shell

jsoncvt is a program built on that parser that can convert JSON into Korn shell expressions. It also supports output in XML, in the event you have an XML parser but no JSON parser available in your environment.

Primarily, this is for people who need to read JSON in ksh93, but don’t want to mess with shared libraries (“builtins”) and the like.

Why

The parser exists because, as I looked through some of the existing parsers advertised for C at http://json.org/ I was surprised at how large some of them were. JSON is an intentionally small and simple format; a parser for it should be equally small and simple.

The entire specification for the parser is just 32 lines of code, and its implementation comes in at under 500 lines of code. This is strictly written to C99 and POSIX.1-2001; it should compile nearly anywhere that’s reasonable.

Note
Later, I learned I had overlooked json-parser, which is very similar in spirit to this one. If I had looked closer and evaluated every C implementation out there, I might have found and selected this one instead of writing my own.

I needed a JSON parser because I wanted to work with JSON data in some Korn shell programming, which is how jsoncvt came to be. As recent versions of ksh93 (neé ksh) are pretty rich in their supported types (compound variables, booleans and integers and reals (oh my), and arrays of those types), I wanted something that would yield native and rich structures in ksh93.

How

Source Overview

The sources to jsoncvt are written assuming only C99 and POSIX.1-2001. They should compile just about anywhere reasonable without modification, without any warnings or other diagnostics.

json.h, json.c

The heart of the software, a fast and lightweight JSON parser.

ksh.h, ksh.c

Emits a parsed JSON tree in ksh93 syntax.

xml.h, xml.c

Emits a parsed JSON tree in XML syntax. A DTD describing the emitted XML is available here.

twine.h, twine.c

A set of functions for building simple C strings.

ptrvec.h, ptrvec.c

A set of functions for building vectors of pointers.

sanity.h, sanity.c

Functions that help maintain my sanity.

Building

Primarily, json.h and json.c are the files of interest. You wouldn’t install those; instead, you can just include them into your own project. The parser is small enough to fit in just those files for exactly that reason. As provided, they use ptrvec and twine which shouldn’t conflict with anything your project, but feel free to rename them as necessary. Porting to use your own replacements for ptrvec, sanity, and twine should be a piece of cake; there is nothing elaborate in there.

But jsoncvt is a useful utility, and you might want to make that generally available on your system. Easy peasy:

  1. Use make to build. Supply CFLAGS and maybe even CC if you need to, but straight out the box, it should be okay. There’s no need for autoconf, iffe, or anything like that. Standards are good that way.

  2. Place the resulting jsoncvt into a …​/bin directory of your choice.

  3. Place jsoncvt.1 into a …​/man/man1 directory of your choice.

Tip
Some installations of C compilers support the C99 standard, but aren’t compliant by default. That’s no problem. In those cases, you just need to pass an appropriate CFLAGS setting on the command line. For example, to coax some GNU cc installations along, you might want to run make like this

$ make CFLAGS=-std=c99

Thanks to Jukka Inkeri for this tip.

Using jsoncvt

There is a manual page, as alluded to above.

jsoncvt operates as a filter. For example, let’s say you had this JSON stream:

{
    "first" : "Bob",
    "last" : "Krzaczek",
    "email" : "Robert.Krzaczek@gmail.com",
    "lucky" : 13,
    "quarter" : 0.25,
    "empty" : null,
    "nerd" : true,
    "lotto" : [ 9, 12, 17, 38, 45, 46 ]
}

Running jsoncvt -k <krz.json would yield

compound foobar=(
  first=$'Bob'
  last=$'Krzaczek'
  email=$'Robert.Krzaczek@gmail.com'
  integer lucky=13
  float quarter=0.25
  empty=
  bool nerd=true
  integer -a lotto=(
    9
    12
    17
    38
    45
    46
  )
)

So, in your ksh program, you could do things like the following. Note that the name of the variable defined by jsoncvt is foo, optionally named right there on the command line.

$ eval "$(jsoncvt -k foo <krz.json)"

$ print "${foo.email}"
Robert.Krzaczek@gmail.com

$ print "${foo.lotto[*]}"
9 12 17 38 45 46

Using the Parser

A description of the jvalue tree appears in json.h. This, plus a handful of functions like jparse() and jupdate() are all there is to the parser API.

Open a stdio file stream to read the JSON data that needs to be parsed, and supply it to jparse(). Either a pointer to a JSON value is returned (which recursively represents the parse tree), or NULL is returned when something horrible happens during parsing.

For example, the following minimum program, in which we’re unprofessionally skipping all error checks and other reasonable behavior, is all that’s needed to parse and manipulate a JSON tree.

#include <stdio.h>
#include <string.h>
#include "json.h"

int
main()
{
    FILE *fp = fopen( "krz.json", "r" );
    jvalue *krz = jparse( fp );                    (1)
    fclose( fp );

    for( jvalue **j = krz->u.v; *j; ++j )
        if( !strcmp( (*j)->n, "email" ))           (2)
            printf( "address: %s\n", (*j)->u.s );

    return 0;
}
  1. That’s all there is to it; at the heart of things, it’s a single function call.

  2. If you need to parse huge JSON objects, I could easily add some kind of hash to the jvalue, rather than relying on silliness like strcmp(3). On the other hand, it’s simple, demonstrative, and often fast enough.

Each node in the tree is described by a discriminator member d which takes on one of these values: jnull, jtrue, jfalse, jstring, jnumber, jarray, and jobject.

Note

The returned tree leaves numeric values as strings, because in my usage, I’m converting values and don’t want the usual imprecision of converting from decimal strings to internal representations and then back to decimal strings.

If your program will work with the data, and you want the numeric values as native integers and reals, call jupdate() on the parse tree, and all jnumber nodes will be converted to jinteger or jreal, activating other parts of the jvalue union accordingly.

You can safely combine these calls, if you like. In the previous example, you might make these changes:

jvalue *krz = jupdate( jparse( fp ));
...
    else if( !strcmp( (*j)->n, "quarter" ))
        printf( "quarter: %Lf\n", (*j)->u.r );

License

Copyright ⓒ 2014-2017 Robert S. Krzaczek.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHOR OR COPYRIGHT HOLDER BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Other Copyrights

While the code presented in sanity.h and sanity.c is original, it is certainly inspired by the excellent book, “The Practice of Programming” by Brian W. Kernighan and Rob Pike. Quoting from that source:

You may use this code for any purpose, as long as you leave the copyright notice and book citation attached. Copyright © 1999 Lucent Technologies. All rights reserved. Mon Mar 19 13:59:27 EST 2001

Changelog

1.0.8

Fixed a bug; empty objects are working again.

1.0.7

Added -A flag to use associative arrays instead of compound variables for ksh output. This allows the full range of name strings in JSON objects to be used.

1.0.6

Feedback from Robert Reay and Jukka Inkeri gratefully acknowledged: tightened up the parser, and added note to the documentation about compiling for C99.

Miscellaneous

'Cause what the world needs now
is another JSON parser
like I need a hole in my head.
— with apologies to Cracker