
I'd like to prepare a test for a REST petstore server by means of curl.

To test its "add new pet" API, I prepare a JSON string that holds the Pet to insert, in a bash function that executes the curl call:

#!/bin/bash

# Add new pet
add_petstore() {
  local rest_url="$1"

  local id_pet="$2"
  local id_category="$3"
  local category_name="$4"
  local pet_name="$5"
  local photo_url="$6"
  local tag_id="$7"
  local tag_name="$8"

  local pet_to_add="{ \
        \"id\": $id_pet, \
        \"category\": {  \
          \"id\": $id_category,  \
          \"name\": \"$category_name\"  \
        },  \
        \"name\": \"$pet_name\",  \
        \"photoUrls\": [  \
          \"$photo_url\"  \
        ],  \
        \"tags\": [  \
          {
            \"id\": $tag_id,  \
            \"name\": \"$tag_name\"  \
          }  \
        ]  \
      }"

  echo "$pet_to_add"

  curl -X 'POST' \
    "$rest_url" \
    -H 'accept: application/xml' \
    -H 'Content-Type: application/json' \
    -d "$pet_to_add"
}

add_petstore "http://localhost:8080/pet" "151" "12" "Dogs" "REX" "http://photosofrex/rex_rage.jpg" "1" "ferocious sales"

The echo of $pet_to_add looks like the one I'm willing to have:

{         "id": 151,         "category": {            "id": 12,            "name": "Dogs"          },          "name": "REX",          "photoUrls": [            "http://photosofrex/rex_rage.jpg"          ],          "tags": [            {
            "id": 1,              "name": "ferocious sales"            }          ]        }

and the add_petstore function allows me to prepare a few pets easily.


But the local pet_to_add=... declaration is really dirty.
If someone (or I, later) has to adapt this script, this local variable isn't welcoming.

I first thought that I could put its content in a file and read it with local pet_to_add=$(cat myfile), but that wouldn't resolve its variable parameters.

Is there a cleaner way to write that local pet_to_add declaration?

  •
    Save the template JSON in a file like you say, and then use jq to update it with env variables?
    – muru
    Commented May 22 at 5:43
    Encoding JSON manually is a guaranteed recipe for issues. One of these days there will be something different in one of your variables (like double quotes, or a backslash) and it will break. While the jq method of Kusalananda is safer, it's quite cumbersome. Is there a specific reason you're using bash? It would be sooooo much simpler with node.js, perl, php, and probably many other languages with support for associative arrays and JSON encoding.
    – jcaron
    Commented May 24 at 13:47
  •
    rest_url should also use quotes around "$1" Commented May 24 at 15:57
  • @MarcLeBihan I'd urge you to evaluate what happens with your accepted answer when pet names contain double quotes. That's doubly true if this is applied in a context where an attacker adding arbitrary key/value pairs could trigger arbitrary code execution or other unwanted behavior. (Consider also a name of hello"},{"id":"evil","name":"secondpet -- if records represent, say, accounts being created, someone could create a second account with arbitrary settings by creating a huge name in their one legit account). You're much better off with jq --arg. Commented May 24 at 18:49
  •
    @MarcLeBihan, ...there's more than one kind of simplicity: Looking simple is one thing, but being simple is another. I'd argue that something prone to unexpected behaviors (like injection attacks!) is actually more complex, even if it looks simpler: to fully understand its behavior you need to think about a wider array of corner cases. Oversimplifying your mental model to ignore the corner cases is where bugs come from. Commented May 24 at 20:14

3 Answers


Use jq to create pieces of JSON for each sub-element of the top-level payload document, and then put everything together. This way, you ensure that jq has an opportunity to encode every string, and you avoid injecting shell variables directly into the JSON (which may break the document; see, e.g., the $pet_name value used below).

First, you'd set up the shell variables with their values. You could use values read from the command line.

pet_id=bee001
pet_name='Eric "the half a bee" II'

category_id=bee
category_name='The bees'

# I elected to do the tags as an associative array.
declare -A tags
tags=(
        [beauty]=high
        [cost]=medium
        [social]=low
)

# I'm assuming there can be many photo URLs.
photo_URLs=(
        "url1"
        "url2"
        "url3"
)

We then create the JSON for the category part:

# Create category
category_json=$(
        jq -c -n \
                --arg id "$category_id" \
                --arg name "$category_name" \
                '$ARGS.named'
)

This would create a single JSON object with id and name keys, whose values are taken from the given shell variables. $ARGS.named is an internal jq object containing all keys and values given on the command line using --arg or --argjson (the latter for values that should not be, or already are, string-encoded). Since the jq expression is single-quoted, the shell will not mistake $ARGS for a shell variable.
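A quick standalone check of this technique (the id/name values here are made up) shows how $ARGS.named both assembles the object and escapes embedded quotes:

```shell
# $ARGS.named collects every --arg key/value pair into one JSON object;
# jq performs the string escaping, so embedded double quotes are safe.
jq -c -n --arg id bee --arg name 'The "busy" bees' '$ARGS.named'
# {"id":"bee","name":"The \"busy\" bees"}
```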

We then create the photo URL array for the photoUrls part:

# Create photo URL array
photoUrls_json=$(
        jq -c -n \
                '$ARGS.positional' \
                --args "${photo_URLs[@]}"
)

This uses jq with --args, which populates the internal $ARGS.positional array with the given values. Note that --args and its argument list must always occur last on the command line.
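A minimal illustration of $ARGS.positional with made-up URLs (assuming jq is installed):

```shell
# --args turns every remaining word into an element of $ARGS.positional
jq -c -n '$ARGS.positional' --args url1 url2
# ["url1","url2"]
```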

Next, we create the tags part from our associative array:

# Create tags
tags_json=$(
        for tag_id in "${!tags[@]}"; do
                jq -c -n \
                        --arg id "$tag_id" \
                        --arg name "${tags[$tag_id]}" \
                        '$ARGS.named'
        done |
        jq -c -s '.'
)

We create the array elements from the keys and values of the associative shell array in a loop and insert them in an array with jq -s reading from the loop.
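The combining step can be sketched in isolation: jq -s (slurp) reads a whole stream of JSON values and wraps them into a single array (the sample objects below are made up):

```shell
# Two objects on stdin become one two-element JSON array
printf '%s\n' '{"id":"beauty","name":"high"}' '{"id":"cost","name":"medium"}' |
        jq -c -s '.'
# [{"id":"beauty","name":"high"},{"id":"cost","name":"medium"}]
```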

If your tag IDs are numeric, make tags an ordinary indexed bash array:

tags=(
        [1]=high
        [12]=medium
        [90]=low
)

... and then convert it into JSON using

# Create tags
tags_json=$(
        for tag_id in "${!tags[@]}"; do
                jq -c -n \
                        --argjson id "$tag_id" \
                        --arg name "${tags[$tag_id]}" \
                        '$ARGS.named'
        done |
        jq -c -s '.'
)

Notice --argjson id "$tag_id" in place of --arg id "$tag_id" to pass the number, which should not be converted into a string.
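The difference is easy to see side by side (with a toy id value):

```shell
jq -c -n --arg id 1 '$ARGS.named'      # {"id":"1"}  <- string
jq -c -n --argjson id 1 '$ARGS.named'  # {"id":1}    <- number
```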

We may then finally put together the JSON payload from these parts:

# Create final JSON payload
payload_json=$(
        jq -c -n \
                --arg id "$pet_id" \
                --argjson category "$category_json" \
                --arg name "$pet_name" \
                --argjson photoUrls "$photoUrls_json" \
                --argjson tags "$tags_json" \
                '$ARGS.named'
)

# DEBUG:
printf 'payload is\n%s\n' "$payload_json"

The bash script above would output a JSON document equivalent to the following (but in a compact form):

{
  "id": "bee001",
  "category": {
    "id": "bee",
    "name": "The bees"
  },
  "name": "Eric \"the half a bee\" II",
  "photoUrls": [
    "url1",
    "url2",
    "url3"
  ],
  "tags": [
    {
      "id": "beauty",
      "name": "high"
    },
    {
      "id": "cost",
      "name": "medium"
    },
    {
      "id": "social",
      "name": "low"
    }
  ]
}
  •
    jq looks powerful, but it seems too complicated and long to get working for tests that should be as short and easy to maintain as possible. Commented May 22 at 9:13
  • Hmm... with the caveat that I haven't sat down and tried to do this myself (so I might be missing something), I suspect there's a much less complicated and shorter way to do this with jq. At least, in the many times I've used jq myself, I've never invoked it this many times in the same program. I don't have time right now but maybe later I could add an answer to back that up.
    – David Z
    Commented May 22 at 20:35
  •
    @DavidZ What's the complicated bit? It surely is easier to construct each sub-thing separately from the given shell variables, and then to combine these into a final payload. Trying to pass everything at once is certainly possible, but it would make for one rather large jq expression that might be more difficult to maintain than a handful of simple and straightforward constructors.
    – Kusalananda
    Commented May 23 at 18:06
  • Shorter, sure, but I'm not sure about "less complicated". That said, one thing I've done is passing in the document to be processed as the first input, then data to add to that document as subsequent inputs; write your jq expression to iterate over those inputs, and you get something that's much more efficient than the code shown here (since it's only invoking the jq executable once), but at a cost in complexity/readability. Commented May 24 at 18:45

Yes, there is a way that's cleaner for many types of JSON documents (though not all types): a here document:

local pet_to_add=$(cat - <<END_DOC
  {
    "id": $id_pet,
    "category": {
      "id": $id_category,
      "name": "$category_name"
    },
    "name": "$pet_name",
    "photoUrls": [
      "$photo_url"
    ],
    "tags": [
      {
        "id": $tag_id,
        "name": "$tag_name"
      }
    ]
  }
END_DOC
)

For this kind of use case, I think of a here document as a form of double-quoted string because it expands shell/environment variables in the document, but it doesn't remove double quote characters because they aren't the string delimiters.
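A minimal sketch of that behaviour (the pet_name value is made up): an unquoted delimiter lets the shell expand variables inside the document, while a quoted delimiter keeps the text literal:

```shell
pet_name='REX'

# Unquoted delimiter: $pet_name is expanded
cat <<END
{"name": "$pet_name"}
END
# prints {"name": "REX"}

# Quoted delimiter: the document is taken literally
cat <<'END'
{"name": "$pet_name"}
END
# prints {"name": "$pet_name"}
```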


When your JSON document is small you can use the builtin printf command to insert values in the middle of the document and save the resulting JSON in a shell variable:

local example_json
printf -v example_json '{ "name": "%s", "id": "%s" }' "${pet_name}" "${id_pet}"
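As the comments on this answer warn, neither the here document nor printf escapes JSON-special characters in the values; a quick demonstration with a hypothetical pet name containing double quotes:

```shell
pet_name='Rex "The Floof"'   # hypothetical value, not from the script above

printf -v example_json '{ "name": "%s" }' "$pet_name"
echo "$example_json"
# { "name": "Rex "The Floof"" }   <- no longer valid JSON
```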
  •
    That's certainly the simplest option - as long as there's no concern about escaping JSON-unsafe characters in the shell variables. Not that the OP handles that anyway, but it's something to keep in mind. Commented May 22 at 21:36
  • No need for - since that's the default for cat, unless you want it for clarity. Commented May 24 at 15:54
  •
    I would also replace cat with jq -c . to get the here-doc as compact json. Commented May 24 at 15:58
  • If you're going to switch to jq, then you can use its --arg to pass the arguments out-of-band so they get correctly escaped. If a pet's name is, say, Choco "The Scratcher", jq will add the necessary backslashes before the "s, but the techniques shown here won't. Commented May 24 at 18:43

You're currently creating a string that's a mixture of format (names of fields, structure chars like { and [, etc.) and data (the arguments to the function). I'd separate the data from the format and use a here document to populate that format. Using - in the here-doc start delimiter provides tab indenting (leading tabs in the code where you populate the string don't appear in the output), and wrapping the here-doc delimiter in 's means the whole format is treated as a single-quoted string, protected from any possible shell expansion (e.g. Cost $5 would not expand $5 as a positional parameter). Then just let "$@" expand to fill in the values of the %s placeholders in the format, which as a side benefit means you don't need all those other local variables:

$ cat tst.sh
#!/usr/bin/env bash

add_petstore() {
  local rest_url=$1 fmt pet_to_add bad_chars="\""
  shift

  for arg; do
     if [[ $arg =~ $bad_chars ]]; then
        printf 'Error: "%s" contains a char we cannot accept.\n' "$arg" >&2
        return 1
     fi
  done

  IFS= read -r -d '' fmt <<-'!'
        {
          "id": %s,
          "category": {
            "id": %s,
            "name": "%s"
          },
          "name": "%s",
          "photoUrls": [
            "%s"
          ],
          "tags": [
            {
              "id": %s,
              "name": "%s"
            }
          ]
        }
        !

  printf -v pet_to_add "$fmt" "$@"
  echo "$pet_to_add"
}

add_petstore "http://localhost:8080/pet" "151" "12" "Dogs" "REX" "http://photosofrex/rex_rage.jpg" "1" "ferocious sales"

$ ./tst.sh
{
  "id": 151,
  "category": {
    "id": 12,
    "name": "Dogs"
  },
  "name": "REX",
  "photoUrls": [
    "http://photosofrex/rex_rage.jpg"
  ],
  "tags": [
    {
      "id": 1,
      "name": "ferocious sales"
    }
  ]
}

The leading spaces within the here document are tabs so they'll be ignored courtesy of the - in <<-. If you actually want those tabs in your output then change <<- to just <<.

If you really don't want newlines in pet_to_add but do want to use them when populating the format string then just change this:

printf -v pet_to_add "$fmt" "${@}"

to this:

printf -v pet_to_add "${fmt//$'\n'}" "${@}"

and then you'll get this output:

$ ./tst.sh
{  "id": 151,  "category": {    "id": 12,    "name": "Dogs"  },  "name": "REX",  "photoUrls": [    "http://photosofrex/rex_rage.jpg"  ],  "tags": [    {      "id": 1,      "name": "ferocious sales"    }  ]}

and if you wanted to squeeze all chains of contiguous white space to single blank chars then this:

shopt -s extglob
printf -v pet_to_add "${fmt//+([[:space:]])/ }" "${@}"

would produce this output:

$ ./tst.sh
{ "id": 151, "category": { "id": 12, "name": "Dogs" }, "name": "REX", "photoUrls": [ "http://photosofrex/rex_rage.jpg" ], "tags": [ { "id": 1, "name": "ferocious sales" } ] }

The point being you can trivially massage the format however you like before populating pet_to_add without impacting the data (and vice-versa) because we've decoupled the data from the format.

  • grump re: potential for names that require escaping to cause syntactically invalid output (Rex "The Floof" f/e) Commented May 24 at 18:47
  • @CharlesDuffy yeah, I was in 2 minds about posting it, but Kusalananda's answer is very complicated/cumbersome and had already been rejected by the OP before I got here, and SottoVoce's answer leaves the format and the data tightly coupled and has a UUOC and an unnecessary command-substitution sub-process, so this seemed like the best option for the problem the OP described (unless a more palatable json-parser-based answer is posted), especially if the OP is in control of the input, as I suspect they are, and a good approach for similar non-JSON problems.
    – Ed Morton
    Commented May 24 at 23:27
  • If problematic chars such as double quotes are possible in the data, the OP can always add a test for them at the top of the function and bail out if detected. I added a simple loop to detect "s; the OP can massage it to suit. I would be interested in seeing a concise json-parser-based solution, of course.
    – Ed Morton
    Commented May 24 at 23:33
