Using jq and curl to Wrangle JSON Arrays from the Terminal

In this tutorial, we use jq and curl to query a web service and retrieve JSON objects containing embedded arrays. We then invoke a command based on each element in the array rather than simply printing the values to the console. A special thanks to one of my readers, B. Anderson, who left a comment on my Consuming Web API JSON Data Using curl and jq post and provided this question for us to explore. Let's get started!

Objective

Our goal is to query a web service containing metadata about weather stations in the area. The web service is located at https://thisdavej.com/api/weather-station.json and outputs the following JSON content:

{
  "name": "Station1",
  "location": "Centerville",
  "sensors": ["temperature", "humidity"]
}

We ultimately want to process the array of sensors embedded in the JSON object and fetch the current value of each sensor using a command that retrieves the values from a cloud-based time-series database. For this exercise, we'll create a Python script called get-sensor to simulate the retrieval of sensor values from the time-series database:

get-sensor
#!/usr/bin/env python3

import argparse
import random

parser = argparse.ArgumentParser()
parser.add_argument("station")
parser.add_argument("sensor")
args = parser.parse_args()

station = args.station
sensor = args.sensor


ranges = {
    "temperature": (50, 100),
    "humidity": (0, 100),
    "rainfall": (0, 3),
    "wind_speed": (0, 60),
}
default_range = (0, 100)
# Use a specific name here so the built-in range() is not shadowed.
value_range = ranges.get(sensor, default_range)

ndigits = 2
current_value = round(random.uniform(*value_range), ndigits)

print(f"{station}.{sensor} current value: {current_value}")

After creating the above script, change the file access permissions to make the script executable:

$ chmod u+x get-sensor

The get-sensor script requires two arguments (station and sensor) and prints a simulated sensor value produced by a random number generator:

$ ./get-sensor Station1 temperature
Station1.temperature current value: 77.58

Initial Setup

To get started, check whether curl is installed on your system since we will be using it to fetch web data. It is installed by default on most systems.

$ which curl
/usr/bin/curl

I'm running Ubuntu on WSL at the moment, and an executable path is returned, so curl is installed for me. If the which command returns nothing, you'll need to install curl. For Debian-based distros such as Ubuntu, install it like this:

$ sudo apt install curl

We'll also be using jq, an awesome tool for slicing, dicing, and transforming JSON data. Let's check if we have it installed.

$ which jq

The command returns nothing, so jq needs to be installed. Install it now:

$ sudo apt install jq

Note: We could also check if jq is installed and install it in one shot as follows:

$ which jq || sudo apt install jq

Next, create a shell script called get-station.sh to fetch the data using curl and output the result to standard output:

get-station.sh
#!/bin/bash

content=$(curl -sS "https://thisdavej.com/api/weather-station.json")
echo "$content"

curl parameters used:

  • -s Silent or quiet mode. Don't show the progress meter or error messages. Makes curl mute.
  • -S When used with -s it makes curl show an error message if it fails.

Consult the curl man page for more details or run the command syntax through explainshell, which is one of my favorite tools.

Change the script file access permissions to make the script executable:

$ chmod u+x get-station.sh

Finally, run the script to ensure it returns results:

$ ./get-station.sh
{
  "name": "Station1",
  "location": "Centerville",
  "sensors": ["temperature", "humidity"]
}

All looks good, so we're ready to pipe the output through jq and do some amazing things! 😄

Process one JSON object

Our web service (https://thisdavej.com/api/weather-station.json) returns one JSON object and we'll start using jq in this context. In the second section, we'll process an array of JSON objects representing multiple weather stations.

Let's start by instructing jq to pretty print the JSON output from the web service. Since the JSON is already formatted nicely, the output changes only slightly: jq expands the sensors array so that each element appears on its own line.

$ ./get-station.sh | jq .
{
  "name": "Station1",
  "location": "Centerville",
  "sensors": [
    "temperature",
    "humidity"
  ]
}

The JSON returned from the web service was already formatted with spaces and newlines, so the difference is small, but jq would produce this same layout even if the JSON were mashed together on one line.
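To see what pretty printing does to compact input, here is a minimal Python sketch of the same idea using the standard json module. The compact string is hand-written for illustration (the real service already returns formatted JSON), and pretty is a hypothetical helper name, not part of jq.

```python
import json

# Hand-written compact version of the station object (an assumption made
# for illustration; the real web service returns formatted JSON).
compact = '{"name":"Station1","location":"Centerville","sensors":["temperature","humidity"]}'


def pretty(json_text: str) -> str:
    """Parse JSON text and re-serialize it with 2-space indentation."""
    return json.dumps(json.loads(json_text), indent=2)


print(pretty(compact))
```

Running this prints the station object spread over eight lines, with each sensor on its own line, which is the same layout jq produced above.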

Next, let's use jq to filter the JSON and return only one field, the name of our weather station:

$ ./get-station.sh | jq '.name'
"Station1"

This looks good, but we'd like to return the string directly rather than returning it as a JSON string with quotes. We use the jq -r (raw output) command-line option to accomplish this goal:

$ ./get-station.sh | jq -r '.name'
Station1

Excellent. Let's fetch the array of sensors available for this weather station next:

$ ./get-station.sh | jq -r '.sensors'
[
  "temperature",
  "humidity"
]

This is a great start, but we want to output just the list of sensors rather than the square brackets denoting a JSON array so we can ultimately feed these values into our get-sensor script. We change the filter from .sensors to .sensors[] to return just the sensors available.

$ ./get-station.sh | jq -r '.sensors[]'
temperature
humidity
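The difference between .sensors and .sensors[] under -r can be modeled roughly in Python. This is only a sketch of the behavior, not how jq is implemented: emit is a hypothetical helper that prints strings bare when raw output is requested and everything else as JSON (on one line here, whereas jq spreads arrays over several lines).

```python
import json

station = {
    "name": "Station1",
    "location": "Centerville",
    "sensors": ["temperature", "humidity"],
}


def emit(value, raw=False):
    """Format one result: raw mode prints strings bare, all else as JSON."""
    if raw and isinstance(value, str):
        return value
    return json.dumps(value)


# .sensors yields ONE result (the whole array), so -r has no string to unquote.
whole = [emit(station["sensors"], raw=True)]

# .sensors[] yields one result PER element; each is a string, so -r prints it bare.
each = [emit(sensor, raw=True) for sensor in station["sensors"]]

print(whole)
print(each)
```

This is why adding -r alone did not remove the square brackets: raw output only affects results that are strings, and the array had to be iterated first.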

Our get-sensor script requires two parameters, the station name and the sensor; therefore, we need jq to filter and return both parameters.

$ ./get-station.sh | jq -r '. | "\(.name) \(.sensors[])"'
Station1 temperature
Station1 humidity
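The filter produces two lines because .sensors[] inside the string yields one value per array element, and jq builds one string for each of them. A rough Python equivalent, with name_sensor_lines as a hypothetical helper name, looks like this:

```python
station = {
    "name": "Station1",
    "location": "Centerville",
    "sensors": ["temperature", "humidity"],
}


def name_sensor_lines(station):
    """Return one "name sensor" line per sensor, like the jq filter above."""
    return [f"{station['name']} {sensor}" for sensor in station["sensors"]]


for line in name_sensor_lines(station):
    print(line)
```

A station with an empty sensors array would produce no lines at all, so nothing would be passed along for that station.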

We're getting very close to victory. We can use xargs to build and execute commands from standard input. The -n2 option instructs xargs to take 2 arguments at a time, matching our 2 parameters, and append them to the end of the command it runs. No placeholders such as $1 and $2 are needed; the shell would expand those itself (to empty strings in an interactive session) before xargs ever starts. We'll start by using xargs in conjunction with the echo command to ensure we are processing the arguments as expected.

$ ./get-station.sh | jq -r '. | "\(.name) \(.sensors[])"' | xargs -n2 echo
Station1 temperature
Station1 humidity

This looks good! We're ready to put everything together and process an array of values using jq and take action on each of the values by invoking our get-sensor script:

$ ./get-station.sh | jq -r '. | "\(.name) \(.sensors[])"' | xargs -n2 ./get-sensor
Station1.temperature current value: 60.59
Station1.humidity current value: 41.21

It works! Let's take it one step further and process an array of JSON objects.
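Before moving on, it may help to see how xargs -n2 groups its input. The following Python sketch is a simplified model under stated assumptions: xargs_n is a hypothetical function, and it splits only on whitespace, whereas the real xargs also honors quotes and backslashes.

```python
def xargs_n(stdin_text, n, command):
    """Return the argument lists that xargs -n<n> would run (simplified)."""
    tokens = stdin_text.split()  # whitespace only; real xargs also handles quoting
    return [command + tokens[i:i + n] for i in range(0, len(tokens), n)]


lines = "Station1 temperature\nStation1 humidity\n"
for argv in xargs_n(lines, 2, ["./get-sensor"]):
    print(argv)

# A name containing a space is split into two tokens, so the pairing goes wrong.
broken = xargs_n("Station 1 temperature\n", 2, ["./get-sensor"])
print(broken)
```

The last call shows the main limitation of this approach: it relies on station and sensor names that contain no whitespace, which holds for the data in this tutorial.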

Process an array of JSON objects

Our second web service (https://thisdavej.com/api/weather-stations.json) returns an array of JSON objects containing metadata about multiple weather stations:

[
    {
        "name": "Station1",
        "location": "Centerville",
        "sensors": ["temperature", "humidity"]
    },
    {
        "name": "Station5",
        "location": "Anytown",
        "sensors": ["temperature", "humidity", "rainfall", "wind_speed"]
    }
]

Create a script called get-stations.sh to fetch the data using curl and output the result to the console:

get-stations.sh
#!/bin/bash

content=$(curl -sS "https://thisdavej.com/api/weather-stations.json")
echo "$content"

Change the script file access permissions to make the script executable:

$ chmod u+x get-stations.sh

Finally, run the script to ensure it writes the JSON content to standard output:

$ ./get-stations.sh
[
    {
        "name": "Station1",
        "location": "Centerville",
        "sensors": ["temperature", "humidity"]
    },
    {
        "name": "Station5",
        "location": "Anytown",
        "sensors": ["temperature", "humidity", "rainfall", "wind_speed"]
    }
]

Once again, we'll start by instructing jq to pretty print the JSON output from the web service. As before, jq normalizes the indentation and expands each sensors array so that every element appears on its own line:

$ ./get-stations.sh | jq .
[
  {
    "name": "Station1",
    "location": "Centerville",
    "sensors": [
      "temperature",
      "humidity"
    ]
  },
  {
    "name": "Station5",
    "location": "Anytown",
    "sensors": [
      "temperature",
      "humidity",
      "rainfall",
      "wind_speed"
    ]
  }
]

Next, let's use jq to filter the JSON and return only one field, the name of our weather station, for each weather station object in the array:

$ ./get-stations.sh | jq '.[] .name'
"Station1"
"Station5"

Ah yes, the double quotes. Let's use the jq -r (raw output) command-line option once again to eradicate the double quotes:

$ ./get-stations.sh | jq -r '.[] .name'
Station1
Station5

Excellent. Let's return a list of all items from the sensors array for both weather stations:

$ ./get-stations.sh | jq -r '.[] .sensors[]'
temperature
humidity
temperature
humidity
rainfall
wind_speed

This is a good start but recall that our get-sensor script requires both the station name and the sensor name as parameters. Let's return the station name also:

$ ./get-stations.sh | jq -r '.[] | "\(.name) \(.sensors[])"'
Station1 temperature
Station1 humidity
Station5 temperature
Station5 humidity
Station5 rainfall
Station5 wind_speed

We're getting close to victory! Let's use xargs once again and practice first with the echo command. As before, xargs appends each pair of arguments to the end of the command, so no placeholders are needed:

$ ./get-stations.sh | jq -r '.[] | "\(.name) \(.sensors[])"' | xargs -n2 echo
Station1 temperature
Station1 humidity
Station5 temperature
Station5 humidity
Station5 rainfall
Station5 wind_speed

Finally, we bring it all together and process an array of values using jq and take action on each of the values by invoking our get-sensor script:

$ ./get-stations.sh | jq -r '.[] | "\(.name) \(.sensors[])"' | xargs -n2 ./get-sensor
Station1.temperature current value: 71.62
Station1.humidity current value: 83.43
Station5.temperature current value: 96.1
Station5.humidity current value: 5.6
Station5.rainfall current value: 2.32
Station5.wind_speed current value: 11.35

Mission accomplished - jq is a very powerful tool!
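If station or sensor names could ever contain spaces, splitting on whitespace with xargs would pair the arguments incorrectly. One possible alternative, sketched here in Python rather than with jq and xargs, is to parse the JSON directly and pass each pair to the command as separate arguments. The function name run_for_each_sensor and the stand-in command are assumptions made for illustration; in practice the command would be ["./get-sensor"].

```python
import json
import subprocess
import sys


def run_for_each_sensor(stations_json, command):
    """Run command once per (station name, sensor) pair and collect stdout."""
    outputs = []
    for station in json.loads(stations_json):
        for sensor in station["sensors"]:
            # Each value is passed as its own argument, so spaces are preserved.
            result = subprocess.run(
                command + [station["name"], sensor],
                capture_output=True,
                text=True,
                check=True,
            )
            outputs.append(result.stdout.strip())
    return outputs


# Stand-in for ./get-sensor so the sketch runs anywhere (an assumption).
stand_in = [sys.executable, "-c", "import sys; print(sys.argv[1] + '.' + sys.argv[2])"]

stations_json = json.dumps([
    {"name": "Station1", "sensors": ["temperature", "humidity"]},
    {"name": "Station5", "sensors": ["temperature", "humidity", "rainfall", "wind_speed"]},
])

for line in run_for_each_sensor(stations_json, stand_in):
    print(line)
```

For the data in this tutorial the jq and xargs pipeline is shorter and works well; a script like this is mainly worth considering when the values are less predictable.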

Conclusion

The jq command is very useful for slicing, dicing, and transforming JSON data. We successfully utilized jq and curl to invoke a web service and retrieve JSON objects containing embedded arrays. We also moved beyond simply displaying the array values to the console and took action on each element. To learn more about jq, see my article on Consuming Web API JSON Data Using curl and jq as well as the official jq manual.

Follow @thisDaveJ (Dave Johnson) on Twitter to stay up to date with the latest tutorials and tech articles.

Last updated Jan 28 2020
