In this tutorial, I will show you how to parse a JSON string into a JSON object in Logstash.
In short:
- To parse multiline JSON data, use the multiline codec followed by the json filter.
- To parse a JSON string in a field, use the json filter with the source and target options.
Contents
- Inputting Multiline JSON Data in Logstash
- Converting JSON String to JSON Object from a Field
- Conclusion
Inputting Multiline JSON Data in Logstash
Suppose you have the following log file:
{"id":1,"title":"iPhone 9","description":"An apple mobile which is nothing like apple","price":549,"discountPercentage":12.96,"rating":4.69,"stock":94,"brand":"Apple","category":"smartphones","thumbnail":"/data/1/thumbnail.jpg","images":["/data/1/1.jpg","/data/1/2.jpg","/data/1/3.jpg","/data/1/4.jpg","/data/1/thumbnail.jpg"]}
{"id":2,"title":"iPhone X","description":"SIM-Free, Model A19211 6.5-inch Super Retina HD display with OLED technology A12 Bionic chip with ...","price":899,"discountPercentage":17.94,"rating":4.44,"stock":34,"brand":"Apple","category":"smartphones","thumbnail":"/data/2/thumbnail.jpg","images":["/data/2/1.jpg","/data/2/2.jpg","/data/2/3.jpg","/data/2/thumbnail.jpg"]}
{"id":3,"title":"Samsung Universe 9","description":"Samsung's new variant which goes beyond Galaxy to the Universe","price":1249,"discountPercentage":15.46,"rating":4.09,"stock":36,"brand":"Samsung","category":"smartphones","thumbnail":"/data/3/thumbnail.jpg","images":["/data/3/1.jpg"]}
{"id":4,"title":"OPPOF19","description":"OPPO F19 is officially announced on April 2021.","price":280,"discountPercentage":17.91,"rating":4.3,"stock":123,"brand":"OPPO","category":"smartphones","thumbnail":"/data/4/thumbnail.jpg","images":["/data/4/1.jpg","/data/4/2.jpg","/data/4/3.jpg","/data/4/4.jpg","/data/4/thumbnail.jpg"]}
{"id":5,"title":"Huawei P30","description":"Huawei’s re-badged P30 Pro New Edition was officially unveiled yesterday in Germany and now the device has made its way to the UK.","price":499,"discountPercentage":10.58,"rating":4.09,"stock":32,"brand":"Huawei","category":"smartphones","thumbnail":"/data/5/thumbnail.jpg","images":["/data/5/1.jpg","/data/5/2.jpg","/data/5/3.jpg"]}
{"id":6,"title":"MacBook Pro","description":"MacBook Pro 2021 with mini-LED display may launch between September, November","price":1749,"discountPercentage":11.02,"rating":4.57,"stock":83,"brand":"Apple","category":"laptops","thumbnail":"/data/6/thumbnail.png","images":["/data/6/1.png","/data/6/2.jpg","/data/6/3.png","/data/6/4.jpg"]}
Remember to add a new line at the end of the file.
To convert the raw JSON lines into JSON objects, follow these steps:
- Read the log file using the multiline codec.
- Parse the message field as JSON using the json filter.
input {
  file {
    path => "/home/dminhvu/elastic/test.log"
    start_position => "beginning" # read from the beginning of the file
    sincedb_path => "/dev/null" # keep Logstash rereading the file when restarting
    codec => multiline { # multiline codec to read multiple lines as one event
      pattern => "\n" # each line is separated by a new line
      what => "next" # read the next part after the new line
    }
  }
}
filter {
  json {
    source => "message" # parse the message field as JSON
  }
}
output {
  file {
    path => "/home/dminhvu/elastic/output.log"
    codec => "json_lines" # write to JSON lines format
  }
}
I have commented the config file line by line. To make clear what happens after the input section, I will focus on the filter section.
Here, we use the json filter to parse the message field as JSON. Logstash creates the message field when it reads an event from the input; it holds the raw text of that event.
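To make the effect of this step more concrete, here is a minimal Python sketch that emulates what the json filter does to an event. It is only an illustration under my own assumptions, not Logstash's actual implementation: the event is modeled as a plain dict, and json_filter is a hypothetical helper name.

```python
import json


def json_filter(event, source, target=None):
    """Roughly emulate a JSON-parsing filter on a dict-based event (hypothetical helper)."""
    try:
        parsed = json.loads(event.get(source))
    except (TypeError, ValueError):
        # The source field is missing or is not valid JSON: tag the event and leave it as-is.
        event.setdefault("tags", []).append("_jsonparsefailure")
        return event

    if target is not None:
        # With a target, the parsed value is stored under that single field.
        event[target] = parsed
    elif isinstance(parsed, dict):
        # Without a target, the parsed keys are added at the top level of the event.
        event.update(parsed)
    else:
        # A non-object value cannot be merged into the top level.
        event.setdefault("tags", []).append("_jsonparsefailure")
    return event


event = {"message": '{"id":1,"title":"iPhone 9","price":549}'}
json_filter(event, source="message")
print(event["title"], event["price"])  # iPhone 9 549
```

Note that the message field is still present after parsing, which matches the sample output in this tutorial where message sits next to the parsed fields.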
The output section writes the same data in JSON lines format, with some additional fields added by Logstash. You can use the jq command to format the output:
jq -S . output.log > output_formatted.log
After the json filter runs, every event is parsed, and each event should look like this:
{
  "@timestamp": "2023-11-21T04:16:10.943722335Z",
  "@version": "1",
  "brand": "Apple",
  "category": "smartphones",
  "description": "An apple mobile which is nothing like apple",
  "discountPercentage": 12.96,
  "event": {
    "original": "{\"id\":1,\"title\":\"iPhone 9\",\"description\":\"An apple mobile which is nothing like apple\",\"price\":549,\"discountPercentage\":12.96,\"rating\":4.69,\"stock\":94,\"brand\":\"Apple\",\"category\":\"smartphones\",\"thumbnail\":\"https:\/\/dminhvu.com/data/1/thumbnail.jpg\",\"images\":[\"https:\/\/dminhvu.com/data/1/1.jpg\",\"https:\/\/dminhvu.com/data/1/2.jpg\",\"https:\/\/dminhvu.com/data/1/3.jpg\",\"https:\/\/dminhvu.com/data/1/4.jpg\",\"https:\/\/dminhvu.com/data/1/thumbnail.jpg\"]}"
  },
  "host": {
    "name": "dminhvu"
  },
  "id": 1,
  "images": [
    "/data/1/1.jpg",
    "/data/1/2.jpg",
    "/data/1/3.jpg",
    "/data/1/4.jpg",
    "/data/1/thumbnail.jpg"
  ],
  "log": {
    "file": {
      "path": "/home/dminhvu/elastic/test.log"
    }
  },
  "message": "{\"id\":1,\"title\":\"iPhone 9\",\"description\":\"An apple mobile which is nothing like apple\",\"price\":549,\"discountPercentage\":12.96,\"rating\":4.69,\"stock\":94,\"brand\":\"Apple\",\"category\":\"smartphones\",\"thumbnail\":\"https:\/\/dminhvu.com/data/1/thumbnail.jpg\",\"images\":[\"https:\/\/dminhvu.com/data/1/1.jpg\",\"https:\/\/dminhvu.com/data/1/2.jpg\",\"https:\/\/dminhvu.com/data/1/3.jpg\",\"https:\/\/dminhvu.com/data/1/4.jpg\",\"https:\/\/dminhvu.com/data/1/thumbnail.jpg\"]}",
  "price": 549,
  "rating": 4.69,
  "stock": 94,
  "thumbnail": "/data/1/thumbnail.jpg",
  "title": "iPhone 9"
}
Now you can access the fields in the JSON object with the usual syntax: %{field_name}, or %{[field_name][sub_field_name]} for nested fields. To access array elements, use %{[field_name][index]}.
For example, I will add new fields using the mutate filter, one of them inside a conditional statement:
# ...
filter {
  json {
    source => "message"
  }
  mutate {
    add_field => {
      "dummy_description" => "%{[title]} is of brand %{[brand]}"
      "first_image" => "%{[images][0]}"
    }
  }
  if [discountPercentage] > 0 {
    mutate {
      add_field => {
        "dummy_discount" => "This product has %{[discountPercentage]}% discount"
      }
    }
  }
}
# ...
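To see what these field references resolve to, here is a small Python sketch that emulates the %{...} substitution on a dict-based event. The helper names get_field and sprintf are my own, and the sketch only covers the simple cases used in this tutorial; Logstash's real formatting supports more than this. It leaves unresolved references unchanged, which is also how I understand Logstash to behave.

```python
import re


def get_field(event, reference):
    """Resolve 'name' or '[a][b][0]' against nested dicts and lists."""
    keys = re.findall(r"\[([^\]]+)\]", reference) or [reference]
    value = event
    for key in keys:
        # Lists are indexed by integer position, dicts by key name.
        value = value[int(key)] if isinstance(value, list) else value[key]
    return value


def sprintf(event, template):
    """Replace each %{...} reference in the template with the referenced value."""
    def substitute(match):
        try:
            return str(get_field(event, match.group(1)))
        except (KeyError, IndexError, ValueError, TypeError):
            return match.group(0)  # leave unresolved references untouched
    return re.sub(r"%\{([^}]+)\}", substitute, template)


event = {
    "title": "iPhone 9",
    "brand": "Apple",
    "discountPercentage": 12.96,
    "images": ["/data/1/1.jpg", "/data/1/2.jpg"],
}

event["dummy_description"] = sprintf(event, "%{[title]} is of brand %{[brand]}")
event["first_image"] = sprintf(event, "%{[images][0]}")
if event["discountPercentage"] > 0:  # same condition as in the config
    event["dummy_discount"] = sprintf(
        event, "This product has %{[discountPercentage]}% discount"
    )

print(event["dummy_description"])  # iPhone 9 is of brand Apple
print(event["first_image"])        # /data/1/1.jpg
print(event["dummy_discount"])     # This product has 12.96% discount
```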
Converting JSON String to JSON Object from a Field
Suppose you have another log file in which only part of each line is JSON, like this one:
POST /some/endpoint body={"id":1,"name":"Minh Vu","age":21} headers={"Content-Type":"application/json"}
POST /some/endpoint body={"id":2,"name":"WiseCode","age":22} headers={"Content-Type":"application/json"}
POST /some/endpoint body={"id":3,"name":"Nhi Pham","age":23} headers={"Content-Type":"application/json"}
Now you want to parse the body and headers fields as JSON objects. You can keep the same input and output sections as above, but you need to change the filter section:
- Use the grok filter to extract the necessary fields.
- Then use the json filter with the source and target options (they can name the same field) to convert the string to a JSON object.
input {
  file {
    path => "/home/dminhvu/elastic/test-2.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => multiline {
      pattern => "\n"
      what => "next"
    }
  }
}
filter {
  grok { # parse the message field with grok
    match => {
      "message" => "POST %{URIPATHPARAM:request} body=%{GREEDYDATA:body} headers=%{GREEDYDATA:headers}"
    }
  }
  json { # convert body to JSON object
    source => "body"
    target => "body"
  }
  json { # convert headers to JSON object
    source => "headers"
    target => "headers"
  }
}
output {
  file {
    path => "/home/dminhvu/elastic/output-2.log"
    codec => "json_lines"
  }
}
Once you have output-2.log, you can use the jq command to format the output:
jq -S . output-2.log > output-2_formatted.log
One event of the output should look like this:
{
  "@timestamp": "2023-11-21T05:16:12.021382978Z",
  "@version": "1",
  "body": {
    "age": 21,
    "id": 1,
    "name": "Minh Vu"
  },
  "event": {
    "original": "POST /some/endpoint body={\"id\":1,\"name\":\"Minh Vu\",\"age\":21} headers={\"Content-Type\":\"application/json\"}"
  },
  "headers": {
    "Content-Type": "application/json"
  },
  "host": {
    "name": "dminhvu"
  },
  "log": {
    "file": {
      "path": "/home/dminhvu/elastic/test-2.log"
    }
  },
  "message": "POST /some/endpoint body={\"id\":1,\"name\":\"Minh Vu\",\"age\":21} headers={\"Content-Type\":\"application/json\"}",
  "request": "/some/endpoint"
}
As you can see, the body and headers fields are now JSON objects.
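If you want to check the extraction logic outside Logstash, the two steps (extract the fields, then parse each one as JSON in place) can be sketched in Python. This is only an approximation under my own assumptions: a plain regular expression stands in for the grok pattern, with \S+ as a rough substitute for URIPATHPARAM and .* for GREEDYDATA, and parse_line is a hypothetical helper name.

```python
import json
import re

# Rough stand-in for the grok pattern used in the config.
LINE_PATTERN = re.compile(
    r"POST (?P<request>\S+) body=(?P<body>.*) headers=(?P<headers>.*)"
)


def parse_line(line):
    """Extract request, body and headers, then parse body and headers as JSON."""
    match = LINE_PATTERN.match(line)
    if match is None:
        # Lines that do not fit the pattern are tagged instead of parsed.
        return {"message": line, "tags": ["_grokparsefailure"]}

    event = {"message": line, **match.groupdict()}
    for field in ("body", "headers"):
        # Same source and target: the string is replaced by the parsed object.
        event[field] = json.loads(event[field])
    return event


line = (
    'POST /some/endpoint body={"id":1,"name":"Minh Vu","age":21} '
    'headers={"Content-Type":"application/json"}'
)
event = parse_line(line)
print(event["request"])                  # /some/endpoint
print(event["body"]["name"])             # Minh Vu
print(event["headers"]["Content-Type"])  # application/json
```

Because both captures are greedy, the split happens at the last " headers=" in the line, so a body that itself contains that text would be split in the wrong place.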
Conclusion
In this tutorial, I showed you how to parse a JSON string into a JSON object in Logstash in two common cases:
- Inputting multiline JSON data: use the multiline codec followed by the json filter.
- Converting a JSON string in a field to a JSON object: use the json filter with the source and target options.
If you have any questions, please leave a comment below.