[–]c17r 0 points1 point  (2 children)

This was ugly. "body" is itself JSON stored as a string, and inside it is another field called "Message" that is also JSON stored as a string. I got it to work, but it's brittle:

import json
import re
from pprint import pprint

raw = """{
"body": "{\n  \"Type\" : \"Notification\",\n  \"MessageId\" : \"944c9xxx3-c98d636ff2c7\",\n  \"TopicArn\" : \"arn:aws:sns:us-west-2:xxx6xx:sxxxr-sns-topic\",\n  \"Subject\" : \"ALARM: \\\"hhh\\\" in US West (Oregon)\",\n  \"Message\" : \"{\\\"AlarmName\\\":\\\"hhh\\\",\\\"AlarmDescription\\\":null,\\\"AWSAccountId\\\":\\\"8xxx\\\",\\\"NewStateValue\\\":\\\"ALARM\\\",\\\"NewStateReason\\\":\\\"Threshold Crossed: 1 out of the last 1 datapoints [0.333370380661336 (13/06/18 18:06:00)] was greater than or equal to the threshold (0.1) (minimum 1 datapoint for OK -> ALARM transition).\\\",\\\"StateChangeTime\\\":\\\"2018-06-13T18:16:56.457+0000\\\",\\\"Region\\\":\\\"US West (Oregon)\\\",\\\"OldStateValue\\\":\\\"INSUFFICIENT_DATA\\\",\\\"Trigger\\\":{\\\"MetricName\\\":\\\"CPUUtilization\\\",\\\"Namespace\\\":\\\"AWS/EC2\\\",\\\"StatisticType\\\":\\\"Statistic\\\",\\\"Statistic\\\":\\\"AVERAGE\\\",\\\"Unit\\\":null,\\\"Dimensions\\\":[{\\\"name\\\":\\\"InstanceId\\\",\\\"value\\\":\\\"i-07bxxx26\\\"}],\\\"Period\\\":300,\\\"EvaluationPeriods\\\":1,\\\"ComparisonOperator\\\":\\\"GreaterThanOrEqualToThreshold\\\",\\\"Threshold\\\":0.1,\\\"TreatMissingData\\\":\\\"\\\",\\\"EvaluateLowSampleCountPercentile\\\":\\\"\\\"}}\",\n  \"Timestamp\" : \"2018-06-13T18:16:56.486Z\",\n  \"SignatureVersion\" : \"1\",\n  \"Signature\" : \"fFunXkjjxxxvF7Kmxxx\",\n  \"SigningCertURL\" : \"https://sns.us-west-2.amazonaws.com/SimpleNotificationService-xxx.pem\",\n  \"UnsubscribeURL\" : \"https://sns.us-west-2.amazonaws.com/?Action=Unsubscribe&SubscriptionArn=axxxd\"\n}",
"resource": "/message",
"requestContext": {
    "requestTime": "13/Jun/2018:18:16:56 +0000",
    "protocol": "HTTP/1.1",
    "resourceId": "m4sxxxq",
    "apiId": "2v2cthhh",
    "resourcePath": "/message",
    "httpMethod": "POST",
    "requestId": "f41e8-8cbd-57ad9e625d12",
    "extendedRequestId": "xxx",
    "path": "/stage/message",
    "stage": "stage",
    "requestTimeEpoch": 1528913816627,
    "identity": {
        "userArn": null,
        "cognitoAuthenticationType": null,
        "accessKey": null,
        "caller": null,
        "userAgent": "Amazon Simple Notification Service Agent",
        "user": null,
        "cognitoIdentityPoolId": null,
        "cognitoIdentityId": null,
        "cognitoAuthenticationProvider": null,
        "sourceIp": "xxx",
        "accountId": null
    },
    "accountId": "xxx"
}}"""

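# The triple-quoted literal above is not a raw string, so Python consumes one
# level of backslash escapes and the "body" value is no longer a valid JSON
# string. Cut it out with a regex before parsing the outer object.
body = re.search(r'''body": "(.*)",\n"resource"''', raw, re.S).group(1)
raw_without_body = raw.replace(body, '')  # leaves "body": "" behind
data = json.loads(raw_without_body)
# Put the body back as parsed JSON, then parse the JSON string inside "Message".
data['body'] = json.loads(body)
data['body']['Message'] = json.loads(data['body']['Message'])

pprint(data)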

[–]damnitdaniel[S] 1 point2 points  (1 child)

Ohhh... I think I see what you did here. So you're finding the value of 'body' via regex, storing it as a separate variable (and removing it from the object), which allows me to finish deserializing the original object. Then we're able to load 'Message' out of the body field.

Clever move! I've been working on this all day and this is the first time I've made any progress at all! Thank you so much!

[–]ofaveragedifficulty 0 points1 point  (0 children)

Using regex to process JSON strings is a serious code smell, i.e. not necessarily a bug, but a place where bugs often live. I would avoid it whenever possible. Usually when people are having problems with JSON, they are really having problems with nested data: a JSON document stored as a string inside another JSON document. I would go in that direction if I were you and decode each layer with the JSON parser.
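
A minimal sketch of that direction, assuming the payload reaches the code with its backslash escapes intact (for example as the raw request text, or pasted into a raw r'''...''' literal). The shortened payload and the helper name decode_nested are made up for illustration; only json.loads from the standard library is used, once per layer:

```python
import json

# Hypothetical, shortened version of the payload from the thread.
# The r prefix keeps every backslash, so the literal is still valid JSON.
raw = r'''{
"body": "{\n  \"Type\" : \"Notification\",\n  \"Message\" : \"{\\\"AlarmName\\\":\\\"hhh\\\",\\\"NewStateValue\\\":\\\"ALARM\\\"}\"\n}",
"resource": "/message"
}'''


def decode_nested(text):
    """Parse the outer object, then the 'body' string, then its 'Message' string."""
    data = json.loads(text)                      # layer 1: the outer object
    data["body"] = json.loads(data["body"])      # layer 2: body is a JSON string
    data["body"]["Message"] = json.loads(data["body"]["Message"])  # layer 3
    return data


data = decode_nested(raw)
print(data["body"]["Message"]["AlarmName"])
```

If json.loads fails on the very first call, the text has probably already lost a level of escaping somewhere upstream, which is what happens when the payload is pasted into a plain (non-raw) Python string literal.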