I am working on a Flask server that accepts POSTed .log files (plain text files). These files contain the output of the smartctl command-line tool, invoked as:
smartctl -a /dev/sda
I want to parse this human-readable output into JSON (which is still fairly human-readable) and send the resulting JSON to a database.
I know that a development branch of smartctl on GitHub (version 6.7) can produce JSON output and save it to a file with the argument
-j
However, I can't use this functionality as I need to use the stable version of smartctl.
The following is what I currently get as output:
{
" 3 Spin_Up_Time ": "027 200 199 021 Pre-fail Always - 983",
" 4 Start_Stop_Count ": "032 100 100 000 Old_age Always - 30",
" 5 Reallocated_Sector_Ct ": "033 200 200 140 Pre-fail Always - 0",
" 7 Seek_Error_Rate ": "02e 200 200 000 Old_age Always - 0",
" 9 Power_On_Hours ": "032 049 049 000 Old_age Always - 37855",
" 10 Spin_Retry_Count ": "032 100 253 000 Old_age Always - 0",
" 11 Calibration_Retry_Count ": "032 100 253 000 Old_age Always - 0",
" 12 Power_Cycle_Count ": "032 100 100 000 Old_age Always - 29",
"192 Power-Off_Retract_Count ": "032 200 200 000 Old_age Always - 28",
"193 Load_Cycle_Count ": "032 200 200 000 Old_age Always - 1",
"194 Temperature_Celsius ": "022 105 102 000 Old_age Always - 38",
"196 Reallocated_Event_Count ": "032 200 200 000 Old_age Always - 0",
"197 Current_Pending_Sector ": "032 200 200 000 Old_age Always - 0",
"198 Offline_Uncorrectable ": "030 200 200 000 Old_age Offline - 0",
"199 UDMA_CRC_Error_Count ": "032 200 200 000 Old_age Always - 0",
"200 Multi_Zone_Error_Rate ": "008 200 200 000 Old_age Offline - 0",
"ATA Version is": "ATA8-ACS (minor revision not indicated)",
"Add. Product Id": "DELL(tm)",
"Device Model": "WDC WD2502ABYS-18B7A0",
"Firmware Version": "02.03B05",
"LU WWN Device Id": "",
"Local Time is": "Mon Apr 23 10",
"Model Family": "Western Digital RE3 Serial ATA",
"Rotation Rate": "7200 rpm",
"SATA Version is": "SATA 2.5, 3.0 Gb/s",
"SMART overall-health self-assessment test result": "PASSED",
"SMART support is": "Enabled",
"Sector Size": "512 bytes logical/physical",
"Serial Number": "",
"User Capacity": "250.000.000.000 bytes [250 GB]"
}
The columns of the SMART attribute table are, in order:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
The device information is probably fine, but the SMART attributes are not very useful in their current form: the keys contain spaces and tabs, and most of the values have no keys of their own. I am not sure how to parse them properly. I think I need a few sub-dicts to store the SMART attributes, but I am quite lost.
I've noticed that the file always has the same number of lines (at least, it has never differed for me), so I used fixed line ranges as a cheap way to parse it. My current code is as follows:
import json

def parse_line(line):
    """Split a 'Label: value' device info line at the colons."""
    splitted = line.split(':')
    return splitted[0], splitted[1].strip()

def parse_line_smart(line):
    """Split a SMART attribute line at the '0x0' prefix of the FLAG column."""
    splitted = line.split("0x0")
    return splitted[0], splitted[1].strip()

# file_body is the text of the uploaded .log file
lines = file_body.split("\n")
my_data = {}  # parsed key/value pairs

for line in lines[4:22]:  # device info
    if ":" in line and not line.startswith("Device is:"):
        key, value = parse_line(line)
        my_data[key] = value

for line in lines[61:77]:  # SMART attributes
    if "0x0" in line:
        key, value = parse_line_smart(line)
        my_data[key] = value

if 'Raw_Read_Error_Rate' in my_data:
    pass  # Parse some more

json_data = json.dumps(my_data)
print(json_data)
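One way to avoid the hard-coded slices (lines[4:22] and lines[61:77]) is to locate the attribute table by its header line instead. This is only a minimal sketch, assuming the table starts right after the line beginning with ID# and ends at the first blank line; attribute_rows is a hypothetical helper name, not something smartctl or Flask provides:

```python
def attribute_rows(text):
    """Return the raw SMART attribute lines that follow the 'ID#' header."""
    rows = []
    in_table = False
    for line in text.splitlines():
        if line.lstrip().startswith("ID#"):
            in_table = True   # header found; data rows start on the next line
            continue
        if in_table:
            if not line.strip():
                break         # assumption: a blank line ends the table
            rows.append(line)
    return rows
```

If the header is not found, the function returns an empty list, so a malformed upload does not raise an IndexError the way a fixed slice plus split could.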
The device information probably needs a sub-dict like:
"device" : {
    "model_family" : "Western Digital RE3 Serial ATA",
    "model_name" : "WDC WD2502ABYS-18B7A0"
},
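A sub-dict like this could be built by mapping the smartctl labels to the desired key names. The sketch below assumes each device info line has the form "Label: value"; DEVICE_KEYS and parse_device_info are hypothetical names chosen for illustration, and the mapping only covers the two keys shown above:

```python
import json

# Hypothetical mapping from smartctl labels to the desired JSON keys.
DEVICE_KEYS = {
    "Model Family": "model_family",
    "Device Model": "model_name",
}

def parse_device_info(text):
    """Collect the mapped 'Label: value' lines into a 'device' sub-dict."""
    device = {}
    for line in text.splitlines():
        # partition splits at the FIRST colon only, so colons inside the
        # value (for example in a timestamp) stay part of the value.
        label, sep, value = line.partition(":")
        label = label.strip()
        if sep and label in DEVICE_KEYS:
            device[DEVICE_KEYS[label]] = value.strip()
    return {"device": device}

sample = (
    "Model Family:     Western Digital RE3 Serial ATA\n"
    "Device Model:     WDC WD2502ABYS-18B7A0\n"
)
print(json.dumps(parse_device_info(sample), indent=4))
```

Splitting at the first colon only would probably also fix the "Local Time is" value in the current output, which looks cut off at the first colon of the time because line.split(':') keeps only the second piece.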
I would like the SMART attribute output to be something like:
"smart_attributes" : {
    "table" : [
        {
            "id" : 1,
            "name" : "Raw_Read_Error_Rate",
            "value" : 200,
            "worst" : 200,
            "thresh" : 51,
            "when_failed" : "",
            "flags" : "0x002f",
            "raw" : "0"
        }
    ]
},
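A structure like this could be produced by splitting each attribute line on whitespace rather than on "0x0". The following is a sketch under assumptions, not a definitive parser: it assumes every row has the ten columns from the header (ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW_VALUE) and that "-" in WHEN_FAILED means "never failed". FIELDS, parse_attribute_row and build_table are hypothetical names, and the sample line is reconstructed from the Spin_Up_Time entry in the output above:

```python
import json

FIELDS = ("id", "name", "flags", "value", "worst", "thresh",
          "type", "updated", "when_failed", "raw")

def parse_attribute_row(line):
    """Turn one SMART attribute line into a dict, or None if it is not a row."""
    # maxsplit=9 yields at most 10 pieces, so a raw value that itself
    # contains spaces stays in one piece.
    parts = line.strip().split(None, 9)
    if len(parts) != 10 or not parts[0].isdigit():
        return None  # header line, blank line, or unexpected layout
    row = dict(zip(FIELDS, parts))
    for key in ("id", "value", "worst", "thresh"):
        row[key] = int(row[key])  # int("021") gives 21
    if row["when_failed"] == "-":
        row["when_failed"] = ""   # assumption: "-" means no failure recorded
    return row

def build_table(lines):
    """Wrap all parseable rows in the desired 'smart_attributes' structure."""
    rows = [parse_attribute_row(line) for line in lines]
    return {"smart_attributes": {"table": [r for r in rows if r is not None]}}

sample = ["  3 Spin_Up_Time  0x0027  200  199  021  Pre-fail  Always  -  983"]
print(json.dumps(build_table(sample), indent=4))
```

The sketch keeps the "type" and "updated" columns as well; they can be dropped from the dict if they are not wanted in the database.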
Solution
No answers yet