§2024-05-10

On 2024-05-09, the AC unit in our server room failed, causing our servers to shut down.

  1. monitor CPU temp
alexlai@opi58G:~/Downloads$ sudo apt install lm-sensors^C
alexlai@opi58G:~/Downloads$ sudo sensors
npu_thermal-virtual-0
Adapter: Virtual device
temp1:        +37.9°C  (crit = +115.0°C)

center_thermal-virtual-0
Adapter: Virtual device
temp1:        +37.9°C  (crit = +115.0°C)

bigcore1_thermal-virtual-0
Adapter: Virtual device
temp1:        +37.9°C  (crit = +115.0°C)

soc_thermal-virtual-0
Adapter: Virtual device
temp1:        +37.9°C  (crit = +115.0°C)

tcpm_source_psy_6_0022-i2c-6-22
Adapter: rk3x-i2c
in0:           0.00 V  (min =  +0.00 V, max =  +0.00 V)
curr1:         0.00 A  (max =  +0.00 A)

gpu_thermal-virtual-0
Adapter: Virtual device
temp1:        +37.9°C  (crit = +115.0°C)

littlecore_thermal-virtual-0
Adapter: Virtual device
temp1:        +37.9°C  (crit = +115.0°C)

bigcore0_thermal-virtual-0
Adapter: Virtual device
temp1:        +37.9°C  (crit = +115.0°C)
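
Script for opi58G: email a warning when the average temperature is over 70°C.

#!/bin/bash

# Collect every temp1 reading from sensors as a whole number of degrees,
# e.g. "+37.9°C" becomes "37"
temps=$(sensors | grep -E '^temp1:' | awk '{print $2}' | cut -d '+' -f2 | cut -d '.' -f1)

# Stop if sensors returned no readings; the arithmetic below would fail on empty input
if [ -z "$temps" ]; then
    echo "No temperature readings found" >&2
    exit 1
fi

# Calculate the average across all thermal sensors (NPU, GPU, SoC and CPU cores)
num_sensors=$(echo "$temps" | wc -l)
total_temp=0
while read -r temp; do
    total_temp=$((total_temp + temp))
done <<< "$temps"

avg_temp=$((total_temp / num_sensors))

# Check if average temperature is over 70 degrees
if [ "$avg_temp" -gt 70 ]; then
    # Email configuration
    recipient="your_email@example.com"
    subject="CPU Temperature Warning"
    message="CPU temperature is over 70°C. Average temperature: $avg_temp°C"

    # Send email
    echo "$message" | mail -s "$subject" "$recipient"
fi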
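
An average can hide a single hot sensor, so it may be worth looking at the maximum as well. The sketch below is one possible way to get both numbers in a single awk pass over `sensors` output. The function name `summarize_temps` is made up for this note, and it assumes readings appear on lines starting with `temp1:` as in the opi58G output above.

```shell
#!/bin/bash

# Hypothetical helper (not part of lm-sensors): reads `sensors` output on
# stdin and prints "<average> <maximum>" in whole degrees Celsius.
# Prints nothing if no temp1 lines are found.
# Assumption: temperatures are above zero, since the sign is stripped.
summarize_temps() {
    awk '/^temp1:/ {
             gsub(/[^0-9.]/, "", $2)      # "+37.9°C" -> "37.9"
             sum += $2
             n++
             if ($2 + 0 > max) max = $2 + 0
         }
         END { if (n) printf "%d %d\n", sum / n, max }'
}
```

Possible usage: `read -r avg max < <(sensors | summarize_temps)`, then compare `$max` against the threshold instead of (or in addition to) `$avg`.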

[alexlai@h2nas01 ~]$ sudo pacman -Sy hddtemp lm_sensors
[alexlai@h2nas01 ~]$ sensors
acpitz-acpi-0
Adapter: ACPI interface
temp1:        +43.0°C  

coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +43.0°C  (high = +105.0°C, crit = +105.0°C)
Core 0:        +43.0°C  (high = +105.0°C, crit = +105.0°C)
Core 1:        +43.0°C  (high = +105.0°C, crit = +105.0°C)
Core 2:        +42.0°C  (high = +105.0°C, crit = +105.0°C)
Core 3:        +42.0°C  (high = +105.0°C, crit = +105.0°C)
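
Script for h2nas01: email a warning when the CPU package temperature is over 70°C.

#!/bin/bash

# Get the CPU package temperature from sensors as a whole number of degrees;
# head -n 1 keeps only the first package if the machine has more than one socket
cpu_temp=$(sensors | grep 'Package' | head -n 1 | awk '{print $4}' | cut -d '+' -f2 | cut -d '.' -f1)

# Stop if no package reading was found; the comparison below would fail on empty input
if [ -z "$cpu_temp" ]; then
    echo "No CPU package temperature found" >&2
    exit 1
fi

# Check if CPU temperature is over 70 degrees
if [ "$cpu_temp" -gt 70 ]; then
    # Email configuration
    recipient="your_email@example.com"
    subject="CPU Temperature Warning"
    message="CPU temperature is over 70°C. Current temperature: $cpu_temp°C"

    # Send email
    echo "$message" | mail -s "$subject" "$recipient"
fi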
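
Both scripts depend on the text layout of `sensors`, which differs between the two machines. A possible alternative, sketched here as an assumption rather than something tested on these hosts, is to read the kernel's thermal zones directly. On Linux these normally appear as `/sys/class/thermal/thermal_zone*/temp`, holding millidegrees Celsius. The function name `max_zone_temp` and its optional directory argument are made up for this note.

```shell
#!/bin/bash

# Hypothetical helper: print the hottest thermal zone reading in whole °C.
# $1 is the directory holding thermal_zone*/temp files. It defaults to the
# usual sysfs location and is a parameter only so the function can be
# pointed at a test directory.
# Returns non-zero and prints nothing if no readable zone is found.
max_zone_temp() {
    local base="${1:-/sys/class/thermal}" max="" f milli
    for f in "$base"/thermal_zone*/temp; do
        [ -r "$f" ] || continue
        milli=$(cat "$f" 2>/dev/null) || continue   # some zones fail on read
        case "$milli" in
            ''|*[!0-9-]*) continue ;;               # skip non-numeric values
        esac
        if [ -z "$max" ] || [ "$milli" -gt "$max" ]; then
            max="$milli"
        fi
    done
    [ -n "$max" ] && echo $((max / 1000))
}
```

Possible usage: `t=$(max_zone_temp) && [ "$t" -gt 70 ] && echo "Hottest zone: $t°C" | mail -s "CPU Temperature Warning" your_email@example.com`.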