To find out what causes the EM-SLAM algorithm to diverge, I separated the E-step and the M-step. To verify that the M-step is correct, at every iteration I feed it the true Q-function in place of the E-step output, i.e. x = x_true, P = 0. (In general, the Q-function carries the trajectory information and is always estimated in the E-step.) The map I feed into the Q-function is random, not the true map. The trajectory and the map converge after just one iteration, which suggests that the M-step is correct.
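The check above can be illustrated with a minimal sketch. It assumes a linear observation model in which each measurement is the landmark position relative to the robot, expressed in the robot frame, z_t = R(θ_t)^T (m − p_t) + noise; this model, the function names, and the array layout are assumptions made for illustration and may differ from the actual implementation. With x = x_true and P = 0, the M-step for one landmark reduces to a weighted least-squares problem with a closed-form solution.

```python
import numpy as np


def rot(theta):
    """2-D rotation matrix from the robot frame to the world frame."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])


def m_step_known_trajectory(poses, observations, obs_cov):
    """Closed-form M-step for a single landmark when the trajectory is known.

    poses:        (T, 3) array of true poses (x, y, theta), i.e. P = 0
    observations: (T, 2) array of landmark positions in the robot frame
                  (assumed observation model, see the text above)
    obs_cov:      (2, 2) observation covariance matrix
    """
    info = np.linalg.inv(obs_cov)
    A = np.zeros((2, 2))
    b = np.zeros(2)
    for (x, y, theta), z in zip(poses, observations):
        R = rot(theta)
        # Normal equations of the weighted least-squares problem in m
        A += R @ info @ R.T
        b += R @ info @ (z + R.T @ np.array([x, y]))
    return np.linalg.solve(A, b)
```

Under this assumed model the estimate does not depend on the map used to initialise the iteration, which is consistent with convergence after a single iteration from a random map.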
Before debugging the E-step, I suspected that the divergence might also be related to odometry. This is also demonstrated in the paper "EM-SLAM algorithm with Visual/Inertial application". I therefore set the robot's orientation noise to zero, σ_θ = 0, meaning that the robot controls its heading perfectly. Correspondingly, I set the initial covariance of the robot's orientation to 0, so that the orientation is tracked exactly during operation. The observation covariance matrix Q is left unchanged.
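As a rough sketch of what this setting means, the snippet below uses a simple velocity-based motion model with additive Gaussian noise. The model, the function names `propagate` and `initial_covariance`, and the parameters `sigma_v`, `sigma_x`, `sigma_y` are hypothetical and only serve to show the effect of σ_θ = 0; the actual odometry model may be different.

```python
import numpy as np


def propagate(pose, v, omega, dt, sigma_v, sigma_theta, rng):
    """Advance the pose (x, y, theta) by one noisy odometry step.

    With sigma_theta = 0 the heading follows the commanded turn rate exactly.
    """
    x, y, theta = pose
    v_noisy = v + sigma_v * rng.standard_normal()
    theta_new = theta + omega * dt + sigma_theta * rng.standard_normal()
    return np.array([
        x + v_noisy * dt * np.cos(theta),
        y + v_noisy * dt * np.sin(theta),
        theta_new,
    ])


def initial_covariance(sigma_x, sigma_y, sigma_theta=0.0):
    """Initial pose covariance; the orientation variance defaults to zero."""
    return np.diag([sigma_x**2, sigma_y**2, sigma_theta**2])


rng = np.random.default_rng(0)
pose = np.array([0.0, 0.0, 0.0])
P0 = initial_covariance(0.1, 0.1)  # orientation variance is exactly 0
for _ in range(10):
    pose = propagate(pose, v=1.0, omega=0.1, dt=0.5,
                     sigma_v=0.2, sigma_theta=0.0, rng=rng)
```

In this sketch only the translational part of the pose remains uncertain, so any remaining divergence cannot come from heading error.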
Attached is a plot of the residual sum of squares (RSS) of the map against the iteration number, as well as a simple demo.