I implemented the EM-based SLAM algorithms. However, in a visual-inertial system, we have to take care of other practical issues besides the theoretical algorithm itself. It is insightful to compare against the optimization-based algorithm, which has been under development for a while.
The first issue is outlier detection and exclusion. In the optimization-based algorithm, people often use a robust loss function, such as the Huber loss, to limit the effect of outliers. In the EM-based methods, I only consider the visual observations that are close enough to the prediction. The current method is very direct: an observation is either kept or discarded. I hope I can come up with a smoother method.
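To make the contrast concrete, here is a minimal sketch of the two ideas on a scalar residual (observation minus prediction). The function names, the `delta` and `threshold` parameters, and their default values are assumptions for illustration, not taken from my implementation. The Huber weight is shown only as one possible smoother alternative to the hard gate.

```python
def huber_loss(residual, delta=1.0):
    """Huber loss: quadratic near zero, linear for large residuals."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


def huber_weight(residual, delta=1.0):
    """Equivalent per-observation weight: 1 inside delta, decaying as delta/|r| outside."""
    a = abs(residual)
    return 1.0 if a <= delta else delta / a


def hard_gate_weight(residual, threshold=1.0):
    """Hard gating: keep the observation (weight 1) or drop it (weight 0)."""
    return 1.0 if abs(residual) <= threshold else 0.0


if __name__ == "__main__":
    # Compare how each scheme treats increasingly large residuals.
    for r in (0.5, 1.5, 5.0):
        print(r, hard_gate_weight(r), huber_weight(r), huber_loss(r))
```

With the hard gate, a residual just above the threshold loses all influence, whereas the Huber weight reduces its influence gradually.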
The second issue is how to use the result from visual odometry. In the optimization-based algorithm, the visual odometry result is used directly as the initial estimate for the optimization. The EM-based methods have no equivalent role for it. My design instead takes a weighted average of the dead-reckoning estimate and the visual odometry estimate.
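The weighted average can be sketched as follows. This is a simplified illustration that assumes a planar pose `(x, y, theta)` and a fixed scalar weight `vo_weight`; the pose representation, the function names, and the way the weight is chosen are all assumptions made for the example. The heading is blended through the wrapped angle difference so that averaging across the ±π boundary behaves sensibly.

```python
import math


def wrap_angle(angle):
    """Wrap an angle in radians to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def blend_pose(dead_reckoning, visual_odometry, vo_weight=0.5):
    """Weighted average of two planar poses (x, y, theta).

    vo_weight = 0 returns the dead-reckoning pose,
    vo_weight = 1 returns the visual odometry pose.
    """
    if not 0.0 <= vo_weight <= 1.0:
        raise ValueError("vo_weight must lie in [0, 1]")

    x_dr, y_dr, th_dr = dead_reckoning
    x_vo, y_vo, th_vo = visual_odometry

    # Positions are blended linearly.
    x = (1.0 - vo_weight) * x_dr + vo_weight * x_vo
    y = (1.0 - vo_weight) * y_dr + vo_weight * y_vo

    # Headings are blended along the shortest angular difference.
    d_theta = wrap_angle(th_vo - th_dr)
    theta = wrap_angle(th_dr + vo_weight * d_theta)
    return (x, y, theta)


if __name__ == "__main__":
    print(blend_pose((0.0, 0.0, 0.0), (2.0, 4.0, 0.2), vo_weight=0.5))
```

For full 3D poses the rotation part would need a proper interpolation on the rotation group (for example quaternion slerp) rather than a per-component average.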
Overall, the optimization-based method still has better accuracy, but the EM-based methods are able to close the loop.